This week on #Growth, host Matt Bilotti is talking all about data – space data, agriculture data, data science – with the self-proclaimed “data queen” herself, Lauren Moores, VP Data Strategy and Data Sciences at Indigo.
Lauren has spent her whole career chasing data and finding answers to the toughest questions for industries that have undergone significant transformation. Matt and Lauren talk about the evolution of data science and how much it’s all changed in just 20 years.
Lauren also shares her pro tips on how to structure data teams through periods of massive growth.
Matt Bilotti: All right, hello and welcome to another episode of #Growth. I am Matt Bilotti, and I am super excited to have a guest today who, a couple of weeks ago, sent a message in Slack, DC sent a message in Slack and said, “Someone has to talk to this person on a podcast. They are amazing.” And today I have Lauren Moores, who is a self-described data queen, and she currently is VP of data and strategy at Indigo, and I’m super excited to have her. Thanks for joining.
Lauren Moores: Thanks for having me. I’m excited too.
Matt: Yeah, so I was reading a little bit about you. Lauren says she’s got more than 20 years in data technology, strategy, science, data creation and various information and tech industries, and even a PhD in economics from Brown, which is pretty cool, and has worked across a lot of different industries building data teams and applying data to answer questions among your teams. Maybe you could just give a quick background on your journey and a little bit about Indigo for people that aren’t familiar.
Lauren: Yeah, let me start with Indigo. So Indigo’s the leader in beneficial agriculture. We started 2014, and our mission is to help farmers sustainably feed the planet. It all started with the science side, using microbials, and the microbiome system, to … like we did with human gut, apply it to the plant. And since then, we’ve evolved into providing services and products to farmers to help them be more profitable and also to be more sustainable.
I got to Indigo luckily through a former colleague that DC knows well, and I feel like for me in my journey, I chase the data. And I was working in microfinance at the time. New data in terms of credit and using mobile for making credit decisions, but not new technology or data in terms of creating analytics or whatnot. And I’ve always wanted to get involved with space data. I never thought I’d be involved in agriculture, but it’s been amazing.
My whole career I’ve chased data. And what does that mean? Essentially, I’ve had the luxury of being in industries where data is changing exactly what we’re doing, whether that was originally time series in mainframe when I first got out of school, to working in the B2B space, and creating the first information portal for companies to use for first appointments, and ultimately to compete; we were using clickstream before mobile came and kind of disrupted that whole space if you didn’t have a mobile panel.
Matt: Yeah, that’s interesting. So you’re chasing data, and I think the way that you describe looking for places and problems where data is fundamentally changing the way that that thing is being structured. What does that … I guess that looks like agriculture, that looks like [crosstalk 00:03:08].
Lauren: It’s advertising technology, it’s finance technology, information services, and content, it’s … I first started out doing predictive analytics for industry, so what were oil prices going to do, right? What was paper prices going to do? And that was using time series data in a mainframe that had been built, and people weren’t using it … I remember, actually it’s when the Compaq PCs came out where you actually could bring it on the airplane. I have a really good story about that. My friend took the computer. She was going on a trip to visit a client. And the guy next to her said, “What is that? Your sewing machine?”
Matt: Now everyone takes their computer on the plane and just has to turn it off.
Lauren: And it’s not as big as a suitcase.
Matt: Yeah, that’s funny. So you joined Indigo specifically because super interesting data set you’re working with: satellite data, you’re working with supply and demand data with the marketplace … Can you talk a little bit about how … You’re moving into a completely new industry. How do you think about getting a footing, like what does that look like? Is it the same exact systems that you had to use before? Are you evolving a lot of those pieces in a new industry, new setting?
Lauren: Yeah, that’s a really good question. I find that companies are in various states with their data, and usually it’s siloed. Even if you come into a company that has been around for five, 10 years, if they’re not specifically focused on creating a data foundation, then they might have things in non digital form, where they’re stuck in somebody’s hard drive, or maybe they are digital, but they’re not in a database for access. So I feel like at Indigo, we had the data, we just didn’t have access to it per se. In addition too, we were building out our access to the field data through our agronomy team, working directly with our growers, so getting access to the field information and the crop management practices, which makes a huge difference for us to be able to provide analytics and crop management advice back through our agronomists.
Matt: Okay, so show up, the data’s there, it’s just all over the place, like-
Lauren: Some of it.
Matt: Some of it, and so you have different teams that are owning different parts of the data. Is the first move just trying to get it all in place, like what does that even look like when you move into a new company and you’re just … You’re in charge of the data now. Where do you go from here?
Lauren: Ultimately, it’s picking off the things that you need to pick off to solve the problems right at hand. As much as you need to build the foundation, it’s that classic issue where, “Oh, wait, we have to build this product and it’s due in three weeks. Oh, but I need to build the foundation, otherwise there’s going to be tech debt. What do I do? And I don’t have enough resource to actually take a team and say, ‘Hey, you handle tech debt or build the foundation, and you build the new products, because we need to push something out.’”
You’re constantly making compromises, and I’d say that the last year, we’ve made progress, and we have a data platform now, which is amazing, and it’s actually being run by a former colleague who has been able to bring together everything that we were trying to build out before he joined, during the team build-out, when we were still dealing with siloed data. If I have to deliver something, then I’m going to figure out how to get the data as quickly as possible. It might not be in the most accessible form for everyone, but knowing that I can use it in terms of the data sciences team, and all the other teams within my department, that at least helps us push things out the door.
Matt: Got it. We’re also just starting to build out a data team, so you’ve been with Indigo a little over a year now. Drift’s just starting to build the data team. How do you think about how your team is structured? Is it in service to all the other … are your customers the other teams, or are you self-operating, and then bringing that data to the teams to say, “Hey, here’s how to help inform these decisions”? Are you finding stuff on your own, and bringing that out, or are people … or is maybe … It’s probably both.
Lauren: It’s both. It’s definitely both, particularly if we’re trying to work on a model or build out different algorithms and analytics, and we realize, “Hey, we don’t have this third party dataset, or we need to ingest something that nobody else has thought about.” Then we’re going to work on that in parallel to building out whatever product there is. At the same time, we’re working with a central data engineering and architecture team, and we also have a data science engineering team that we work with, which allows us to not have to focus on actually building out the foundation, but on figuring out the requirements. So requirements are, I need to have access to USDA data through Snowflake or whatever. Or I need to look at our field data by grower, farm, field, and so let’s build out that taxonomy.
So we do both. Not only are we clients, in terms of what we need from the rest of tech, but the rest of the company’s our client, and whatever they need, we need to figure out how to solve it. Sometimes I’m not going to solve it, but we’ll figure out who does.
Matt: Got it. When you first joined, how many people were working on the data team?
Lauren: That’s a good question. There were seven on my team, and they were data scientists-ish only. And there were a couple of people dedicated on the eng team to data, and since then, we’ve grown five times as a company. My team’s grown five times. And we’ve also created delineation between different groups. There was somebody in Memphis who was working on field data, but now they’re part of my overall team. And we have a flow, so I like to think about it in terms of a data strategy team all the way down to data product. And essentially, the whole focus is how do we work together as a group to figure out what data solutions do we need in order to collect more data, or to make sure that people are getting what they need, down to what’s the best way to surface that information, and how would we maybe derive our data science differently or create actual insights so that it can be shown in the data product side?
In between all that, there is field solutions, which is all the field data coming in, so you think about all the machine data that comes off of one million plus fields that we have treated seed on, plus we also get information from the non-treated seed. We have a data management team which has to … not just looking at the commercial side and the field side, we’re working also on all the R&D side. So all the genetic … you know, genomes to phenotypes to everything that we need to figure out for as we move our microbials through the stage … we have a stage gate process, whether or not they’re going to go to market or not.
All that has to be handled in data management, and then I also have an operations research team, so that can optimize our systems.
Matt: Got it. There is a lot of stuff to work on there. You worked across a lot of different industries.
Matt: And built teams in all of those. When you show up to a place like Indigo, and you’re starting to build up this data team, is there a set of core data principles that are consistent everywhere that you’ve worked, or does it really morph to the industry? Has it changed a lot, or are you seeing that there are a couple of pieces that are just core to … no matter what, this is how a data team has to operate?
Lauren: Yeah, it’s core. Essentially, what do we have? Can we access it? Is it scalable? And what are we going to do with it? All those things. And as you can imagine, each one of those is fraught with non-truth, or there’s no ground truth, right? So what do we have? Oh wait, I didn’t realize we had that. That’s amazing. Is it accessible? Who’s been using it? Okay, great. Or we built … Not only is the data side, but then what are all the derived products that you build off the data? Let’s make sure that whatever product that we’re building, or whatever models that we’re building, that there’s a consistent way of being able to reproduce that, and it’s not just stuck in somebody’s Excel spreadsheet that was 100 rows that was used to produce some sort of result. We need to make sure that we are consistent, we’re robust, that we can always show exactly what we did so that the whole transparency and the flow of where the data goes … That’s consistent no matter where I’ve been.
Scale too. Where okay, let’s start off with … I’m going to start with MySQL, or let me go to Redshift, or none of this is working; it’s going to take me overnight to pull any data that I need in order to answer the question. So it’s really important for me, and I’ve managed these teams before but at Indigo I don’t need to, which is wonderful, to have the core engineering to be able to scale out exactly what you need to do.
Matt: Yeah. At what point … So Indigo is about 700 or so …
Lauren: 750 people.
Matt: At what point is, and I know there’s a lot of people out there wondering, “Do I need a data team? Is it time for us to start building this out?” And what is the point at which you should one, start properly tracking stuff, two, get it in a centralized place? When is too early, and when is … I guess you know it’s too late when no one knows the answer to stuff, but around what time should companies start to think about this?
Lauren: Well, that’s a really interesting question. I feel like you need to have somebody on your team, even if you’re two or three people, that understands data.
Matt: At any point.
Lauren: At any point. Because you have to understand what’s going to be used to run the business. Your business data that you’re going to need to show to possible investors, or you’re going to have … whatever you need to, to build your product. Now, maybe it’s oh, build the product, build the MVP, figure that out first and then acquire a data team? Possibly. In our expansion in countries, the way that we’ve worked is we are building out data teams slowly, because we have the core though. So there is the core global in Boston, and then I have my indirects lead one person. But that will build out as each country grows.
But you need some access. You need someone who’s thinking about it because essentially, how many times … This has probably happened to you, where you have a request. You’re building out a feature for the product, and all of a sudden, you realize, “Oh my God. I don’t have the data I need to do this.” Or, “We can’t measure this. I don’t know how good this is going to be.” Or, “Wait, I can’t report out on it.” So it’s not just about creating derived data solutions, it’s about how do you go to market and be able to show that what you did is working. And even with your commercial …
Matt: I will say that we’ve run into that a few times. And even to the point where we thought we had it all set up and then we build the thing, and then we’re like, “All right, now it’s time to go measure how the thing worked.” And then we find out that something wasn’t tracking properly, or it’s structured in a tough way. I think it’s really interesting, because what you’re saying is, no matter what point, even if you’re just talking about data around your revenue, not necessarily your product or your service, that kind of data, you have to have someone that understands that well.
Lauren: That’s right.
Matt: And can build some systems around that.
Lauren: But I’m also biased. I’ve been living and breathing this since I graduated college, and to me, it’s … Sometimes when you’re in a data and tech space, you don’t always remember that people don’t live and breathe this, or it’s not the way they’re thinking all the time. I’m constantly thinking about what data is there, what’s the gap, how could we use it differently? Oh gosh, this just came in. How do I make sure that we take advantage of it in the next six months if I can’t do it now? Or what … Does the system allow me to get to what I want? And if not, we’re going to have to build something now, so that I can do something that I want to do a year from now.
Matt: Yeah, yeah. Okay, so I got a question for you, and I may get a biased answer. How far is going too far with data?
Lauren: That’s a great question.
Matt: Because I feel like some companies can wind up in the situation where they’re like, “We can’t answer anything until we know exactly what’s going on with the data.” How far is too far?
Lauren: There’s two answers to that. You’ll never get 100% data. So you have to understand-
Matt: Even you? You’ve been chasing it for all your life.
Lauren: [crosstalk 00:16:17]. I’ve been chasing it. I’ll never get 100%. So you learn when to make your decisions on 80%. Sometimes you have to make a decision on 40%. You always have to think about what are you trying to answer, and how quickly do you need to do it? If your data decision is going to impact revenue or strategy, then you do as best as possible, and you’re as transparent as possible about what you have and what you don’t have.
The other answer is … This used to happen a couple of years ago, where we would have clients who said, “Oh, just send me lots of data. I want big data. I’m going to be a big data owner.” So what? I have people say, “Well, we need to architect it so that we have all this data.” Well, what are you going to use it for?
Matt: Why are you …
Lauren: Why are you collecting it?
Matt: [crosstalk 00:17:09], yeah.
Lauren: So if you have one trillion events, but you have no idea what they are or what you might use or build features for, then it’s useless. You need smart data. I’ve always said it. You need smart data. I’d much rather have smart data and a smaller amount to be able to make decisions, than just say, “Hey, dump everything in.” For instance, ad tech: really cool data in ad tech. I am thrilled with being able to spend a few years there, because the data and speed and the decisioning that needed to happen, just pushed in terms of the systems that had to be built and the algorithms that you needed to build in order to respond.
But people got caught up in just collecting everything, and well, are you using every piece of the advertising bid, or are you just using features of it, pieces of it? Are you just using IP and lat long and maybe something else? Well then, let’s focus on getting that data in, so that we can be really good about what we’re doing, rather than worrying about the fact that we’re just building out this huge database that nobody’s ever going to have access to.
Matt: Yeah. It almost becomes this game of hoarding. I’m always thinking about … Recently I was watching Netflix, Marie Kondo where she shows up and helps people organize their homes, where they have all this stuff that they bought, and all these clothes that they never wear. It’s kind of like, you have all this data that you spend all this time making sure that you have perfectly, but you’re never going to use it ’cause you’re not even thinking about how it might apply to a problem set in some way, shape or form.
Lauren: Exactly. It’s one of those jokes that you could … I picture a cartoonist saying, “Well, do you have big data? Are you in the cloud? Great, then you’re successful.”
Matt: Yeah. Check, check. Exactly. It’s funny. Okay, so I think there are probably two core groups of listeners. There’s one where you’re early on, you’re just building … you’re getting started building your company, and you’re wondering, “Do I need to just start dumping all my data into one place? Should I have my engineers go make sure that they spend all this time building these extra systems? How should I be thinking about data in my company in those early stages in terms of my resourcing?”
Lauren: I would say don’t overthink it, don’t over-build. As much as you … When you’re starting out early, you really don’t know where you’re heading, so you can’t really scale for something you don’t know yet. It’s fine to have things in Excel. What you need to do is make sure that you have … You know where the data is, you know who has access to it, you have quality control on it, and you know how you’re going to get it into a digital form so that you can use it for other things. And if you have that process in place, that’s fine, because you don’t want to over-build a huge MySQL structure that … “Oh, I’m going to connect this table to that table, and we have 10 things, and …” ’Cause it can get over-complicated, and then you’re creating havoc for yourself. It’s fine to just use Excel. Similarly, it’s fine to just use heuristics if you can’t build true models.
Start out simple, and then take it from there. But think about when you’re making the decision, whether it’s through architecture or through who you’re hiring … Don’t back yourself into a corner. Try not to. We all do; we’ve all made decisions where oh, God, we have to completely revamp how we thought about that platform, or we need to get access to something that we never thought we had before. But early on, get somebody who’s thinking data, who knows where it is, but don’t over-build.
Matt: Okay. And let’s say, for the other group, either you are on the executive team or founding team of a company that’s growing rapidly, and you’re getting to a point where your teams are starting to struggle making decisions and you’re not quite seeing the data to back all the things that are happening amongst your teams, or you’re working at a team and you’re a person and you’re trying to get access to data but you’re not sure where to go or who to go to, what should people be doing in that sort of situation? Where do they get rolling if they don’t have someone like you that’s already been building out these systems?
Lauren: Get somebody or hire somebody who can do an audit of what you have and where it is. That’s the best start. And we’ve done it, and we’ve done it both on the business side, so you’re coming from the top down to the data and tech side from the bottom up. And I think that’s important actually if you can do both, because I’m going to bring a certain bias, and I’m always thinking about okay, what is the data, where is it, what does it look like, what’s its type, how much do we have, who authored it? Things like that, whereas on the business side, you’re thinking more about how do you run the company with it. And you can have one person do that, but not always. If you don’t have anything, then start there. And then start thinking about is it already flowing into products? Do you want to start using that data differently into new products? So then what types of systems do you need to build in order to get there?
Now, depending on your industry, I’ve worked in industries or nonprofits specifically, where you are using third party. You’re using a third party data platform. You send them the data, they manage it for you, they send back reporting and whatnot. Sometimes that’s what you have to do. I hate that, because I want to have access to my data right away.
Matt: You want to control it.
Lauren: And AWS makes it really easy to have a data system, and have access to your data wherever you are. Or Google, any one of those, and there’s other proprietary systems that are definitely worth buying, but it’s easy to set up a data system. It’s hard to know what you need to do with it if you don’t understand your business. Also it’s hard to know how to scale if you really don’t know what you’re going to be doing five years from now. Although none of us know what we’re doing five years from now, so it’s really about, all right, can I … and I’m always thinking, can I do what I need to do … 12 months. And then continually … Every day, that changes. Can I do what I need to do? And sometimes, no, I don’t have access to that, and we have to build something. Or hey, paper works, and if we have to bring paper in and digest paper, and translate it in order to use it, then we’re going to do it.
Matt: And so you’re almost not only trying to figure out what are the questions that you’re answering now, but having some anticipation of what are we going to be answering in six months, and as we’re answering this question now, how do we make sure that it’s then accessible again six months from now and 12 months from now?
Matt: Yeah. Great. Well, Lauren, thank you so much for coming by. When DC said we had to talk to Lauren, it was definitely no joke. So thanks for coming. Really appreciate it.
Lauren: Thanks for inviting me. It was great.
Lauren: Yeah, thanks Matt.
Matt: Cool. All right, well, I’ll catch you next time. Thanks for listening to #Growth, and if you have any questions, feedback, anything like that, my email’s just email@example.com. Feel free to send a note. I’d love to hear from you, and I will talk to you next time. Bye.