The #1 Thing You Need to Make Your Experiments Successful

Growth by Seeking Wisdom podcast

What’s most important when running any growth experiment? Establishing a control group! Sounds simple, but if you’ve ever conducted any kind of experiment, you know it’s easy to mess up. After all, without a control, you have no way of knowing if your experiment actually made an impact.

In this episode, host Matt Bilotti runs through how to set control groups and why they matter. Plus how to accurately measure your experiments without getting lost in a web of intertwining growth hacks.

You can get #Growth on Apple Podcasts, SoundCloud, Spotify, Stitcher, or wherever you get your podcasts. Or listen to the full audio version below.

Like this episode? Be sure to leave a ⭐️⭐️⭐️⭐️⭐️⭐️ review and share the pod with your friends! You can connect with Matt Bilotti on Twitter @MattBilotti.

In This Episode

0:15 – Why control groups matter
1:23 – Definition of a control group
2:24 – Matt’s experience of falling into the trap of not having a control group
3:24 – Try to stay away from layering experiments on top of each other
4:04 – Control groups have to be during the same time period to control for variables
4:11 – Conducting a control experiment is harder than it sounds
5:21 – How do you conduct the experiment? Keeping in mind what your company already has at its disposal will help you determine if you need to pay for a service or build it yourself
5:33 – What Drift did
6:45 – Track everything
7:29 – Define what success looks like

Full Transcript

Matt Bilotti: How is it going? Welcome to another episode of #Growth here on Seeking Wisdom. Today I’m super excited to talk about why a control group matters. This is for all of you people out there that are running growth experiments or doing any kind of experimentation in your product or service and want to know is that thing working. I think it’s an incredibly important thing that I very much discounted when I moved over into growth.

At the beginning I was like, “Nah, we don’t need control groups. That’s super stodgy and it’s going to slow us down and it’s going to be a total pain.” Boy, I very much learned how incredibly important they are if you actually want to measure whether something is working or not. This was probably the very first lesson that I learned when I moved into growth. I read all the blog posts out there about experimentation and all that, and some people talked a little bit about control groups, and I think what was missing was how critical they really are. So, I want to just define a control group.

Let’s think about being a kid at a science fair. I actually use this example in another episode. If you’re trying to prove that black cloth is hotter in the sun, what would be really bad is to measure the temperature of a thermometer yesterday and then wrap it in cloth today and measure it and say that you have evidence to say that it was hotter today. The thing is, it’s not going to give you good results because you’re missing the impact of a gigantic variable, which is the weather. I know that this sounds probably really primitive and kind of silly to be talking about such a basic thing from a concept of growth, but it’s so, so easy to get into a mindset where you say, “All right, we’re going to do growth stuff. We’re going to start to experiment. We’re going to really see if we can drive certain numbers,” and then fall into this trap. Let me tell you exactly how we landed in this trap.

We wanted to start sending activation emails. If someone signs up, we didn’t have an email that’s sent to them when they signed up, so we started sending emails. We were like, all right, this is an experiment that we’re going to run. We’re going to turn on these emails and see how the emails performed. Once we got through that and they were running for a while, we realized that we didn’t set a control, so we don’t actually know if it was the email that increased someone’s likelihood to activate and use the product. Maybe the marketing team started changing the homepage and the homepage has a new pitch and people connected to that pitch better, and now they’re more likely to actually get started once they sign up. You’re missing all the other context of what’s happening, and the only way to know if your thing worked is to take a group during the same time period that you’re making a change and do nothing with that group.

So start there. Over time, you can start to layer experiments on top of one another. Highly recommended to pretend that it’s not even a thing to begin with because it’s really easy to create this intense, insane web of experiments, and you could just wind up at a point where you can’t measure anything, spent all this time and energy and you just simply can’t understand the changes that you made in isolation because you had so many other variables going on at the same time. So super, super important, the control groups have to be during the same time period, again, because you don’t know if something else is the thing that impacted it.

The really interesting thing about this that I’ve learned now that I’ve gotten further into growth is that having a control experiment sounds really simple. In practice, it is very complicated because you basically have to have some kind of infrastructure to pick a random sampling of a group of people to make a change for or to not make a change for. One way that I was thinking of hacking this early on was, oh, well, I can just take … so we’re running the experiment. We’ll turn on this new feature for a certain group of people. I’ll just go into the database and find everyone whose name starts with the letter M and then go ahead and turn it on for them. But that’s really bad because then you just create this world where you don’t have a source of truth of who is getting what and who should be getting what and when they should be seeing which things, and then as you try to roll out more experiments, it just becomes a huge headache. Trust me, I’ve been there.
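
As a rough illustration of the random-sampling idea Matt describes (not Drift's actual implementation), one common approach is to hash the user ID together with the experiment name and use the result to pick a group. The function name assign_group, the treatment_share parameter, and the experiment name below are all hypothetical, chosen only for this sketch.

```python
import hashlib


def assign_group(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing the experiment name together with the user ID means the same
    user always lands in the same group for a given experiment, while
    different experiments split the user base independently.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    # Map the first 32 bits of the hash to a number in [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    return "treatment" if bucket < treatment_share else "control"


# Hypothetical usage: split sign-ups for an activation email experiment.
group = assign_group("user-1234", "activation-email")
```

Unlike picking everyone whose name starts with M, the split here does not depend on any attribute of the user, and anyone can recompute who should be in which group from the user ID and experiment name alone.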

So, the question then is, well, how do you do this? The age-old question it comes back to is do you build this yourself or do you buy or pay for some service that does it for you? We looked at a few services, and ultimately we’ve decided to build it out ourselves. One of the core reasons that we’re doing that is because we already had a gating system in our product. What is a gating system? A gating system is a way to basically say, “We’re going to give all these accounts this new feature. We’re going to roll out a new feature. We’re going to flag it so that only these accounts get it, or take a group from the past and give them access to it.”

So, we had a gating system, and because we had a gating system, a lot of the infrastructure to choose who gets what, and when, was already there. For a lot of these A/B testing tools out there for product changes, one of the core features is a gating system. If you don’t have a gating system, first of all, you should totally get one. It’s really important for rolling out things in your product. If you do have a gating system, then look at it to say, how can we start to use this to break out random samplings of our user base to give experiments to?
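
To make the gating idea concrete, here is a minimal sketch of a feature gate that also records a randomly sampled control group, so there is one place to look up who is getting what. The FeatureGate class and its methods are hypothetical and kept in memory for brevity; a real gating system would persist this data, and nothing here reflects how Drift built theirs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class FeatureGate:
    """A feature flag that doubles as the record of experiment groups."""
    name: str
    enabled_accounts: set = field(default_factory=set)
    control_accounts: set = field(default_factory=set)

    def start_experiment(self, accounts, sample_size, seed=0):
        """Randomly pick equal-sized treatment and control groups."""
        if sample_size * 2 > len(accounts):
            raise ValueError("not enough accounts for both groups")
        rng = random.Random(seed)  # seeded so the draw can be reproduced
        sampled = rng.sample(sorted(accounts), sample_size * 2)
        self.enabled_accounts.update(sampled[:sample_size])
        self.control_accounts.update(sampled[sample_size:])

    def is_enabled(self, account_id):
        """The product checks this before showing the gated feature."""
        return account_id in self.enabled_accounts

    def group_of(self, account_id):
        """Return 'treatment', 'control', or None if not in the experiment."""
        if account_id in self.enabled_accounts:
            return "treatment"
        if account_id in self.control_accounts:
            return "control"
        return None


# Hypothetical usage: gate a new dashboard for 100 of 1,000 accounts,
# holding back another 100 as the control group.
gate = FeatureGate("new-dashboard")
gate.start_experiment({f"acct-{i}" for i in range(1000)}, sample_size=100)
```

Both groups are drawn from the same user base at the same time, which is the property the control group needs. Accounts outside the experiment simply do not see the feature and are left out of the comparison.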

One last really, really important point is tracking, so tracking all these things that you’re doing. Again, I know that it sounds silly. I found myself in this trap and I think I’m relatively intelligent, but it’s really easy when you’re trying to move fast to lose sight of some of these basics, and so just make sure that you’re tracking everything that you’re doing. If you’re not properly flagging who got this experiment and who didn’t, or sending an event when they interacted with that experiment, then it’s not going to matter at all. If you add a new button and you’re not tracking how many times the button is clicked, it doesn’t matter that you shipped the new button. You need some way to measure against that over time.
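
The tracking Matt describes boils down to recording two things: which group each user was in, and what they did. A minimal sketch follows, assuming an in-memory event list; the ExperimentTracker class and the event names "exposed" and "button_clicked" are hypothetical, and a real setup would send these events to an analytics store.

```python
from collections import Counter
from datetime import datetime, timezone


class ExperimentTracker:
    """Records experiment events tagged with the user's group."""

    def __init__(self):
        self.events = []

    def track(self, user_id, experiment, group, event):
        """Store one event along with who triggered it and in which group."""
        self.events.append({
            "user_id": user_id,
            "experiment": experiment,
            "group": group,
            "event": event,
            "at": datetime.now(timezone.utc),
        })

    def count(self, experiment, event):
        """Count occurrences of an event per group for one experiment."""
        counts = Counter()
        for e in self.events:
            if e["experiment"] == experiment and e["event"] == event:
                counts[e["group"]] += 1
        return counts


# Hypothetical usage: log exposure for both groups, then log clicks.
tracker = ExperimentTracker()
tracker.track("user-1", "new-button", "treatment", "exposed")
tracker.track("user-2", "new-button", "control", "exposed")
tracker.track("user-1", "new-button", "treatment", "button_clicked")
clicks_by_group = tracker.count("new-button", "button_clicked")
```

Logging an exposure event for the control group as well is what makes the later comparison possible, since it gives the denominator for both sides.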

And last, but not least, define success ahead of time in the hypothesis that you build into your experiment. If your experiment is, “We’re going to change the dashboard and we believe that changing the dashboard this way will result in 10% higher activation rates,” define that, track against it, and have a control group and make sure that you can definitively say at the end of the day this thing worked or it didn’t, or else you’re going to find yourself in a really tough scenario where you can’t really pull a true answer. The worst possible thing that you can have is you’ve done this a few times and then you start to pull numbers of, “Yeah, we think that that was, we think that that one worked,” because then you immediately start to lose credibility. The changes that you’re making are possibly just spinning your wheels because you don’t have the control groups to say if the thing that you did changed it or not.
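
One way to check a hypothesis like "10% higher activation rates" against a control group is to compute the relative lift and a two-proportion z-test. The episode does not name a statistical method, so the test choice here is an assumption, as are the function name activation_lift and the example counts, which are made up for illustration.

```python
import math


def activation_lift(control_activated, control_total,
                    treatment_activated, treatment_total):
    """Compare activation rates between control and treatment.

    Returns (relative_lift, z_score, two_sided_p_value) using a
    pooled two-proportion z-test.
    """
    p_control = control_activated / control_total
    p_treatment = treatment_activated / treatment_total
    relative_lift = (p_treatment - p_control) / p_control

    # Pooled rate under the assumption that the change had no effect.
    pooled = (control_activated + treatment_activated) / (control_total + treatment_total)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / control_total + 1 / treatment_total))
    z_score = (p_treatment - p_control) / std_err

    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z_score) / math.sqrt(2))
    return relative_lift, z_score, p_value


# Made-up numbers: 200 of 1,000 control users activated versus
# 220 of 1,000 users who saw the new dashboard.
lift, z, p = activation_lift(200, 1000, 220, 1000)
```

With these made-up counts the relative lift comes out to 10%, yet the p-value is well above the usual 0.05 threshold, so a result of that size on groups this small could plausibly be noise. Deciding the target and the sample size before the experiment starts is what lets you say definitively whether the thing worked.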

So, again, control groups matter. Trust me. They really, really matter, and you’re going to have a hard time if you don’t have a good way to set them up in the first place. Thanks for listening to #Growth. I’ll catch you next time.