What Chickens Taught Me About A/B Testing
by Allison Otting • June 25, 2015
Early this Spring, some friends and I bought six little chicks. Cute, adorable little chicks.
If you’ve ever raised chickens, you know that roosters are kind of like teenage boys: loud, obnoxious and prone to harassing the females. Roosters are illegal in my city, so my friends and I knew that we didn’t want one. We had chosen five chicks that had all been sexed as hens, plus one little gamble chick.
As they grew, we noticed that our gamble chick was spunkier than the other five. There was a chance we had a rooster, but with this breed it’s impossible to tell until they’re a couple of months old. Since they were all supposed to be hens, I just figured the difference had to do with the chick’s breed.
Well, after two months I was certain that it was a female. It didn’t have spurs or a crest, so we thought we were safe.
Unfortunately, one of us woke up early and found the little chicken (pathetically) trying to crow. We had a rooster on our hands and we had to give him away.
Honestly, it was heartbreaking, even though the little guy was a pain in the butt.
Nice Story, But What Does This Have to Do with A/B Testing?
With my chickens, I jumped to conclusions before enough time had passed to get accurate data. I knew better. I had talked to other chicken owners who had all told me the same thing. You can’t tell male and female chickens apart for at least two months.
The problem was, I wanted the chicken to be a hen. So, at one month, I had convinced myself that he was a she. My data, chick and conclusions were immature, but I was even telling people he was most certainly a hen because I was emotionally attached to him.
It’s easy to make the same mistake with our marketing efforts. Early results look encouraging, even dramatically positive, and we jump to conclusions. Then we get attached to those conclusions, which only makes matters worse: we end up ignoring data that tells us we’re wrong, or we make a decision before enough data has come in!
Unfortunately, this sort of mistake happens all the time in A/B testing, especially among new marketers or on slow-traffic campaigns.
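If you want a quick gut check before calling a winner, a simple two-proportion z-test will tell you whether an early lift is more than noise. Here’s a minimal sketch in Python — the visitor and conversion counts are made up for illustration, not real test data:

```python
from math import sqrt, erfc

def two_proportion_z_test(conv_a, visitors_a, conv_b, visitors_b):
    """Return the z-score and two-tailed p-value for an A vs. B conversion test."""
    p_a = conv_a / visitors_a
    p_b = conv_b / visitors_b
    pooled = (conv_a + conv_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (p_b - p_a) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-tailed p-value from the normal distribution
    return z, p_value

# Hypothetical numbers: 40 conversions from 1,000 visitors on the original
# vs. 46 from 1,000 on the variant -- a "15% lift" that looks exciting.
z, p = two_proportion_z_test(40, 1000, 46, 1000)
print(f"z = {z:.2f}, p = {p:.3f}")  # p comes out well above 0.05, so no winner yet
```

With numbers like these, the p-value is around 0.5 — the kind of “lift” that can evaporate as more traffic comes in.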
It even gets me sometimes. For example, last week I started writing a case study about a client with great new results. In fact, I finished the entire blog post and then realized that the data I was going to use to back everything up had changed dramatically.
Don’t Count Your Landing Pages Before They Hatch
Here’s what happened: We have a client that provides surety bonds for all sorts of situations. Not surprisingly, their landing page needed to exude the sort of expertise and trust you expect from a bond company. So, we set up their landing page to provide a little bit of information about their services, some testimonials and a multi-step form for a bond quote.
The page was performing well enough, but after running a dozen new variant tests on it, nothing was moving the needle in any substantial way. We tried headlines, buttons and even a few full redesigns. We were starting to feel stumped…and then I looked at our testimonials.
Social proof can be great, but this seemed like one of those times where it might be backfiring. The testimonials were a bit sketchy looking when it came to content and photos, so I decided that it was time for a change.
The “fix”
At first, I tried reformatting the testimonials to look like pull quotes without photos. That didn’t do the trick. I moved them around to see if position would do anything. Nada. Finally, I thought I’d try a bit of existence testing and remove the testimonials completely.
Guess what? Over the next two months we saw a steady 15% bump in conversions! It was a miracle.
I concluded that the testimonials were hurting the landing page and we killed the original variant off. I was pretty proud of myself for figuring out the problem, so I wrote up a blog post about my results and was just about ready to publish…until I went back to take the data screenshots I needed.
Instead of great results, I found a surprise.
Not only had the overall conversion rate dropped, but it was now below that of the original variant it had beaten!
What did I do wrong?
Honestly, there are a lot of possible explanations for this sudden change in results.
- It could have been a generally bad month for the industry.
- The test might have been set up wrong from the beginning.
- Maybe I just didn’t wait long enough to get the results.
Unfortunately, this change leaves me in an awkward position. My great winning test is now a dud and I’m not really sure why. In hindsight, we didn’t start our variants at the same time, so the original has a lot more backlogged data in its conversion rate, which makes it difficult to tell what is really going on.
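One way to untangle that backlog problem is to compare the variants only over the dates they were both live, instead of using their lifetime conversion rates. Here’s a rough sketch of the idea in Python — the dates, counts and variant names are all hypothetical:

```python
from datetime import date

# {date: (visitors, conversions)} per variant -- made-up daily numbers
original = {
    date(2015, 3, 1): (120, 5),   # original was live before the new variant existed
    date(2015, 4, 1): (130, 6),
    date(2015, 5, 1): (125, 4),
}
no_testimonials = {
    date(2015, 4, 1): (128, 7),
    date(2015, 5, 1): (122, 5),
}

# Only keep the days both variants were actually running
shared_days = original.keys() & no_testimonials.keys()

def rate(daily_stats, days):
    visitors = sum(daily_stats[d][0] for d in days)
    conversions = sum(daily_stats[d][1] for d in days)
    return conversions / visitors

print(f"Original (shared window): {rate(original, shared_days):.1%}")
print(f"Variant  (shared window): {rate(no_testimonials, shared_days):.1%}")
```

Comparing apples to apples like this won’t rescue a bad test, but it at least strips out the months of traffic the original collected before the challenger ever existed.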
Overall, the whole situation has left me feeling more confused than enlightened.
Conclusion
The moral of the story is that—whether you’re raising chickens or split-testing landing pages—you can’t let your conclusions outrun your data.
Fortunately, we already have another variant beating both of the originals, so it looks like we’ll end up with a winner after all. It turns out that people buying bonds like more information about said bonds. BUT, I don’t want to jump to conclusions.
After all, I’ve promised myself that I’d stop making any assumptions about my landing pages or chickens.
Ever find yourself in a similar situation? What did you learn?