The dangers of A/B testing

This is a great article on where A/B testing can go wrong. It rings true with me. I’ve recently been A/B testing a lot of the ads we run for Codewi.se, and I admit that, in my impatience, I’m often ready to read significance into results before enough data has come in.

Last week I tossed a coin a hundred times. 49 heads. Then I changed into a red t-shirt and tossed the same coin another hundred times. 51 heads. From this, I conclude that wearing a red shirt gives a 4.1% increase in conversion in throwing heads.

A ridiculous experiment (yes, I really did it) with a ridiculous conclusion, yet I sometimes see similarly unreliable analysis in A/B testing.

It’s logical and laudable that designers should seek data in our quest for verifiability and return on investment. But data must be handled with care, and mathematical rigour isn’t a common part of a designer’s repertoire.
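To put a number on how little the coin-toss result proves, here is a minimal sketch of a two-proportion z-test applied to the figures quoted above (49 heads in 100 tosses versus 51 in 100). The article doesn't name a test, so choosing this one is my assumption, and the function name is purely illustrative. It uses only Python's standard library.

```python
import math


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for a difference between two proportions.

    Hypothetical helper for illustration; it relies on the normal
    approximation, which is reasonable at 100 trials per group.
    """
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    # Pooled rate under the null hypothesis that both groups are the same
    pooled = (successes_a + successes_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (p_b - p_a) / std_err
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Plain shirt: 49 heads out of 100. Red shirt: 51 heads out of 100.
z, p = two_proportion_z_test(49, 100, 51, 100)
relative_lift = (51 - 49) / 49

print(f"relative lift: {relative_lift:.1%}")
print(f"z = {z:.2f}, p = {p:.2f}")
```

The "lift" comes out at about 4.1%, but the p-value is roughly 0.78: a gap this size or larger would turn up most of the time even if the shirt made no difference at all. That's a long way from the conventional 0.05 threshold, which is exactly why the conclusion is ridiculous.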

This bit sums up how reliance on tests can lead to bad design:

Logical positivism and design don’t mix – not everything we do can be empirically verified – yet some businesses fall back on A/B testing in lieu of genuine design thinking. I call this the “A/B death spiral”, and it plays out something like this:

Designer: Here’s a new design for this screen. You’ll see it has a new navigation style, tweaked colour palette and I’ve moved the main interactions to a tabbed area.

Product owner: Wow, those are pretty big changes for such a high-risk screen. I tell you what: let’s test them individually to see which of these changes works and which doesn’t…

As the proverb suggests, sometimes you can’t jump a twenty foot chasm in two ten foot leaps. Cherry-picking only those design elements that are “proven” by an A/B test can be a route to fragmented, incoherent design. It may earn marginally more money in the short term, but it becomes hard to avoid a descent into poor UX and the long-term harm this causes.
