The Myth of the Single KPI for Testing

Continuous Improvement through testing is a simple idea. That’s no surprise. The simplest, most obvious ideas are often the most powerful. And testing is a powerful idea. An idea that forms and shapes the way digital is done by the companies that do it best. And those same companies have changed the world we live in.

If testing and continuous improvement is a process, analytics is the driver of that process; and as any good driver knows, the more powerful the vehicle, the more careful you have to be as a driver. Testing analytics seems so easy. You run a test, you measure which worked better. You choose the winner.

It’s like reading the scoreboard at a football game. It doesn’t take a lot of brains to figure out who’s ahead.

Except it’s usually not that easy.

Sporting events just are decided by the score. Games have rules and a single goal. Life and business mostly don’t. What makes measuring tests surprisingly tricky is that you rarely have a single unequivocal measure of success.

Suppose you add a merchandising drive to a section of your store or on the product detail page of your website. You test. And you generate more sales of that product.

Success!

Success?

Let’s start with the obvious caveat. You may have generated more sales, but you gave up margin. Was it worth it? Usually, the majority of buyers with a discount would have bought without one. Still, that kind of cannibalization is fairly easy to baseline and measure.
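To make the margin question concrete, here is a minimal sketch in Python. Every name and number in it is hypothetical, and it assumes equal-sized control and test groups with every test-group unit sold at the discounted price.

```python
def incremental_margin(units_control, units_test, price, unit_cost, discount):
    """Margin gained (or lost) by the test group relative to control.

    Illustrative assumptions: both groups are the same size, and every
    unit in the test group sold at the discounted price.
    """
    margin_control = units_control * (price - unit_cost)
    margin_test = units_test * (price - discount - unit_cost)
    return margin_test - margin_control


# Hypothetical figures: the discount lifts unit sales from 100 to 130.
delta = incremental_margin(
    units_control=100, units_test=130, price=50.0, unit_cost=30.0, discount=5.0
)
print(delta)  # negative here: more units sold, less total margin
```

With these made-up inputs the test group sells 30 percent more units and still earns less margin than control, which is exactly the case where "more sales" is not the same as a win.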

Here’s a trickier problem. What else changed? Because when you add a merchandising drive to a product, you don’t just shift that product’s buying pattern. The customer who buys might have bought something else. Maybe something with a better margin.

To people who don’t run tests, this may come as a bit of a surprise. Shouldn’t tests be designed to limit their impact so that the “winner” is clear?

Part of a good experimental design is, indeed, creating a test that limits external impacts. But this isn’t the lab. Limiting the outside impact of a test isn’t easy and you can never be sure you’ve actually succeeded in doing that unless you carefully measure.

Worse, the most important tests usually have the most macro-impact. Small creative tests can often be isolated to a single win-loss metric. Sadly, that metric usually doesn’t matter or doesn’t move.

If you need proof of that, check out this meta-study by Will Browne & Mike Jones (those names feel like generic test products, right?) that looked at the impact of different types of test. Their finding? UI changes of the color and call-to-action type had, essentially, zero impact. Sadly, that’s what most folks spend all their time testing. (http://www.qubit.com/sites/default/files/pdf/qubit_meta_analysis.pdf)

If your test actually changes shopper behavior, believe me, there will be macro impacts.

It’s usually straightforward to measure the direct results of a store test. It’s often much harder to determine the macro impact. But it’s something you MUST look at. The macro impact can be as important as the direct impact, or more so. What’s more, it often – I’ll say usually – runs in the opposite direction.

So if you fail to measure the macro impact of a store test and you focus only on the obvious outcome, you’ll often pick the wrong result or grossly overstate the impact. Either way, you’re not using your analytics to drive appropriately.

Of course, one of the very real challenges you’ll face is that many tools don’t measure the macro impact of tests at all. In the digital world, the vast majority of dedicated testing tools require you to focus on a single KPI and provide absolutely no measurement of macro impacts. They simply assume that the test was completely compartmentalized. That works okay for things like email testing, but it’s flat-out wrong when it comes to testing store or website changes.

If your experiment worked well enough to change a shopper’s behavior and got them to buy something, the chances are quite good that it changed more than just that behavior. You may have given up margin. You likely lost some sales elsewhere. You almost certainly changed what else in the store or the site the shopper engaged with. That stuff matters.
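One way to see how the direct result and the macro result can diverge is to compare a single-product KPI with a whole-basket measure. The sketch below is only an illustration: the product names, unit counts, margins and visitor counts are invented, and margin per visitor is just one possible choice of macro metric.

```python
def margin_per_visitor(sales, visitors):
    """Total margin across all products divided by visitors in the group.

    `sales` maps a product name to (units_sold, margin_per_unit).
    """
    total = sum(units * margin for units, margin in sales.values())
    return total / visitors


# Hypothetical results for two equal-sized groups of shoppers.
visitors = 1000
control = {"promoted": (100, 20.0), "alternative": (80, 25.0)}
test = {"promoted": (130, 15.0), "alternative": (60, 25.0)}

# Single-KPI view: units of the promoted product only.
direct_lift = test["promoted"][0] - control["promoted"][0]

# Macro view: margin per visitor across everything the shopper bought.
net_change = margin_per_visitor(test, visitors) - margin_per_visitor(control, visitors)

print(direct_lift, round(net_change, 2))
```

In this invented example the promoted product gains 30 units, while margin per visitor falls because some shoppers switched away from a higher-margin alternative. A tool that reports only the first number would call the test a winner.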

In the store world, most tools don’t measure enough to give you even the immediate win-loss results. To heck with the rest of the story. So it can be tempting, when you first have real measurement, to focus on the obvious: which test won. Don’t.

In some of my recent posts, I’ve talked about the ways in which DM1 – our store testing and measurement platform – lets you track the full customer journey, segment, funnel and compare. Those capabilities are key to doing test measurement right. They give you the ability to see the immediate impact of a test AND the ways in which a change affected macro customer behavior.

You can see an example of how this works (and how important that macro behavior is in store layout) in this DM1 video that focuses on the Comparison capabilities of the tool.

https://www.youtube.com/watch?v=lbpaeSmaE74&t=13s

It’s the right way to use all that power a store testing program can provide.