# Machine Learning and Optimal Store Path

My previous post covered the first half of my presentation on Machine Learning (ML) and store analytics at the Toronto Symposium. Here, I’m going to work through the case study on using ML to derive an optimal store path. For that analysis, we used our DM1 platform to source, clean, map and aggregate the data and then worked with a data science partner (DXi) on the actual analysis.

Why this problem?

Within DM1 we feel pretty good about the way we’ve built out visualizations of the store data that are easy to use and surprisingly powerful. The Full Path View, Funnel View and Store Layout View all provide really good ways to explore shopper paths in the store.

But for an analyst, exploring data and figuring out a model are utterly different tasks. A typical store presents a nearly infinite number of possible paths – even when the paths are aggregated up to section level. So there’s no way to just explore the paths and find optimal ones.

Even at the most basic level of examining individual shopper paths, deciding what’s good and bad is really hard. Here are two shopper paths in a store:

Which is better? Does either have issues? It’s pretty hard to know.

Why Machine Learning?

Optimal store pathing meets the basic requirements for using supervised ML – we have a lot of data and we have a success criterion (checkout). But ML isn’t worth deploying on every problem that has a lot of data and a success criterion. I think about it this way – if I can get what I want by writing simple algorithmic code, then I don’t need ML. In other words, if I can write (for example) a sort and then some simple If-Then rules that will identify the best path or find problem path points, then that’s what we’ll do. If, for example, I just wanted to identify sections that didn’t convert well, it would be trivial to do that. I have a conversion efficiency metric, I sort by it (ascending) and then I take the worst performers. Or maybe I have a conversion threshold and simply pick any section that performs worse. Maybe I even calculate a standard deviation and select any section that is worse than one standard deviation below the average section conversion efficiency. All easy.
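
To make that concrete, here’s a minimal sketch of the “no ML needed” version – the section names and conversion numbers are invented for illustration:

```python
# Flag sections whose conversion efficiency is more than one standard
# deviation below the store average -- the "no ML needed" approach.
from statistics import mean, stdev

# Hypothetical conversion-efficiency numbers per section (illustrative only).
sections = {
    "Casual Shoes": 0.34, "Denim": 0.22, "Clearance": 0.11,
    "Accessories": 0.18, "Outerwear": 0.25,
}

avg = mean(sections.values())
sd = stdev(sections.values())
threshold = avg - sd

worst_first = sorted(sections.items(), key=lambda kv: kv[1])      # ascending sort
flagged = [name for name, eff in worst_first if eff < threshold]  # one-sd rule

print(f"store average: {avg:.2f}, threshold: {threshold:.2f}")
print("underperforming sections:", flagged)
```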

But none of those things are really very useful when it comes to finding poor path performance in a robust fashion.

So we tried ML.

The Analysis Basics

The analysis was focused on a mid-sized apparel store with around 25 sections. We had more than 25,000 shopper visits. That may not seem like very much if you’re used to digital analytics, but it’s a pretty good behavior base for a store. In addition to the basic shopper journey, we also had Associate interaction points (and time of interaction), and whether or not the shopper converted. The goal was to find potential store layout problems and understand which parts of the store contributed to (or subtracted from) overall conversion efficiency.

Preparing the Data

The first step in any analysis (once you know what you want) is usually data preparation.

Our data starts off as a stream of location events. Each location event has X, Y, Z coordinates that are offset from a zero point in the store. In the DM1 platform, we take that data and map it against a digital planogram capability that keeps a full, historical record of the store. That tells us what shoppers actually looked at and where they spent time. This is the single most critical step in turning the raw data into something that’s analytically useful.
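
Conceptually, the mapping step works something like the sketch below – each event is tested against the section boundaries defined in the planogram. The section rectangles and events here are invented for illustration; the real planogram model is richer (and versioned over time).

```python
from collections import defaultdict

# Hypothetical planogram: section name -> (x_min, y_min, x_max, y_max) in meters
# measured from the store's zero point. Real planograms are dated/versioned.
planogram = {
    "Entrance":     (0.0, 0.0, 5.0, 4.0),
    "Denim":        (5.0, 0.0, 12.0, 6.0),
    "Casual Shoes": (12.0, 0.0, 20.0, 8.0),
}

def section_for(x, y):
    """Return the section whose rectangle contains the point, else None."""
    for name, (x0, y0, x1, y1) in planogram.items():
        if x0 <= x < x1 and y0 <= y < y1:
            return name
    return None

def dwell_by_section(events):
    """events: list of (timestamp_seconds, x, y) sorted by time for one shopper."""
    dwell = defaultdict(float)
    for (t0, x, y), (t1, _, _) in zip(events, events[1:]):
        sec = section_for(x, y)
        if sec:
            dwell[sec] += t1 - t0  # credit the gap to the section the shopper was in
    return dict(dwell)

events = [(0, 1.0, 1.0), (30, 6.2, 2.0), (95, 14.0, 3.5), (160, 14.5, 4.0)]
print(dwell_by_section(events))
```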

Since we also track Associates, we can track interaction points by overlaying the Associate data stream on top of the shopper stream. This isn’t perfect – it’s easy to miss short interactions or be confused by a crowded store – but particularly when it’s app-to-app tracking it works pretty well. Associate interaction points are hugely important in the store (as the subsequent analysis will prove).
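
The overlay logic can be thought of as a proximity-plus-duration rule: when an associate device and a shopper device stay within some radius for a minimum amount of time, count an interaction. The radius and duration thresholds below are placeholders, not production tuning.

```python
import math

def interactions(shopper_pts, associate_pts, radius_m=1.5, min_secs=20):
    """
    shopper_pts / associate_pts: lists of (timestamp_seconds, x, y), assumed to be
    time-aligned samples. Returns (start, end) windows where the two devices
    stayed within radius_m for at least min_secs. Thresholds are illustrative.
    """
    windows, start = [], None
    for (t, sx, sy), (_, ax, ay) in zip(shopper_pts, associate_pts):
        close = math.hypot(sx - ax, sy - ay) <= radius_m
        if close and start is None:
            start = t                      # interaction window opens
        elif not close and start is not None:
            if t - start >= min_secs:
                windows.append((start, t)) # long enough to count
            start = None
    if start is not None and shopper_pts[-1][0] - start >= min_secs:
        windows.append((start, shopper_pts[-1][0]))
    return windows
```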

Step 3 is knowing whether and when a shopper purchased. Most of the standard machine learning algorithms require having a way to determine if a behavior pattern was successful or not – that’s what they are optimizing to. We’re using purchase as our success metric.

The underlying event data gets aggregated into a single row per shopper visit. That row contains a visit identifier, a start and stop time, an interaction count, a first interaction time, a last interaction time, the first section visited, the time spent in each section and, of course, our success metric – a purchase flag.

That’s it.
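
If it helps to see it laid out, the visit-level record looks roughly like this sketch – the field names are paraphrased from the description above rather than taken from DM1’s actual export schema:

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class VisitRow:
    visit_id: str
    start_time: float                 # epoch seconds
    stop_time: float
    interaction_count: int
    first_interaction_time: Optional[float]
    last_interaction_time: Optional[float]
    first_section: str
    seconds_by_section: Dict[str, float] = field(default_factory=dict)
    purchased: bool = False           # the success metric the ML optimizes to

row = VisitRow(
    visit_id="v-000123", start_time=0, stop_time=1260,
    interaction_count=2, first_interaction_time=300, last_interaction_time=900,
    first_section="Entrance",
    seconds_by_section={"Denim": 240.0, "Casual Shoes": 610.0},
    purchased=True,
)
```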

The actual analytic heavy lifting was done by DXi on their machine learning platform. They use an ensemble approach – throwing the kitchen sink at the problem by using 25+ different algorithms to identify potential winners/losers (if you’d like more info or an introduction to them, drop me a line and I’ll connect you).

Findings

Here’s some of the interesting stuff that surfaced, plucked from the case study I gave at the Symposium:

One of the poorest performing sections – not picked as important by a single DXi ML algorithm – sits right smack dab in the middle of the store. That central position really surprised us. Yes, as you’ll see in a moment, the store has a successful right-rail pattern – but this was a fairly trafficked spot with good sightlines and easy flow into high-value areas of the store.

Didn’t work well though. And that’s definitely worth thinking about from a store layout perspective.

One common browsing behavior for shoppers is a race-track pattern – navigating around the perimeter of the store. There’s a good example of that on the right-side image I showed earlier:

The main navigation path through the store is the red rectangle (red because this shopper spent considerable time there) – and you can see that while the shopper frequently deviated from that main path, their overall journey was a circuit around the store.

The ML algorithms don’t know anything about that – but they did pick out the relevant sections along that path in the analyzed store as really important for conversion.

We took that to mean that the store is working well for that race-track shopper type. An important learning.

For this particular store, casual shoes was picked as important by every ML algorithm – making it the most important section of the store. It also had the largest optimal time value – and clearly rewarded more time with higher conversion rates. Shoes, of course, are going to be this way. They’re not a grab-and-go item. So there’s an element of the obvious here – something you should expect when you unleash ML on a dataset (and hey – most analytics projects will, if they work at all, vacillate between the interesting and the obvious). But even compared to other types of shoes, this section performed better and rewarded more time spent – so there is an apples-to-apples part of this comparison as well.

The next finding was an interesting one and illustrates a bit of the balance you need to think about between the analyst and the algorithm. The display in question was located fairly close to the cash-wrap, on a common path to checkout. It didn’t perform horribly in the ML – some of the DXi algorithms did pick it as important for conversion. On the other hand, it was one of the few sections with a negative weighting on time spent – more time spent meant a lower likelihood of conversion. We interpreted that combination as indicating that the section’s success was driven by geography, not efficiency. It’s kind of like comparing Saudi Arabia to U.S. shale drillers. Based purely on the numbers, Saudi Arabia looks super efficient and successful, with the lowest cost per barrel of oil extracted in the world. But when you factor in the geographic challenges, the picture changes completely. Saudi Arabia has the easiest path to oil recovery in the world; shale producers face huge and complex technical challenges and still manage to be price competitive. Geography matters, and that’s just a core fact of in-store analytics.

Our take on the numbers when we sifted through the DXi findings was that this section was actually underperforming. It might take a real A/B test to prove that, but regardless I think it’s a good example of how an analyst has to do more than run an algorithm. It’s easy to fool even very sophisticated algorithms with strong correlations, and much of our post-algorithm analysis was about understanding how the store geography and the algorithm results play together.

In addition to navigation findings like these, the analysis also included the impact of Associates on conversion. In general, the answer we got was the more interactions the merrier (at the cash register). Not every store may yield the same finding (and it’s also worth thinking about whether a single conversion optimization metric is appropriate here – in my Why Analytics Fails talk I argue for the value in picking potentially countervailing KPIs like conversion and shopper satisfaction as dual optimization points).

Even after multiple interactions, additional interactions had a positive impact on sales.

This should be obvious but I’ll hearken back to our early digital analytics days to make a point. We sometimes found that viewing more pages on a Website was a driver of conversion success. But that didn’t mean chopping pages in half (as one client did) so that the user had to consume more pages to read the same content was a good strategy.

Just because multiple Associate interactions in a store with a normal interaction strategy created lift, it doesn’t mean that, for example, having your Associates tackle customers (INTERACTIOOOON!!!) as they navigate the floor will boost conversion.

But in this case, too much interaction was a legitimate concern going in. And the data indicates that – at least as measured by conversion rates – the concern did not manifest itself in shopper turn-off.

If you’re interested in getting the whole deck – just drop me a note. It’s a nice intro into the kind of shopper journey tracking you can do with our DM1 platform and some of the ways that machine learning can be used to drive better practice. And, as I mentioned, if you’d like to check out the DXi stuff – and it’s interesting from a pure digital perspective too – drop me a line and I’ll introduce you.

# Machine Learning and Store Analytics

Not too long ago I spoke in Toronto at a Symposium focused on Machine Learning to describe what we’ve done and are trying to do with Machine Learning (ML) in our DM1 platform and with store analytics in general. Machine Learning is, in some respects, a fraught topic these days. When something is high on the hype cycle, the tendency is to either believe it’s the answer to every problem or to dismiss the whole thing as an illusion. The first answer is never right. The second sometimes is. But ML isn’t an illusion – it’s a real capability with a fair number of appropriate applications. I want to cover – from our hands-on, practical perspective – where we’ve used ML, why we used it, and show a case study of some of the results.

Just what is Machine Learning?

In its most parochial form, ML is really nothing more than a set of (fairly mature) statistical techniques dressed up in new clothes.

Here’s a wonderful extract from the class notes of a Stanford University expert on ML: (http://statweb.stanford.edu/~tibs/stat315a/glossary.pdf)

It’s pretty clear why we should all be talking ML not statistics! And seriously, wasn’t data science enough of a salary upgrade for statisticians without throwing ML into the hopper?

Unlike big data, I have no desire in this case to draw any profound definitional difference between ML and statistics. In my mind, I think of ML as being the domain of neural networks, deep learning and Support Vector Machines (SVMs). Statistics is the stuff we all know and love like regression and factor analysis and p values. That’s a largely ad hoc distinction (and it’s particularly thin on the unsupervised learning front), but I think it mostly captures what people are thinking when they talk about these two disciplines.

What Problems Have We Tried to Solve with ML

At a high-level, we’ve tackled three types of problems with ML (as I’ve casually defined it): improving data quality, shopper type classification, and optimal store path analysis.

Data quality is by far the least sexy of these applications, but it’s also the area where we’ve done the most work and where the everyday application of our platform takes actual advantage of some ML work.

When we set up a client instance on DM1, there are a number of highly specific configurations that control how data gets processed. These configurations help guide the platform in key tasks like distinguishing Associate electronic devices from shopper devices. Why is this so important? Well, if you confuse Associates with shoppers, you’ll grossly over-count shoppers in the store. Equally bad, you’ll miss out on a real treasure trove of Associate data, including when Associate/Shopper interactions occur, the ratio of Shoppers to Associates (STARs), and the length and outcome of interactions. That’s all very powerful.

If you can identify store devices explicitly, it’s easy enough to flag them in software. But we wanted a system that would do the same work without having to formally identify store devices. Not only does this make it a lot easier to set up a store, it fixes a ton of compliance issues. You may tell Associates not to carry their own devices on the floor, but if you think that rule is universally followed you’re kidding yourself. So even if you BLE-badge employees, you’re still likely picking up their personal phones as shopper devices. By adding behavioral identification of Associates, we make the data better and more accurate while minimizing (in most cases removing) operational impact.

We use a combination of rule-based logic and ML to classify Associate behavior on ALL incoming devices. It turns out that Associates behave quite differently in stores than shoppers. They spend more time. Go places shoppers can’t. Show up more often. Enter at different times. Exit at different times. They’re different. Some of those differences are easily captured in simple If-Then programming logic – but often the patterns are fairly complex. They’re different, but not so easily categorized. That’s where the ML kicks in.
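
As a rough illustration of that hybrid approach (the features, thresholds, and model choice here are invented for the sketch, not the production logic): the obvious cases get caught by rules, and the ambiguous ones go to a classifier trained on labeled device histories.

```python
from sklearn.ensemble import GradientBoostingClassifier

def device_features(h):
    """h: per-device behavioral summary over some window (illustrative fields)."""
    return [
        h["total_minutes_in_store"],
        h["distinct_days_seen"],
        h["minutes_in_staff_only_areas"],
        h["median_entry_hour"],
    ]

def rule_label(h):
    """Easy cases first: behavior no shopper plausibly exhibits."""
    if h["minutes_in_staff_only_areas"] > 10:
        return "associate"
    if h["distinct_days_seen"] >= 5 and h["total_minutes_in_store"] > 600:
        return "associate"
    return None  # ambiguous -> hand off to the model

# Toy labeled histories standing in for real training data.
train = [
    ({"total_minutes_in_store": 720, "distinct_days_seen": 6, "minutes_in_staff_only_areas": 0, "median_entry_hour": 9},  "associate"),
    ({"total_minutes_in_store": 35,  "distinct_days_seen": 1, "minutes_in_staff_only_areas": 0, "median_entry_hour": 14}, "shopper"),
    ({"total_minutes_in_store": 650, "distinct_days_seen": 4, "minutes_in_staff_only_areas": 2, "median_entry_hour": 10}, "associate"),
    ({"total_minutes_in_store": 55,  "distinct_days_seen": 2, "minutes_in_staff_only_areas": 0, "median_entry_hour": 18}, "shopper"),
]
clf = GradientBoostingClassifier().fit(
    [device_features(h) for h, _ in train], [label for _, label in train]
)

unknown = {"total_minutes_in_store": 400, "distinct_days_seen": 3,
           "minutes_in_staff_only_areas": 0, "median_entry_hour": 11}
label = rule_label(unknown) or clf.predict([device_features(unknown)])[0]
print(label)
```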

We also work in a lot of electronically dense environments. So we not only need to identify Associates, we need to be able to pick out static devices (like display computers, endless-aisle tablets, etc.). That sounds easy, and in fact it is fairly easy. But it’s not quite as trivial as it sounds; given the vagaries of positioning tech, a static device is never quite static. We don’t get the same location every time – so we have to be able to distinguish between real movement and the type of small, Brownian motion we get from a static device.
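
One simple way to separate “static but jittery” from genuinely moving is to look at how far a device’s fixes spread around their centroid over an observation window – positioning noise stays within a small radius, real movement doesn’t. The threshold below is an invented placeholder.

```python
from statistics import mean

def is_static(points, max_spread_m=1.0):
    """
    points: list of (x, y) fixes for one device over an observation window.
    A static device wobbles around its centroid (positioning noise); a real
    shopper or associate wanders well beyond it. Threshold is illustrative.
    """
    cx, cy = mean(p[0] for p in points), mean(p[1] for p in points)
    spread = max(((x - cx) ** 2 + (y - cy) ** 2) ** 0.5 for x, y in points)
    return spread <= max_spread_m

display_tablet = [(3.0, 4.1), (3.2, 3.9), (2.9, 4.0), (3.1, 4.2)]
shopper        = [(1.0, 1.0), (6.5, 2.0), (14.0, 3.0), (18.0, 7.5)]
print(is_static(display_tablet), is_static(shopper))  # True False
```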

Fixing data quality is never all that exciting, but in the world of shopper journey measurement it’s essential. Without real work to improve the data – work that ML happens to be appropriate for – the data isn’t good enough.

The second use we’ve found for machine learning is in shopper classification. We’re building a generalized shopper segmentation capability into the next release of DM1. The idea is pretty straightforward. For years, I’ve championed the notion of 2-tiered segmentation in digital analytics. That’s just a fancy name for adding a visit-type segmentation to an existing customer segmentation. And the exact same concept applies to stores.

As consultants, we typically built highly customized segmentation schemes. Since Digital Mortar is a platform company, that’s not a viable approach for us. Instead, what we’ve done is taken a set of fairly common in-store behavioral patterns and generalized their behavioral signatures. These patterns include things like “Clearance Shoppers”, “Right-Rail Shoppers”, “Single Product Focused Shoppers”, “Product Returners”, and “Multi-Product Browsers”. By mapping store elements to key behavior points, any store can then take advantage of this pre-existing ML-driven segmentation.
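
Stripped down to a sketch, the idea looks something like this – the segment names come from the list above, but the defining conditions are illustrative stand-ins for the actual behavioral signatures:

```python
def assign_segment(visit):
    """
    visit: dict of visit-level behavioral features, available once store elements
    (clearance areas, right rail, return desk, cash-wrap) are mapped.
    Conditions here are illustrative, not the trained signatures.
    """
    if visit["pct_time_in_clearance"] > 0.5:
        return "Clearance Shopper"
    if visit["visited_return_desk_first"]:
        return "Product Returner"
    if visit["sections_visited"] == 1 and visit["total_minutes"] < 10:
        return "Single Product Focused Shopper"
    if visit["pct_time_on_right_rail"] > 0.6:
        return "Right-Rail Shopper"
    if visit["sections_visited"] >= 4:
        return "Multi-Product Browser"
    return "Unclassified"

print(assign_segment({"pct_time_in_clearance": 0.1, "visited_return_desk_first": False,
                      "sections_visited": 5, "total_minutes": 38,
                      "pct_time_on_right_rail": 0.2}))  # Multi-Product Browser
```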

It’s pretty cool stuff and I’m excited to get it into the DM1 platform.

The last problem we’ve tackled with ML is finding optimal store paths. This one’s more complex – more complex than we’ve been comfortable taking on directly. We have a lot of experience in segmentation techniques – from cluster analysis to random forests to SVMs. We’re pretty comfortable with that problem set. But for optimal path analysis, we’ve been working with DXi. They’re an ML company with a digital heritage and a lot of experience working on event-level digital data. We’ve always said that a big part of what drew us to store journey measurement is how similar the data is to digital journey data and this was a chance to put that idea to the test. We’ve given them some of our data and had them work on some optimal path problems – essentially figuring out whether the store layout is as good as possible.

Why use a partner for this? I’ve written before about how I think Digital Mortar and the DM1 platform fit in a broader analytics technology stack for retail. DM1 provides a comprehensive measurement system for shopper tracking and highly bespoke reporting appropriate to store analytics. It’s not meant to be a general purpose analytics platform and it’s never going to have the capabilities of tools like Tableau or R or Watson. Those are super-powerful general-purpose analytics tools that cover a wide range of visualization, data exploration and analytic needs. Instead of trying to duplicate those solutions we’ve made it really easy (and free) to export the event level data you need to drive those tools from our platform data.

I don’t see DM1 becoming an ML platform. As analysts, we’ll continue to find uses for ML where we think it’s appropriate and embed those uses in the application. But trying to replicate dedicated ML tools in DM1 just doesn’t make a lot of sense to me.

In my next post, I’ll take a deeper dive into that DXi work, give a high-level view of the analytics process, and show some of the more interesting results.

# Connecting Marketers to Machine Learning: A Traveler’s Guide Through Two Utterly Dissimilar Worlds

Artificial Intelligence for Marketing by Jim Sterne

There are people in the world who work with and understand AI and machine learning. And there are people in the world who work with and understand marketing. The intersection of those two groups is a vanishingly tiny population.

Until recently the fact of that nearly empty set didn’t much matter. But with the dramatic growth in machine learning penetration into key marketing activities, that’s changed. If you don’t understand enough about these technologies to use them effectively…well…chances are some of your competitors do.

AI for Marketing, Jim Sterne’s new book, is targeted specifically toward widening that narrow intersection of two populations into something more like a broad union. It’s not an introduction to machine learning for the data scientist or technologist (though there’s certainly a use and a need for that). It’s not an introduction to marketing (though it does an absolutely admirable job introducing practical marketing concepts). It’s a primer on how to move between those two worlds.

Interestingly, in AI for Marketing, that isn’t a one way street. I probably would have written this book on the assumption that the core task was to get marketing folks to understand machine learning. But AI for Marketing makes the not unreasonable assumption that as challenged as marketing folks are when it comes to AI, machine learning folks are often every bit as ignorant when it comes to marketing. Of course, that first audience is much larger – there’s probably 1000 marketing folks for every machine learner. But if you are an enterprise wanting two teams to collaborate or a technology company wanting to fuse your machine learning smarts to marketing problems, it makes sense to treat this as a two-way street.

Here’s how the book lays out.

Chapter 1 just sets the table on AI and machine learning. It’s a big chapter and it’s a bit of grab bag, with everything from why you should be worried about AI to where you might look for data to feed it. It’s a sweeping introduction to an admittedly huge topic, but it doesn’t do a lot of real work in the broader organization of the book.

That real work starts in Chapter 2 with the introduction to machine learning. This chapter is essential for Marketers. It covers a range of analytic concepts: an excellent introduction into the basics of how to think about models (a surprisingly important and misunderstood topic), a host of common analytics problems (like high cardinality) and then introduces core techniques in machine learning. If you’ve ever sat through data scientists or technology vendors babbling on about support vector machines and random forests, and wondered if you’d been airlifted into an incredibly confusing episode of Game of Drones, this chapter will be a godsend. The explanations are given in the author’s trademark style: simple, straightforward and surprisingly enjoyable given the subject matter. You just won’t find a better, more straightforward introduction to these methods for the interested but not enthralled businessperson.

In Chapter 3, Jim walks the other way down the street – introducing modern marketing to the data scientist. After a long career explaining analytics to business and marketing folks, Jim has absorbed an immense amount of marketing knowledge. He has this stuff down cold and he’s every bit as good (maybe even better) taking marketing concepts back to analysts as he is working in the other direction.  From a basic intro into the evolution of modern marketing to a survey of the key problems folks are always trying to solve (attribution, mix, lifetime value, and personalization), this chapter nails it. If you subscribe to the theory (and I do) that any book on Marketing could more appropriately have been delivered as a single chapter, then just think of this as the rare book on Marketing delivered at the right length.

If you accept the idea that bridging these two worlds needs movement in both directions, the structure to this point is obvious. Introduce one. Introduce the other. But then what?

Here’s where I think the structure of the book really sings. To me, the heart of the book is in Chapters 4, 5 and 6 (which I know sounds like an old Elvis Costello song). Each chapter tackles one part of the marketing funnel and shows how AI and machine learning can be used to solve problems.

Chapter 4 looks at up-funnel activities around market research, public relations, social awareness, and mass advertising. Chapter 5 walks through persuasion and selling including the in-store journey (yeah!), shopping assistants, UX, and remarketing. Chapter 6 covers (you should be able to guess) issues around retention and churn including customer service and returns. Chapter 7 is a kind of “one ring to rule them all”, covering the emergence of integrated, “intelligent” marketing platforms that do everything. Well….maybe. Call me skeptical on this front.

Anyway, these chapters are similar in tone and rich in content. You get the core issues explained, a discussion of how AI and machine learning can be used, and brief introductions into the vendors and people who are doing the work. For the marketer, that means you can find the problems that concern you, get a sense of where the state of machine learning stands vis-à-vis your actual problem set, and almost certainly pick-up a couple of ideas about who to talk to and what to think about next.

If you’re into this stuff at all, these four chapters will probably get you pretty excited about the possibilities. So think of Chapter 8 as a cautionary shot across the bow. From being too good for your own good to issues around privacy, hidden biases and, repeat after me, “correlation is not causation”, this is Pandora’s little chapter of analytics and machine learning troubles.

So what’s left? Think about having a baby. The first part is exciting and fun. The next part is long and tedious. And labor – the last part – is incredibly painful. It’s pretty much the same when it comes to analytics. Operationalizing analytics is that last, painful step. It comes at the end of the process and nobody thinks it’s any fun. Like the introduction to marketing, the section on operationalizing AI bears all the hallmarks of long, deep familiarity with the issues and opportunities in enterprise adoption of analytics and technology. There’s tons of good, sound advice that can help you actually get some of this stuff done.

Jim wraps up with the seemingly obligatory look into the future. Now, I’m pretty confident that none of us have the faintest idea how the future of AI is going to unfold. And if I really had to choose, I guess I prefer my crystal ball to be in science fiction form where I don’t have to take anything but the plot too seriously. But there’s probably a clause in every publisher’s AI book contract that an author must speculate on how wonderful/dangerous the future will be. Jim keeps it short, light, and highly speculative. Mission accomplished.

Summing Up

I think of AI for Marketing as a handy guidebook into two very different, neighboring lands. For most of us, the gap between the two is an un-navigable chasm. AI for Marketing takes you into each locale and introduces you to the things you really must know about them. It’s a fine introduction not just into AI and Machine Learning but into modern marketing practice as well. Best of all, it guides you across the narrow bridges that connect the two and makes it easier to navigate for yourself. You couldn’t ask for a wiser, more entertaining guide to walk you around and over that bridge between two utterly dissimilar worlds that grow every day more necessarily connected.

Full Disclosure: I know and like the author – Jim Sterne – of AI for Marketing. Indeed, with Jim the verbs know and like are largely synonymous. Nor will I pretend that this doesn’t impact my thoughts on the work. When you can almost hear someone’s voice as you read their words, it’s bound to impact your enjoyment and interpretation. So absolutely no claim to be unbiased here!

# A Guided Tour through Digital Analytics (Circa 2016)

I’ve been planning my schedule for the DA Hub in late September and while I find it frustrating (so much interesting stuff!), it’s also enlightening about where digital analytics is right now and where it’s headed. Every conference is a kind of mirror to its industry, of course, but that reflection is often distorted by the needs of the conference – to focus on the cutting-edge, to sell sponsorships, to encourage product adoption, etc.  With DA Hub, the Conference agenda is set by the enterprise practitioners who are leading groups – and it’s what they want to talk about. That makes the conference agenda unusually broad and, it seems to me, uniquely reflective of the state of our industry (at least at the big enterprise level).

So here’s a guided tour of my DA Hub – including what I thought was most interesting, what I chose, and why. At the end I hope that, like Indiana Jones picking the Holy Grail from a murderers’ row of drinking vessels, I chose wisely.

Session 1 features conversations on Video Tracking, Data Lakes, the Lifecycle of an Analyst, Building Analytics Community, Sexy Dashboards (surely an oxymoron), Innovation, the Agile Enterprise and Personalization. Fortunately, while I’d love to join either Twitch’s June Dershewitz to talk about Data Lakes and Data Swamps or Intuit’s Dylan Lewis for When Harry (Personalization) met Sally (Experimentation), I didn’t have to agonize at all, since I’m scheduled to lead a conversation on Machine Learning in Digital Analytics. Still, it’s an incredible set of choices and represents just how much breadth there is to digital analytics practice these days.

Session 2 doesn’t make things easier. With topics ranging across Women in Analytics, Personalization, Data Science, IoT, Data Governance, Digital Product Management, Campaign Measurement, Rolling Your Own Technology, and Voice of Customer…Dang. Women in Analytics gets knocked off my list. I’ll eliminate Campaign Measurement even though I’d love to chat with Chip Strieff from Adidas about campaign optimization. I did Tom Bett’s (Financial Times) conversation on rolling your own technology in Europe this year – so I guess I can sacrifice that. Normally I’d cross the data governance session off my list. But not only am I managing some aspects of a data governance process for a client right now, I’ve known Verizon’s Rene Villa for a long time and had some truly fantastic conversations with him. So I’m tempted. On the other hand, retail personalization is of huge interest to me. So talking over personalization with Gautam Madiman from Lowe’s would be a real treat. And did I mention that I’ve become very, very interested in certain forms of IoT tracking? Getting a chance to talk with Vivint’s Brandon Bunker around that would be pretty cool. And, of course, I’ve spent years trying to do more with VoC and hearing Abercrombie & Fitch’s story with Sasha Verbitsky would be sweet. Provisionally, I’m picking IoT. I just don’t get a chance to talk IoT very much and I can’t pass up the opportunity. But personalization might drag me back in.

In the next session I have to choose between Dashboarding (the wretched state of, as opposed to the sexiness of), Data Mining Methods, Martech, Next Generation Analytics, Analytics Coaching, Measuring Content Success, Leveraging Tag Management and Using Marketing Clouds for Personalization. The choice is a little easier because I did Kyle Keller’s (Vox) conversation on Dashboarding two years ago in Europe. And while that session was probably the most contentious DA Hub group I’ve ever been in (and yes, it was my fault, but it was also pretty productive and interesting), I can probably move on. I’m not that involved with tag management these days – a sign that it must be mature – so that’s off my list too. I’m very intrigued by Akhil Anumolu’s (Delta Airlines) session on Can Developers be Marketers? The Emerging Role of MarTech. As a washed-up developer, I still find myself believing that developers are extraordinarily useful people and vastly under-utilized in today’s enterprise. I’m also tempted by my friend David McBride’s session on Next Generation Analytics. Not only because David is one of the most enjoyable people that I’ve ever met to talk with, but because driving analytics forward is, really, my job. But I’m probably going to go with David William’s session on Marketing Clouds. David is brilliant and ASOS is truly cutting edge (they are a giant in the UK and global in reach but not as well known here), and this also happens to be an area where I’m personally involved in steering some client projects. David’s topical focus on single-vendor stacks to deliver personalization is incredibly timely for me.

Next up we have Millennials in the Analytics Workforce, Streaming Video Metrics, Breaking the Analytics Glass Ceiling, Experimentation on Steroids, Data Journalism, Distributed Social Media Platforms, Customer Experience Management, Ethics in Analytics(!), and Customer Segmentation. There are several choices in here that I’d be pretty thrilled with: Dylan’s session on Experimentation, Chip’s session on CEM and, of course, Shari Cleary’s (Viacom) session on Segmentation. After all, segmentation is, like, my favorite thing in the world. But I’m probably going to go with Lynn Lanphier’s (Best Buy) session on Data Journalism. I have more to learn in that space, and it’s an area of analytics I’ve never felt that my practice has delivered on as well as we should.

In the last session, I could choose from more on Customer Experience Management, Driving Analytics to the C-Suite, Optimizing Analytics Career Paths, Creating High-Impact Analytics Programs, Building Analytics Teams, Delivering Digital Products, Calculating Analytics Impact, and Moving from Report Monkey to Analytics Advisor. But I don’t get to choose. Because this is where my second session (on driving Enterprise Digital Transformation) resides. I wrote about doing this session in the EU early this summer – it was one of the best conversations around analytics I’ve had the pleasure of being part of. I’m just hoping this session can capture some of that magic. If I didn’t have hosting duties, I think I might gravitate toward Theresa Locklear’s (NFL) conversation on Return on Analytics. When we help our clients create new analytics and digital transformation strategies, we have to help them justify what always amount to significant new expenditures. So much of analytics is exploratory and foundational, however, that we don’t always have great answers about the real return. I’d love to be able to share thoughts on how to think (and talk) about analytics ROI in a more compelling fashion.

All great stuff.

We work in such a fascinating field with so many components to it. We can specialize in data science and analytics method, take care of the fundamental challenges around building data foundations, drive customer communications and personalization, help the enterprise understand and measure its performance, optimize relentlessly in and across channels, or try to put all these pieces together and manage the teams and people that come with that. I love that at a Conference like the Hub I get a chance to share knowledge with (very) like-minded folks and participate in conversations where I know I’m truly expert (like segmentation or analytics transformation), areas where I’d like to do better (like Data Journalism), and areas where we’re all pushing the outside of the envelope (IoT and Machine Learning) together. Seems like a wonderful trade-off all the way around.

https://www.digitalanalyticshub.com/dahub16-us/

# The State of the Art in Analytics – EU Style

I spent most of the last week at the fourth annual Digital Analytics Hub Conference outside London, talking analytics. And talking. And talking. And while I love talking analytics, thank heavens I had a few opportunities to get away from the sound of my own voice and enjoy the rather more pleasing absence of sounds in the English countryside.

With X Change no more, the Hub is the best conference going these days in digital analytics (full disclosure – the guys who run it are old friends of mine). It’s an immensely enjoyable opportunity to talk in-depth with serious practitioners about everything from cutting edge analytics to digital transformation to traditional digital analytics concerns around marketing analytics. Some of the biggest, best and most interesting brands in Europe were there: from digital and bricks-and-mortar behemoths to cutting-edge digital pure-plays to a pretty good sampling of the biggest consultancies in and out of the digital world.

As has been true in previous visits, I found the overall state of digital analytics in Europe to be a bit behind the U.S. – especially in terms of team-size and perhaps in data integration. But the leading companies in Europe are as good as anybody.

Here’s a sampling from my conversations:

Machine Learning

I’ve been pushing my team to grow in the machine learning space using libraries like TensorFlow to explore deep learning and see if it has potential for digital. It hasn’t been simple or easy. I’m thinking that people who talk as if you can drop a digital data set into a deep learning system and have magic happen have either:

1. Never tried it
2. Been trying to sell it

We’ve been having a hard time getting deep learning systems to out-perform techniques like Random Forests. We have a lot of theories about why that is, including problem selection, certain challenges with our data sets, and the ways we’ve chosen to structure our input. I had some great discussions with hardcore data scientists (and some very bright hacker analysts more in my mold) that gave me some fresh ideas. That’s lucky because I’m presenting some of this work at the upcoming eMetrics in Chicago and I want to have more impressive results to share. I’ve long insisted on the importance of structure to digital analytics and deep learning systems should be able to do a better job parsing that structure into the analysis than tools like random forests. So I’m still hopeful/semi-confident I can get better results.
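
For what it’s worth, the kind of head-to-head we keep running looks roughly like the sketch below – scikit-learn stands in here for the real TensorFlow models, and synthetic data stands in for the visit-level feature tables:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in for a visit-level feature matrix (the real data is tabular:
# dwell times, interaction counts, entry section, and so on).
X, y = make_classification(n_samples=5000, n_features=30, n_informative=12,
                           random_state=0)

rf  = RandomForestClassifier(n_estimators=300, random_state=0)
net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)

# Cross-validated AUC for each approach on the same data.
print("random forest:", cross_val_score(rf, X, y, cv=5, scoring="roc_auc").mean())
print("neural net:   ", cross_val_score(net, X, y, cv=5, scoring="roc_auc").mean())
```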

In broader group discussion, one of the most controversial and interesting discussions focused on the pros-and-cons of black-box learning systems. I was a little surprised that most of the data scientist types were fairly negative on black-box techniques. I have my reservations about them and I see that organizations are often deeply distrustful of analytic results that can’t be transparently explained or which are hidden by a vendor. I get that. But opacity and performance aren’t incompatible. Just try to get an explanation of Google’s AlphaGo! If you can test a system carefully, how important is model transparency?

So what are my reservations? I’m less concerned about the black-boxness of a technique than I am its completeness. When it comes to things like recommendation engines, I think enterprise analysts should be able to consistently beat a turnkey black-box (or not black-box) system with appropriate local customization of the inputs and model. But I harbor no bias here. From my perspective it’s useful but not critical to understand the insides of a model, provided we’ve tested carefully to make sure that it actually works!

Another huge discussion topic – and one I’m more in accord with – was the importance of not over-focusing on a single technique. Not only are there many varieties of machine learning – each with some advantages for specific problem types – but there are powerful analytic techniques outside the sphere of machine learning that are used in other disciplines and are completely untried in digital analytics. We have so much to learn and I only wish I had more time with a couple of the folks there to…talk!

New Technology

One of the innovations this year at the Hub was a New Technology Showcase. The showcase was kind of like spending a day with a Silicon Valley VC and getting presentations from the technology companies in their portfolio (which is a darn interesting way to spend a day). I didn’t know most of the companies that presented but there were a couple (Piwik and Snowplow) I’ve heard of. Snowplow, in particular, is a company that’s worth checking out.

The Snowplow proposition is pretty simple. Digital data collection should be de-coupled from analysis. You’ve heard that before, right? It’s called Tag Management. But that’s not what Snowplow has in mind at all. They built a very sophisticated open-source data collection stack that’s highly performant and feeds directly into the cloud. The basic collection strategy is simple and modern. You send JSON objects that pass a schema reference along with the data. The schema references are versioned and updates are handled automatically for both backwardly compatible and incompatible changes. You can pass a full range of strongly-typed data and you can create cross-object contexts for things like visitors. Snowplow has built a whole bunch of simple templates to make it easier for folks used to traditional tagging to create the necessary calls. But you can pass anything to Snowplow – not just Web data. It’s very adaptable for mobile (far more so than traditional digital analytics systems) and really for any kind of data at all. Snowplow supports both real-time and batch – it’s a true lambda architecture. It seems to do a huge amount of the heavy lifting for you when it comes to creating a modern cloud-based data collection system. And did I mention it’s open-source? Free is a pretty good price. If you’re looking for an independent data collection architecture and are okay with the cloud, you really should give it a look.
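
The self-describing payload idea looks roughly like this – the schema URI and fields are invented for illustration, so treat it as a sketch of the concept rather than a copy of Snowplow’s actual tracker calls:

```python
import json

# A self-describing event: the payload carries a versioned schema reference
# alongside the data, so the pipeline can validate it and evolve the schema.
event = {
    "schema": "iglu:com.acme/shopper_interaction/jsonschema/1-0-0",  # illustrative URI
    "data": {
        "visit_id": "v-000123",
        "section": "Casual Shoes",
        "interaction_seconds": 45,
        "associate_present": True,
    },
}

print(json.dumps(event, indent=2))
```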

Cloud vs. On-Premise

DA Hub’s keynote featured a panel with analytics leaders from companies like Intel, ASOS and the Financial Times. Every participant was running analytics in the cloud (with both AWS and Azure represented though AWS had an unsurprising majority). Except for barriers around InfoSec, it’s unclear to me why ANY company wouldn’t be in the cloud for their analytics.

Here in the States, there’s been widespread adoption of open-source data technologies (Hadoop/Spark) to process and analyze digital data. But while I do see companies that have completely abandoned traditional SaaS analytics tools, it’s pretty rare. Mostly, the companies I see run both a SaaS solution to collect data and (perhaps) satisfy basic reporting needs and an open-source data platform. There was more interest among the people I talked to in the EU in a complete swap-out, including data collection and reporting. I even talked to folks who roll most of the visualization stack themselves with open-source solutions like D3. There are places where D3 is appropriate (you need complete customization of the surrounding interface, for example, or you need widespread but very inexpensive distribution), but I’m very far from convinced that rolling your own visualization solutions with open-source is the way to go. I would have said that same thing about data collection but…see above.

Digital Transformation

I had an exhilarating discussion group centered around digital transformation. There were a ton of heavy hitters in the room – huge enterprises deep into projects of digital transformation, major consultancies, and some legendary industry vets. It was one of the most enjoyable conference experiences I’ve ever had. I swear that we (most of us anyway) could have gone on another 2 hours or more – since we just scratched the surface of the problems. My plan for the session was to cover what defines excellence in digital (what you have to be able to do to do digital well), then tackle how a large enterprise that wants to transform in digital needs to organize itself. Finally, I wanted to cover the change management and process necessary to get from here to there. If you’re reading this post that should sound familiar!

Well, we didn’t get to the third item and we didn’t finish the second. That’s no disgrace. These are big topics. But the discussion helped clarify my thinking – especially around organization and the very real challenges in scaling a startup model into something that works for a large enterprise. Much of the blending of teams and capabilities that I’ve been recommending in these posts on digital transformation are lessons I’ve gleaned from seeing digital pure-plays and how they work. But I’ve always been uncomfortably aware that the process of scaling into larger teams creates issues around corporate communications, reporting structures, and career paths that I’m not even close to solving. Not only did this discussion clarify and advance my thinking on the topic, I’m fairly confident that it was of equal service to everyone else. I really wish that same group could have spent the whole day together. A big THANKS to everyone there, you were fantastic!

I plan to write more on this in a subsequent post. And I may drop another post on Hub learnings after I peruse my notes. I’ve only hit on the big stuff – and there were a lot of smaller takeaways worth noting.

As I mentioned in my last post, the guys who run DA Hub are bringing it to Monterey, CA (first time in the U.S.) this September. Do check it out. It’s worth the trip (and the venue is pretty special). I think I’m on the hook to reprise that session on digital transformation. And yes, that scares me…you don’t often catch lightning in a bottle twice.

# Space 2.0

The New Frontier of Commercial Satellite Imagery for Business

One of my last speaking gigs of the spring season was, for me, both the least typical and one of the most interesting. Space 2.0 was a brief glimpse into a world that is both exotic and fascinating. It’s a gathering of high-tech, high-science companies driving commercialization of space.

Great stuff, but what the heck did they want with me?

Well, one of the many new frontiers in the space industry is the commercialization of geo-spatial data. For years now, the primary consumer of satellite data has been the government. But the uses for satellite imagery are hardly limited to intel and defense. For the array of Space startups and aggressive tech companies, intel and defense are relatively mature markets – slow moving and difficult to crack if you’re not an established player. You ever tried selling to the government? It’s not easy.

So the big opportunity is finding ways to open up the information potential in geo-spatial data and satellite imagery to the commercial marketplace. Now I may not know HyperSpectral from IR but I do see a lot of the challenges that companies face both provisioning and using big data. So I guess I was their doom-and-gloom guy – in my usual role of explaining why everything always turns out to be harder than we expect when it comes to using or selling big data.

For me, though, attending Space 2.0 was more about learning than educating. I’ve never had an opportunity to really delve into this kind of data, and hearing (and seeing) some of what is available is fascinating.

Let’s start with what’s available (and keep in mind you’re not hearing an expert view here – just a fanboy with a day’s exposure). Most commercial capture is visual (other bands are available and used primarily for environmental and weather-related research). Reliance on the visual spectrum has implications that are probably second-nature to folks in the industry but take some thought if you’re outside it. One speaker described their industry as “outside” and “daytime” focused. It’s also very weather dependent. Europe, with its abundant cloudiness, is much more challenging than much of the U.S. (though I suppose Portland and Seattle must be no picnic).

Images are either panchromatic (black and white), multi-spectral (like the RGB we’re used to but with an IR band as well and sometimes additional bands) or hyperspectral (lots of narrow bands on the spectrum). Perhaps even more important than color, though, is resolution. As you’d probably expect, black and white images tend to have the highest resolution – down to something like a 30-40cm square. Color and multi-band images might be more in the meter range but the newest generation take the resolution down to the 40-50cm range in full color. That’s pretty fine grained.

How fine-grained? Well, with a top-down 40cm square per pixel it’s not terribly useful for things like people. But here’s an example that one of the speakers gave in how they are using the data. They pick selected restaurant locations (Chipotle was the example) and count cars in the parking lot during the day. They then compare this data to previous periods to create estimates of how the location is doing. They can also compare competitor locations (e.g. Panera) to see if the trends are brand specific or consistent.
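
Once you have car counts per image, the downstream arithmetic is simple period-over-period comparison – a sketch with invented counts:

```python
# Hypothetical weekly car counts per location derived from satellite imagery.
this_period = {"Location A": 42, "Location B": 35, "Location C": 58}
last_period = {"Location A": 50, "Location B": 36, "Location C": 49}

for loc in this_period:
    change = (this_period[loc] - last_period[loc]) / last_period[loc]
    print(f"{loc}: {change:+.1%} vs. prior period")

# Running the same comparison on competitor locations (e.g. Panera) would show
# whether the trend is brand-specific or market-wide.
```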

Now, if you’re Chipotle, this data isn’t all that interesting. There are easier ways to measure your business than trying to count cars in satellite images. But if you’re a Fund Manager looking to buy or sell Chipotle stock in advance of earnings reports, this type of intelligence is extremely valuable. You have hard data on how a restaurant or store is performing before everyone else. That’s the type of data that traders live for.

Of course, that’s not the only way to get that information. You may have heard about the recent Foursquare prediction targeted to exactly the same problem. Foursquare was able to predict Chipotle’s sales decline almost to the percentage point. As one of the day’s panelists remarked, there are always other options, and the key to market success is being cheaper, faster, easier, and more accurate than alternative mechanisms.

You can see how using Foursquare data for this kind of problem might be better than commercial satellite. You don’t have weather limitations, the data is easier to process, it covers walk-in and auto traffic, and it covers a 24hr time band. But you can also see plenty of situations where satellite imagery might have advantages too. After all, it’s easily available, relatively inexpensive, has no sampling bias, has deep historical data and is global in reach.

So how easy is satellite data to use?

I think the answer is a big “it depends”. This is, first of all, big data. Those multi and hyper band images at hi-res are really, really big. And while the providers have made it quite easy to find what you want and get it, it didn’t seem to me that they had done much to solve the real big data analytics problem.

I’ve described what I think the real big data problem is before (you can check out this video if you want a big data primer). Big data analytics is hard because it requires finding patterns in the data and our traditional analytics tools aren’t good at that. This need for pattern recognition is true in my particular field (digital analytics), but it’s even more obviously true when it comes to big data applications like facial recognition, image processing, and text analytics.

On the plus side, unlike digital analytics, the need for image (and linguistic) processing is well understood and relatively well developed. There are a lot of tools and libraries you can use to make the job easier. It’s also a space where deep learning has been consistently successful, so libraries from companies like Microsoft and Google are available that provide high-quality deep-learning tools – often tailor-made for processing image data – for free.

It’s still not easy. What’s more, the way you process these images is highly likely to be dependent on your business application. Counting cars is different than understanding crop growth which is different than understanding storm damage. My guess is that market providers of this data are going to have to develop very industry-specific solutions if they want to make the data reasonably usable.

That doesn’t necessarily mean that they’ll have to provide full on applications. The critical enabler is providing the ability to extract the business-specific patterns in the data – things like identifying cars. In effect, solving the hard part of the pattern recognition problem so that end-users can focus on solving the business interpretation problem.

Being at Space 2.0 reminded me a lot of going to a big data conference. There are a lot of technologies (some of them amazingly cool) in search of killer business applications. In this industry, particularly, the companies are incredibly sophisticated technically. And it’s not that there aren’t real applications. Intelligence, environment and agriculture are mature and profitable markets with extensive use of commercial satellite imagery. The golden goose, though, is opening up new opportunities in other areas. Do those opportunities exist? I’m sure they do. For most of us, though, we aren’t thinking satellite imagery to solve our problems. And if we do think satellite, we’re likely intimidated by the difficulty of solving the big data problem inherent in getting value from the imagery for almost any new business application.

That’s why, as I described it to the audience there, I suspect that progress with the use and adoption of commercial satellite imagery will seem quite fast to those of us on the outside – but agonizingly slow to the people in the industry.