With the holiday shopping season in full swing, a lot of little elves are closely watching how Wal-Mart, the biggest of all big-box retailers, has priced favorite "Dear Santa" items and other goodies. But have you ever stopped to wonder what Wal-Mart might have on its Christmas wish list?
I'll give you a hint: big-data. No, make that ginormous-data.
If Wal-Mart were to compile a Christmas wish list, the world's largest product repository would likely top it. The company is intent on building such a database itself, so I'd venture to guess it wouldn't pass up a sprinkling of magic from the sugarplum fairies.
This would be no ordinary product repository, either (as if you could possibly have thought otherwise). Wal-Mart would like it to include detailed information on literally every product in the known world, Digvijay Lamba, a distinguished architect with Walmart Labs, told attendees at the Predictive Analytics Innovation Summit held by The Innovation Enterprise this month in Chicago. What's the product called? What retailers are selling it? How much are they charging? What are people saying about it in the social sphere? These are all questions Wal-Mart would like to answer from this single information source.
For perspective, Lamba took us, not forward to Christmas, but back to Halloween, the second-largest commercial holiday, with $10 billion in gross merchandise sales. "With 9 billion pieces of candy and 100 million pumpkins sold each year, it's a massive, massive festival," he said. "But it's also a big festival for big-data." Consider these stats he shared: 1 billion transactions and 1 billion visits to Halloween-related Websites every year, 100 million customers, 100 million tweets, 20 million check-ins, 1 million YouTube videos, 50 million images on Instagram, and 1 billion pages on Google if you search for Halloween 2012. "As time has gone along, the amount of data on Halloween has grown exponentially."
In the pre-big-data world of, say, 2007, a retailer like Wal-Mart would prepare for the Halloween shopping season by looking at historical costume sales and doing some market trend analysis. "All the data we were using was our own data, our own sales data, and some survey data." That has obviously changed. "We've got all this extra data outside of Wal-Mart telling us what's going on. We've got people tweeting and uploading videos -- and these can help us drive more insights on what we should really be selling for Halloween… and when we should start selling."
Answering the key question "Can we predict what is going to sell this year?" takes that massive product repository plus a combination of domain expertise and data science smarts, says Lamba, who joined Wal-Mart last year when it acquired Kosmix, a search and social media analytics platform provider (Wal-Mart's own sugarplum fairy, perhaps). It's all about what Walmart Labs calls its Social Genome Platform -- a system for coming up with unexpected insights. The platform comprises five essential nodes, each of which is a taxonomy in itself: products, people, locations, events, and interests.
"We build these taxonomies, and then when any data comes in -- transactions, social status updates, images and videos, blogs and Web updates, check-ins and locations -- we mark them with what they're talking about on this taxonomy. So we take this unstructured data, and we annotate it so we know what it's really talking about in terms of these taxonomies that we've built so we can give structure to the data."
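To make the annotation step concrete, here is a minimal sketch in Python. It is not Walmart Labs' implementation, which the talk did not detail. The five node types come from Lamba's description; the keyword lists, the `TAXONOMY` name, and the `annotate` function are all hypothetical, and plain substring matching stands in for whatever text analysis the real platform uses.

```python
# Hypothetical, hand-built taxonomy. The five node types are from the talk;
# every term listed under them is invented for illustration.
TAXONOMY = {
    "products": {"costume", "candy", "pumpkin"},
    "people": set(),
    "locations": {"san francisco", "new york city"},
    "events": {"halloween", "zombie party"},
    "interests": {"zombies", "handmade"},
}


def annotate(text: str) -> dict[str, list[str]]:
    """Return the taxonomy terms found in the text, grouped by node type.

    Uses naive case-insensitive substring matching, so it is only a
    stand-in for real entity extraction.
    """
    lowered = text.lower()
    tags: dict[str, list[str]] = {}
    for node, terms in TAXONOMY.items():
        hits = sorted(term for term in terms if term in lowered)
        if hits:
            tags[node] = hits
    return tags


if __name__ == "__main__":
    tweet = "Throwing a zombie party in San Francisco, need a costume!"
    print(annotate(tweet))
```

The point of the sketch is the shape of the output: a free-form status update goes in, and a small structured record keyed by taxonomy node comes out, which is what makes the data usable downstream.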
Then comes the most interesting part: using the tagged, annotated data (whether transaction data or social data) to build a dashboard for product managers and business users so that they, not the data scientists, can come up with the insights and ideas. From a dashboard, for example, a business manager could see that zombie parties are trending in San Francisco, while handmade costumes are popular in New York City. "The fact that event themes can drive what we sell in different areas doesn't have to be something that the data scientist has to come up with. The business owners can drill down into these verticals across different dimensions and see what's out there."
Figure: Walmart Labs' sample (and simplistic) Social Genome Platform dashboard.
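The kind of drill-down a dashboard like this offers can be sketched as a simple aggregation over annotated records. This is an illustrative Python example only: the `RECORDS` data is made up to mirror the zombie-party and handmade-costume scenario, and `top_theme_by_location` is a hypothetical helper, not part of the Social Genome Platform.

```python
from collections import Counter

# Made-up annotated records, each already tagged with a location and a theme.
RECORDS = [
    {"location": "San Francisco", "theme": "zombie party"},
    {"location": "San Francisco", "theme": "zombie party"},
    {"location": "San Francisco", "theme": "handmade costume"},
    {"location": "New York City", "theme": "handmade costume"},
    {"location": "New York City", "theme": "handmade costume"},
]


def top_theme_by_location(records: list[dict[str, str]]) -> dict[str, str]:
    """Count themes per location and return the most frequent one for each."""
    counts: dict[str, Counter] = {}
    for record in records:
        counts.setdefault(record["location"], Counter())[record["theme"]] += 1
    return {loc: themes.most_common(1)[0][0] for loc, themes in counts.items()}


if __name__ == "__main__":
    for location, theme in top_theme_by_location(RECORDS).items():
        print(f"{location}: {theme}")
```

Because the records are already annotated against the taxonomy, the business user's question ("what is trending where?") reduces to a group-and-count, which is why it does not need a data scientist to answer.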
Walmart Labs is well on its way to making this sort of scenario a reality. But Lamba says it's not easy. Besides the technology, of course, there are people issues. Not every domain expert wants to or should be working with big-data. But identifying those who do, and can, means more power to the business -- and that's good no matter what holiday season is upon us.