With the holiday shopping season in full swing, a lot of little elves are closely watching how Wal-Mart, the biggest of all big-box retailers, has priced favorite "Dear Santa" items and other goodies. But have you ever stopped to wonder what Wal-Mart might have on its Christmas wish list?
I'll give you a hint: big-data. No, make that ginormous-data.
If Wal-Mart were to compile a Christmas wish list, the world's largest product repository would likely top it. The company is intent on building such a database itself, so I'd venture to guess it wouldn't pass up a sprinkling of magic from the sugarplum fairies.
This would be no ordinary product repository, either (as if you could possibly have thought otherwise). Wal-Mart would like it to include detailed information on literally every product in the known world, Digvjay Lamba, a distinguished architect with Walmart Labs, told attendees at the Predictive Analytics Innovation Summit held by The Innovation Enterprise this month in Chicago. What's the product called? What retailers are selling it? How much are they charging? What are people saying about it in the social sphere? These are all questions Wal-Mart would like to answer from this single information source.
For perspective, Lamba took us, not forward to Christmas, but back to Halloween, the second-largest commercial holiday, with $10 billion in gross merchandise sales. "With 9 billion pieces of candy and 100 million pumpkins sold each year, it's a massive, massive festival," he said. "But it's also a big festival for big-data." Consider these stats he shared: 1 billion transactions and 1 billion visits to Halloween-related Websites every year, 100 million customers, 100 million tweets, 20 million check-ins, 1 million YouTube videos, 50 million images on Instagram, and 1 billion pages on Google if you search for Halloween 2012. "As time has gone along, the amount of data on Halloween has grown exponentially."
In the pre-big-data world of, say, 2007, a retailer like Wal-Mart would prepare for the Halloween shopping season by looking at historical costume sales and doing some market trend analysis. "All the data we were using was our own data, our own sales data, and some survey data." That has obviously changed. "We've got all this extra data outside of Wal-Mart telling us what's going on. We've got people tweeting and uploading videos -- and these can help us drive more insights on what we should really be selling for Halloween… and when we should start selling."
Answering the key question "Can we predict what is going to sell this year?" takes that massive product repository plus a combination of domain expertise and data science smarts, says Lamba, who joined Wal-Mart last year when it acquired Kosmix, a search and social media analytics platform provider (Wal-Mart's own sugarplum fairy, perhaps). It's all about what Walmart Labs calls its Social Genome Platform -- a system for coming up with unexpected insights. The platform comprises five essential nodes, each of which is a taxonomy in itself: products, people, locations, events, and interests.
"We build these taxonomies, and then when any data comes in -- transactions, social status updates, images and videos, blogs and Web updates, check-ins and locations -- we mark them with what they're talking about on this taxonomy. So we take this unstructured data, and we annotate it so we know what it's really talking about in terms of these taxonomies that we've built so we can give structure to the data."
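To make the tagging step concrete, here is a minimal sketch in Python of marking up a piece of unstructured text against the five node types. Everything in it is an assumption for illustration: the tiny taxonomies, the simple keyword matching, and the function name `annotate` are hypothetical and are not drawn from Walmart Labs' actual system, which would presumably rely on far richer taxonomies and matching.

```python
import re

# Hypothetical, tiny taxonomies keyed by the five node types named in the article.
TAXONOMIES = {
    "products": {"costume", "candy", "pumpkin"},
    "people": {"kids", "parents"},
    "locations": {"san francisco", "new york city"},
    "events": {"halloween", "zombie party"},
    "interests": {"zombies", "handmade"},
}

def annotate(text):
    """Return {node_type: sorted list of taxonomy terms found in text}."""
    # Lowercase and strip punctuation so terms match on whole words only.
    normalized = " ".join(re.findall(r"[a-z0-9]+", text.lower()))
    padded = f" {normalized} "
    tags = {}
    for node, terms in TAXONOMIES.items():
        found = sorted(t for t in terms if f" {t} " in padded)
        if found:
            tags[node] = found
    return tags

tweet = "Throwing a zombie party in San Francisco this Halloween, need a costume!"
print(annotate(tweet))
```

The output is a small structured record (which products, locations, and events the text mentions) that can be stored and queried alongside transaction data.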
Then comes the most interesting part: using the tagged, annotated data (whether that is transaction data or social data) to build a dashboard for product managers and business users so they, not the data scientists, can come up with the insights and ideas. From a dashboard, for example, a business manager could see that zombie parties are trending in San Francisco, while handmade costumes are popular in New York City. "The fact that event themes can drive what we sell in different areas doesn't have to be something that the data scientist has to come up with. The business owners can drill down into these verticals across different dimensions and see what's out there."
Walmart Labs' sample (and simplistic) Social Genome Platform dashboard.
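The kind of drill-down such a dashboard offers can be sketched as a simple aggregation over records that have already been tagged by location, event, and interest. This is a hedged illustration only: the sample records, the field names, and the function `top_themes_by_location` are made up for the example and say nothing about how Walmart Labs actually builds its dashboard.

```python
from collections import Counter, defaultdict

# Hypothetical annotated records, in the shape a tagging step might produce.
records = [
    {"locations": ["san francisco"], "events": ["zombie party"]},
    {"locations": ["san francisco"], "events": ["zombie party"]},
    {"locations": ["san francisco"], "interests": ["handmade costumes"]},
    {"locations": ["new york city"], "interests": ["handmade costumes"]},
    {"locations": ["new york city"], "interests": ["handmade costumes"]},
]

def top_themes_by_location(records, dimensions=("events", "interests")):
    """Return the most frequently tagged theme for each location."""
    counts = defaultdict(Counter)
    for rec in records:
        for loc in rec.get("locations", []):
            for dim in dimensions:
                counts[loc].update(rec.get(dim, []))
    # Skip locations that have no tagged themes at all.
    return {loc: c.most_common(1)[0][0] for loc, c in counts.items() if c}

trending = top_themes_by_location(records)
print(trending)
```

With these sample records, zombie parties come out on top for San Francisco and handmade costumes for New York City, mirroring the example in the article.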
Walmart Labs is well on its way to making this sort of scenario a reality. But Lamba says it's not easy. Besides the technology, of course, there are people issues. Not every domain expert wants to or should be working with big-data. But identifying those who do, and can, means more power to the business -- and that's good no matter what holiday season is upon us.
@Beth: When I first came across the concept of faceted navigation in Marti Hearst's papers ten years ago, I was impressed by its similarity to the classical multidimensional OLAP approach, as analyzed in many academic papers (for example, www.siam.org/proceedings/datamining/2009/dm09_103_zhangd.pdf or nadav.harel.org.il/papers/p33-ben-yitzhak.pdf ). But I have always been puzzled by how hard it has been for this concept to gain ground in the enterprise, where only numerical and structured data were analyzed in OLAP cubes, while faceted search was, at best, used by information architects to navigate intranet pages. Apart from a few cases, the potential of analyzing numerical and textual information together, in context, was missed completely. Maybe now is the right time, as the WalMart project suggests, when social media data are approaching the volume of traditional numerical data. Recent examples, such as http://iswc2011.semanticweb.org/fileadmin/iswc/Papers/Research_Paper/12/70310001.pdf, testify to the interest in the concept, but the approach is still quite traditional. I have no personal experience of large-scale projects geared toward real-time facet extraction from social streams, but I suppose that good candidates for an implementation stack could be Kafka, Storm, Lucene and Bobo.
@Andrea Scarso, thanks for jumping into the conversation. Let me ask, where have you seen "faceted search meets social big data" on a smaller scale? (I'm not implying it hasn't been, just curious if you can point us to some cool examples.)
Agreed, the niche suppliers are going to be on very shaky ground. I know someone who has done business with WalMart in the past, and it's been tough to get on the shelves in the first place for a long time. I can only see that getting more difficult when they are looking at trending products more closely.
I also think small businesses that have managed to gain shelf space at WalMart will have a potentially shorter window of opportunity to prove themselves in sales. I can see some losing shelf space because WalMart determined that the vendor's product did not meet sales targets. I am not fully familiar with the process, but that's my first thought. I am open to other thoughts on this.
@Callmebob, agreed. Although WalMart has been better at collecting data for a long time, this new platform seems to be turning them toward competing more closely with small specialty shops that stock items because customers asked for them or because it's what the operator is "into". This lets WalMart be as flexible as the smaller shops, and that flexibility was one of their last saving graces.
@callmebob--I do admit the word "monopolistic" comes to mind sometimes when I hear of some of the advanced big-data analytics work retailers and others are doing.
@callmebob, true. But wouldn't you say that even without the Social Genome Platform Wal-Mart already dominates the small fry retailer on data analytics anyways?
As if small retailers didn't have enough to worry about, WalMart Labs with its Social Genome Platform can dominate them on the big data analysis front. Its dashboard gives business managers the ability to whittle down their merchandising assumptions and increase their home run capabilities. The small fry retailer won't have that luxury and will find it easier to be a trend follower, if they can identify the trend. And following WalMart does not constitute a winning business formula.