Today we hear a lot about the three "V's" of big data -- volume, variety, and velocity. Add in continuous analytics and intensified processing requirements, and the challenges haven't changed all that much in the past 10 years.
Doug Laney ought to know. In 2001, while at Meta Group (now part of Gartner), he posited the three V's as part of a discussion about the move toward centralizing data warehousing. His report was titled "3D Data Management: Controlling Data Volume, Velocity, and Variety." Today Laney is still studying data management issues, now as vice president of research for business analytics and performance management at Gartner.
In a phone interview last week, Laney told us that the type, schema, location, and context of data, among other factors, are more at issue than ever. In a world of big data -- unstructured and non-relational -- decisions revolve not simply around how to collect basic data and where to store it for easy access, but also around what format to keep it in, for how long, and with what priorities.
The granularity of the data companies are collecting and measuring has become an issue in the ever-evolving field, too, Laney said. Sub-transactional data, or data generated between transactions, is one major example. Where once a retailer might have measured only transactions or, as they are sometimes called in the Web analytics world, conversions, it now also measures the interactions that occur between those transactions. The goal is to determine what other engagements might have occurred that could positively affect future transaction volume.
Even as companies increase the variety of data they're collecting, especially to learn more about customer and user behavior, they can go deeper still. Laney uses Chico's and its White House/Black Market women's boutique clothing store as an example. As we've discussed on AllAnalytics.com previously, Chico's collects and tracks many kinds of data about its customers to great advantage. But you know what it doesn't track? Husbands!
Chico's could clearly add another dimension here to gain even more insight into household purchasing decisions, Laney suggests.
Finally, while many companies increasingly discuss data as an asset and treat it that way, the industry has still made no clear attempt to value it. After the 9/11 terrorist attacks, for example, many companies discovered that they had no insurance for the data -- corporate assets -- they'd lost. From the insurance companies' perspective, those data files might as well have been empty, Laney says.
Today, huge data-driven companies like Facebook, seeking to go public, still base their valuations on traditional measurements like physical assets, debts, and predicted earnings rather than on the immense amount of data they possess and control. Business has to rectify this situation, which is the focus of industry research, he noted.
As the three V's and other data management challenges evolve, the notion of "data scientist" does as well. But what, exactly, is a data scientist?
Laney will answer that question today at 1:00 p.m. ET, when he joins the All Analytics community for an instant e-chat on data scientists and their roles in the emerging era. What knowledge sets should a data scientist possess, where can companies find such specialists, and where do they fit into the organizational chart? You can join the e-chat here.