Could it be that, as new data formats arise and we face the constant need to come up with new analytics capabilities, we're falling behind? Instead of finding a more persistent and long-term solution to the data fragmentation dilemma, are we generating more and more one-off solutions?
The problem is that most of today's analytics platforms are "static," as Ravi Kalakota, a managing director of the global professional services firm Alvarez & Marsal, discusses in this blog post.
The dominant design of such platforms depends on a specific set of questions and dimensions. However, as businesses and individuals evolve, they face a mushrooming cloud of fragmented information they don't know how to use, along with a shortage of qualified personnel to handle it. Their in-house or third-party platforms aren't flexible enough to meet their needs, so they buy or develop different analytics platforms -- platforms that often overlap in data and purpose (particularly in the social media space). Above all, platforms fail to allow ad hoc exploration using real-time data feeds. When they do allow such exploration, the process is time-consuming and costly.
Since 2010, we've seen many acquisitions in the social analytics and cloud data space, but little progress toward a more dynamic approach to analytics platforms, a la the analytics-as-a-service concept discussed in this Sand Hill blog.
In the emerging media space where I'm most focused (through client work and the social media measurement and social media for the arts courses I teach at University of California-Irvine Extension and at Rutgers University, respectively), I've come across three constructs that explain the analytics fragmentation we face and suggest possible solutions we can explore.
- Dark data: This is the data organizations generate but don't understand how to optimize or use, so they aren't drawing much insight from it. As Wired Magazine wrote several years ago, "Freeing up dark data could represent one of the biggest boons to research in decades, fueling advances in genetics, neuroscience, and biotech." But unless we have questions to apply against the data, it may make no difference what the data is. Rather than focusing on the data, maybe our focus really belongs on the nature of our questions and how our data can help answer them. Dark data might be worth exploring.
- Dark social: This aspect of dark data covers social media activity that isn't easily traceable to its actual origin point, as The Atlantic recently discussed. Web analytics is limited in what it can capture, and it has to be instrumented deliberately to capture much of anything specific, which leaves out much social data activity. In addition, the privacy settings in Facebook, LinkedIn, most forums, and some photo-sharing and blogging sites make sharing social data across analytics applications difficult. Attempts to solve the problem have simply generated more analytics fragmentation.
- Ultraviolet data: I coined the term "UV data" in 2010, when I was among the first analysts to recognize a problem that analytics platforms weren't designed to solve. That problem: Businesses aren't equipped to capture much of the data they need, but data is present nonetheless, and it is waiting to be collected. It's quite possible the uncollected data is more valuable than the collected data. I've devised an audit process for discovering UV data, and I am figuring out how best to collect and organize it.
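To make the dark social idea a bit more concrete, here is a minimal Python sketch of one possible way to estimate how much traffic arrives without a traceable origin. It is an illustration under stated assumptions, not a method described above: the visit records, the `url` and `referrer` field names, and the rule "no referrer plus a landing page deeper than the home page" are all hypothetical choices, and a missing referrer can have other causes (bookmarks, typed addresses, privacy tools), so the result is at best a rough upper bound.

```python
from urllib.parse import urlparse

# Hypothetical visit records; the field names are assumptions for illustration.
visits = [
    {"url": "https://example.com/", "referrer": ""},
    {"url": "https://example.com/articles/dark-data", "referrer": ""},
    {"url": "https://example.com/articles/dark-data", "referrer": "https://twitter.com/"},
    {"url": "https://example.com/articles/uv-data", "referrer": None},
]


def is_possible_dark_social(visit):
    """Flag a visit that has no referrer and lands on a page other than the home page."""
    has_referrer = bool(visit.get("referrer"))
    path = urlparse(visit["url"]).path
    is_deep_page = path not in ("", "/")
    return not has_referrer and is_deep_page


def dark_social_share(records):
    """Return the fraction of visits flagged as possible dark social."""
    if not records:
        return 0.0
    flagged = sum(1 for v in records if is_possible_dark_social(v))
    return flagged / len(records)


share = dark_social_share(visits)
print(f"Possible dark social share: {share:.0%}")
```

The home-page exclusion reflects the guess that people rarely type long article URLs by hand, so a referrer-less visit to a deep page was more likely shared through a private channel such as email or chat. That guess is an assumption worth testing against your own data before relying on the number.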
Dark data, dark social, and UV data all boil down to the same thing -- what we don't know might be as important as what we do know, if not more. That also explains why many analytics platforms fail to deliver the needed insights: They're based on static dimensions and haven't been built to answer the questions that we must increasingly answer in order to succeed.