What is it that gets you thinking about big data and how it will affect your business, if not the world in general?
For Tony Jewitt, a data management industry veteran who's now tasked with helping Fortune 1000 companies address their big data challenges, the "it" that gets him going is sensor data. This is the here-and-now, real-life use case for big data, he says.
"Think about it. Humans can create information, but we sleep a certain amount of hours a day. Sensors don't. This one will really cause volumes that will require new technologies."
Jewitt, vice president of big data solutions at enterprise Web and search consulting firm Avalon Consulting, likes to use three examples to prove his point. He shared those with AllAnalytics.com in a recent interview:
- Electric companies replacing the "one number gathered manually once monthly" programs with smart meter readers that gather electrical usage information from consumers every 15 minutes.
- Car insurance companies using vehicular motion sensors to gather data -- like frequency and speed of lane changes -- anytime one of their insured drivers heads down the road.
- And, for Web marketers, PC software tracking a user's eye movements across the screen and changing advertising based on those patterns.
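The smart-meter example alone implies a large multiplier. A rough back-of-envelope calculation (assuming 15-minute intervals and a 30-day billing month, both illustrative figures) shows how a single once-a-month reading becomes thousands of data points:

```python
# Back-of-envelope data-volume multiplier for smart meters.
# The 15-minute interval and 30-day month are illustrative assumptions,
# not figures from the utility industry itself.

MINUTES_PER_READING = 15
readings_per_day = (60 // MINUTES_PER_READING) * 24   # 4 per hour x 24 hours = 96
readings_per_month = readings_per_day * 30            # 96 x 30 days = 2,880

manual_readings_per_month = 1   # the old "one number gathered manually once monthly" model

multiplier = readings_per_month // manual_readings_per_month
print(multiplier)   # 2880
```

So one meter goes from 12 values a year to over 34,000 -- and that is per customer, before multiplying by millions of households.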
"These are huge multipliers, all showing the power of data," Jewitt said.
Of course, sensor data isn't alone in driving big data convulsions throughout corporate America, he adds. Companies of all sorts are facing this new data management reality: The 30-year-old relational database management system (RDBMS) just can't cut it any longer.
In general, Avalon enterprise clients "know it's out there, know it's coming, know it's real, and know it's going to be useful. But, they're mostly in research mode. How is this going to apply to us? How will it save money? Where are the use cases?"
Take the real case of an Avalon client, a large high-tech manufacturer testing data for new products. The data is loaded in a massive relational database, all ready for analysis. But that process is getting too slow and too expensive. And so this testing group finds itself asking: "How can I do this faster and cheaper? And, by the way, can I use Hadoop or one of the other 20 things out there, rather than just going to the cloud with Cloudera or Amazon Elastic MapReduce -- which raises the whole other issue of being outside the firewall?"
The data management task -- analyzing testing data, insurance claims, or what have you -- is not new. "But over the years companies are getting smarter and smarter and know how to collect more and more information. And that's getting slow and expensive."
These examples might not be as clear as the sensor data scenarios, but as the RDBMS bogs down they are big data problems to solve, Jewitt says. "People are running around for point solutions, making sure that they're going to scale."
But as anybody who has heard ad nauseam about the mainframe's impending demise (first at the hands of midrange, then departmental, then desktop computing) can attest, don't expect companies to ditch their old architectures when they come to their big data realizations. "In the near term, for the most part, nobody will shut down anything," Jewitt says.
Rather, Jewitt says he anticipates that companies will open shared-services Hadoop processing centers for "departmental experimentation." By tossing processes into these experimental Hadoop environments, companies will be able to see which of those processes might be candidates for migrating to a big data platform -- as long as the move makes them faster and cheaper, and gives them more flexibility.
How's your RDBMS faring against today's data volumes? Share your pain points on the message boards below.