Big Data: Let's Stay Focused

I fear we are drifting back to the old debate about whether size matters. Wait, I'm talking about big data and analytics. What did you think I meant?

Credit: Pixabay

Maybe because it's summer, writers, editors, researchers, and educators are feeling lazy, and it's easier to latch onto a buzzword than to specify something such as predictive analytics or visualization. Lump anything involving data under big data, and you think you have it covered.

Big data is helping Olympic athletes win in Rio. Big data is helping insurance companies make money. Big data is curing rare diseases. No, in most cases it's analytics, and in many cases -- maybe more than ever -- the data can be fairly small.

Blogger Charlotte Erdmann had a nice article yesterday, "Big Data: An Art Form in Itself?", about using lots of data from all over the world to build out visualizations that aren't just pretty but tell a story. For example, one artist drew on many data sources to find relationships among victims killed on September 11. That's big data.

Tracking an athlete's performance under certain weather conditions or against certain opponents isn't big data. It's analytics. Remember Doug Laney's three Vs: volume, variety, and velocity.

Plus, big is relative. If you have never had data available to you beyond what you can lump into Excel's auto-sum function, then a few thousand data points melded together from two sources seems big. It is likely to be darn useful, but it isn't big data.

If a healthcare provider pools biometric data with behavioral data to identify predictors for depression, that's big data: lots of data from multiple sources that changes over time.

However, there are plenty of ways that a healthcare provider can use analytics without getting into big data's data quality issues and basic logistical challenges. In most cases, those applications focus on identified goals.

For example, it's analytics if the provider looks at the average length of hospital stay and the common treatments for certain medical conditions. That provides useful data and can be paired with an action item: a closer look at the reasons behind outliers, such as why one patient's stay was twice as long as average and why another patient was discharged much sooner than average.
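To make the length-of-stay example concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the patient records are made up, the function names are hypothetical, and the cutoffs (twice the average for a long stay, half the average for a short one) are just one plausible reading of "outlier," not a clinical standard.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (patient_id, condition, length_of_stay_in_days).
# The values are invented purely for illustration.
stays = [
    ("p1", "pneumonia", 4),
    ("p2", "pneumonia", 4),
    ("p3", "pneumonia", 4),
    ("p4", "pneumonia", 12),
    ("p5", "hip replacement", 4),
    ("p6", "hip replacement", 5),
    ("p7", "hip replacement", 5),
    ("p8", "hip replacement", 2),
]


def average_stay_by_condition(records):
    """Return the mean length of stay for each medical condition."""
    by_condition = defaultdict(list)
    for _, condition, days in records:
        by_condition[condition].append(days)
    return {cond: mean(days) for cond, days in by_condition.items()}


def flag_outliers(records, long_factor=2.0, short_factor=0.5):
    """Flag stays at least long_factor times, or at most short_factor
    times, the average for the same condition (assumed thresholds)."""
    averages = average_stay_by_condition(records)
    flagged = []
    for patient_id, condition, days in records:
        avg = averages[condition]
        if days >= long_factor * avg:
            flagged.append((patient_id, condition, days, "long"))
        elif days <= short_factor * avg:
            flagged.append((patient_id, condition, days, "short"))
    return flagged


averages = average_stay_by_condition(stays)
outliers = flag_outliers(stays)
for patient_id, condition, days, kind in outliers:
    print(f"{patient_id}: {condition}, {days} days ({kind} stay)")
```

Nothing here needs a data lake. A handful of rows and a clear question are enough to produce a list of cases worth a closer look.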

The above scenario becomes a case for big data when -- as some providers are doing now -- you add in non-medical data sources such as home, family, and economic factors for each patient. In that case, you are looking for causes that can be addressed.

In all types of business, not just healthcare, an inordinate focus on the "big" in big data leads companies to lose sight of the business problems they need to solve. They acquire, and have to secure for years, data that will never be used. They dream of a common enterprise data lake that will support hundreds of diverse business unit applications. They build data bureaucracies. They spend so much time thinking big that they forget the reasons they built those applications in the first place -- to run the business better, to serve the customer, and to sell more goods and services.

It's nice to read case studies about successful companies that have invested decades of work into complex enterprise-wide projects that, in total, represent big data. However, you aren't likely to duplicate that in a mere six months, a year, or three years. While you are sitting in meetings spinning up some grand scheme for bigger than big, some simple business problems will be begging for answers now.

So stay focused on solving problems with analytics and then, only when you are ready, tack "big" onto "data."

Is your company focused on solving business problems? How do you keep that focus?

James M. Connolly, Editor of All Analytics

Jim Connolly is a versatile and experienced technology journalist who has reported on IT trends for more than two decades. As editor of All Analytics he writes about the move to big data analytics and data-driven decision making. Over the years he has covered enterprise computing, the PC revolution, client/server, the evolution of the Internet, the rise of web-based business, and IT management. He has covered breaking industry news and has led teams focused on product reviews and technology trends. Throughout his tech journalism career, he has concentrated on serving the information needs of IT decision-makers in large organizations and has worked with those managers to help them learn from their peers and share their experiences in implementing leading-edge technologies through publications including Computerworld. Jim also has helped to launch a technology-focused startup, as one of the founding editors at TechTarget, and has served as editor of an established news organization focused on technology startups and the Boston-area venture capital sector at MassHighTech. A former crime reporter for the Boston Herald, he majored in journalism at Northeastern University.

Re: The complexity incentive
  • 8/17/2016 11:19:51 AM

It's actually extremely dangerous.

Re: Chicken-egg
  • 8/17/2016 11:19:03 AM

Which is routine for terminology in the IT industry, which is driven by buzz-marketing, not sound analysis and science, its pretenses notwithstanding.

Re: The complexity incentive
  • 8/17/2016 11:12:24 AM

It does seem that complexity is the trend: it gets people working more, sells more products, and keeps people busy. But whether it's cost-effective and leads to enough future gains is still the real question, and there are luck and random effects to consider in just how simple or complex we make things in order to make the future better.

Re: Chicken-egg
  • 8/16/2016 2:06:50 PM

Terry, LOL. Like so many buzzwords, big data has suffered from overuse and abuse. The term big data can mean so many things on so many levels, but oftentimes people use it to sound current and trendy rather than just describe the business problem and the data-based solution.

Re: Chicken-egg
  • 8/15/2016 4:19:18 PM

Technology is just a means, but one that is extremely effective given the gap between complexity and the human ability to know and understand, an ability increasingly reduced by the elimination of education and its replacement with vocational training. The temptation is too great because technology makes it easy to complexify, and the target is disarmed and less able to defend itself.

  • 8/15/2016 3:58:40 PM

It's true of many sectors: People conspire to ensure they don't get cut out of the equation.

So which came first? Our dependence on technology, or IT's need, conscious or not, to complicate the equation to make themselves more indispensable?


Re: The complexity incentive
  • 8/15/2016 12:50:45 PM

In fact, the opposite is true: there is intentional complexification in an effort to demonstrate "scientific" quality, in many cases to obfuscate the lack of science, often with an agenda--commercial, political.

With the added benefit of collapsed education, quackery has a field day.

Re: The complexity incentive
  • 8/15/2016 12:38:36 PM

@ dbdebunker - Good point. If technology really solved the problem there would be no incentive for upgrades and so on.

The complexity incentive
  • 8/12/2016 1:53:35 PM

In the preface of one of my books I once wrote that there is an incentive for complexity in the IT industry. Aside from the fact that simplicity does not sell so much training, consulting, books, seminars and software, it is much easier to sell anything complex when people have no hope of understanding it enough to judge whether it makes sense or not.

As I've argued in my writings, including my posts here, this is one explanation for the increasing complexity, of which Big Data, "data science" and machine learning are cases in point.

It used to be the case that models were simplifications of complex reality such that people could understand certain aspects of it. No more.