In a recent Enterprise Strategy Group study, we asked participants to identify the biggest drivers for deploying a new data analytics platform for big data requirements and to tell us the greatest benefits they expected to realize as a result.
We found this: Organizations aren't looking at new data analytics platforms primarily as a way to cut costs. Nor are their expectations centered on completing analytics on larger datasets closer to real time. Rather, "improving business agility" won the top spot. And that is exactly what high-performance computing (HPC) and big data analytics platforms such as Hadoop and massively parallel processing (MPP) analytical databases are advertised to deliver.
When you can support high-performance data analytics across your infrastructure, you can discover patterns or develop theses for predictive models in hours and days rather than weeks and months. Likewise, if you need to execute a ranking algorithm in near real time to identify fraudulent activity, the model has to complete in microseconds to milliseconds. Delivering results that quickly gives the business the ability to react with the agility it wants.
But first you have to know what you need from a high-performance, big data analytics platform.
With big data analytics platforms based on Hadoop, there's the expectation that high performance is relative -- meaning seconds, minutes, and hours versus the milliseconds, seconds, and minutes of the MPP analytical databases. Real-time analytics that combine high performance with high volumes fall into HPC territory; that is an extreme case requiring more strategic investment. If you can wait a few more seconds for an answer, Hadoop may be good enough at a fraction of the cost.
Either way, big data architectures aren't without their red flags for enterprise deployment.
Hadoop, for example, is still emerging and has a few holes when it comes to security, data protection, and high availability. There are workarounds, and not all applications will need advanced enterprise-class data management capabilities. Yet, as Hadoop-based applications become mainstream, those requirements will absolutely surface. Until then, you will need to look at your IT Infrastructure Library, or ITIL, processes and make some exceptions for supporting a big data analytics platform.
Also, if your network is not set up to move massive volumes of data from point A to point B in the timeframe the business requires, an upgrade in your network infrastructure may be on the cards.
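A quick back-of-the-envelope calculation can show whether the network is likely to be the bottleneck. The sketch below is illustrative only: the function names, the 50 TB dataset, the eight-hour window, and the 70 percent link-efficiency figure are all assumptions chosen for the example, not numbers from the study.

```python
def transfer_hours(data_terabytes: float, link_gbps: float,
                   efficiency: float = 0.7) -> float:
    """Estimate the hours needed to move a dataset over a network link.

    efficiency is an assumed fraction of the nominal link speed that is
    actually usable after protocol overhead and contention.
    """
    if data_terabytes < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("invalid dataset size, link speed, or efficiency")
    bits_to_move = data_terabytes * 1e12 * 8         # decimal terabytes to bits
    effective_bps = link_gbps * 1e9 * efficiency     # usable bits per second
    return bits_to_move / effective_bps / 3600       # seconds to hours


def fits_window(data_terabytes: float, link_gbps: float,
                window_hours: float, efficiency: float = 0.7) -> bool:
    """Return True if the estimated transfer fits the business window."""
    return transfer_hours(data_terabytes, link_gbps, efficiency) <= window_hours


if __name__ == "__main__":
    # Hypothetical scenario: 50 TB must move within an 8-hour window.
    for gbps in (1, 10, 40):
        hours = transfer_hours(50, gbps)
        verdict = "fits" if fits_window(50, gbps, 8) else "does not fit"
        print(f"{gbps:>2} Gbps: about {hours:.1f} hours, {verdict}")
```

Under these assumptions, a result well outside the window points to a network upgrade; a result close to the limit calls for measurement on the real network before any spending decision.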
What's more, any application based on Hadoop or an MPP analytical database will require specialized skill sets that are already in short supply. You had better have the budget allocated to either hire specialists or send your top developers to training.