High-Performance, Big Data Analytics Takes a Certain Understanding


In a recent Enterprise Strategy Group study, we asked participants to identify the biggest drivers for deploying a new data analytics platform for big data requirements and to tell us the greatest benefits they expected to realize as a result.

We found this: Organizations are looking at new data analytics platforms as a way to cut costs. But their expectations aren't centered on completing analytics on larger datasets closer to real time. Rather, "improving business agility" won the top spot. And that is exactly what high-performance computing (HPC) and big data analytics platforms such as Hadoop and massively parallel processing (MPP) analytical databases are advertised to deliver.

When you can support high-performance data analytics across your infrastructure, you can discover patterns or develop theses for predictive models in hours and days rather than weeks and months. Likewise, if you need to be able to execute a ranking algorithm in near real-time to identify a fraudulent activity, the model needs to be able to complete in microseconds to milliseconds. Delivering results in such speedy fashion means giving the business the ability to react quickly and with the agility that's much desired.

But you need to know what you need out of a high-performance, big data analytics platform.

With big data analytics platforms based on Hadoop, there's the expectation that high performance is relative -- meaning seconds, minutes, and hours versus the milliseconds, seconds, and minutes of the MPP analytical databases. Real-time analytics at high performance and high volume, the kind that falls into HPC territory, is really an extreme case requiring more strategic investment. If you can wait a few more seconds for an answer, Hadoop may be good enough at a fraction of the cost.

Either way, big data architectures aren't without their red flags for enterprise deployment.

Hadoop, for example, is still emerging and has a few holes when it comes to security, data protection, and high availability. There are workarounds, and not all applications will need advanced enterprise-class data management capabilities. Yet, as Hadoop-based applications become mainstream, those requirements will absolutely surface. Until then, you will need to look at your IT Infrastructure Library, or ITIL, processes and make some exceptions for supporting a big data analytics platform.

Also, if your network is not set up to move massive volumes of data from point A to point B in the timeframe the business requires, an upgrade in your network infrastructure may be on the cards.

What's more, any Hadoop or MPP analytical database-based application will require specialized skillsets that are already in short supply. You had better have the budget allocated to either hire specialists or send your top developers to training.

Julie Lockner, Senior Analyst & VP, Enterprise Strategy Group

Julie Lockner is a leading industry expert in the structured data management market. As Senior Analyst and Vice President covering data management solutions for ESG, her focus areas include database management systems, data warehouse, business intelligence, data analytics, data integration, and, most recently, all things "big data." Prior to joining ESG, Julie was President and Founder of CentricInfo, a consulting firm that specialized in helping organizations find value in their information assets through implementation of data governance policies, datacenter optimization, and data analytics focusing on aligning business and IT. ESG acquired CentricInfo.



Tight integration of tools and resources
  • 2/22/2012 2:17:14 AM

I think that for analysis, the tools matter along with the platform. Whatever the platform, the ability of the tools to interact with and make use of the underlying resources is important. If the tools are not compatible with the deployed hardware and associated software, the analysis may run more slowly, and there may always be certain bottlenecks.

Re: Link between HPA and automation?
  • 2/21/2012 10:10:25 PM

Julie writes


if you need to be able to execute a ranking algorithm in near real-time to identify a fraudulent activity, the model needs to be able to complete in microseconds to milliseconds. Delivering results in such speedy fashion means giving the business the ability to react quickly and with the agility that's much desired.

One would think credit card companies would be big users. I wonder roughly what percentage of credit card companies are deploying this technology (HPC analytics, etc.). I'd also wonder if this enables some more technologically forward-thinking companies to react to fraud faster than others.

Re: Link between HPA and automation?
  • 2/21/2012 8:42:25 PM

Hi Seth,

Good strategy, in fact, but I'm imagining this is already something companies generally do when deploying big data analytics.

Re: Link between HPA and automation?
  • 2/21/2012 6:51:42 PM

@SethBreedlove:

"high performance is relative" ... "What makes one company agile isn't the same for everyone."

How high performance is evaluated may indeed depend on the company's data processing needs and requirements. Companies should first try to audit and understand those needs in order to choose the right analytics tools.

Re: Link between HPA and automation?
  • 2/21/2012 4:45:11 PM

I guess it only makes sense that high performance is relative, and that different companies would place different values on cost savings, performance, and speed.

As mentioned in the article, banks need answers in milliseconds when it comes to fraud prevention, while Procter and Gamble would have very different needs. 

What makes one company agile isn't the same for everyone. 

Re: Link between HPA and automation?
  • 2/21/2012 2:49:13 PM

Great example, Julie, and, no doubt, another payoff for the overall investment!

Re: High End Server versus HPC's
  • 2/21/2012 2:39:18 PM

Hi Louis, This deployment had both an HPC environment and a data warehouse (DW) environment. The team wanted to be able to enhance the DW with more detailed data and have the HPC environment have access to it. This new architecture added Hadoop as the landing place for detailed data that was accessible to both environments. Once they found how much quicker it was to add data elements to the model, that was just the beginning of new opportunities.

High End Server versus HPC's
  • 2/21/2012 12:45:01 PM

Hi Julie, Was this difference in job processing time in an Oracle world versus Hadoop due to HPCs alone? Or were there any other changes?

Are these high-end systems on the back end, or HPCs?

Re: Link between HPA and automation?
  • 2/21/2012 12:14:14 PM

@JulieLockner, fewer false positives and quicker response time certainly would be ideal in fraud detection analytics! I'm curious: outside of latency-sensitive operational analytics, are there some areas where companies just aren't willing to get quite as automated with their responses as they might?

Re: Link between HPA and automation?
  • 2/21/2012 11:42:45 AM

Hi Shawn, From an agility perspective, one Hadoop user struggled to add attributes to their Cognos cube for a risk model. The job took 4 days to complete with Oracle. When they switched the underlying data store to Hadoop, they were able to reduce that rebuild to 1 day. From a new opportunity perspective, another user was not able to support their business users' request to mine detailed transactional data to identify opportunities to improve business process efficiencies because the warehouse could not scale. The warehouse was only designed to look at aggregate data. By opening up detailed transactional data affordably on a Hadoop cluster, they could investigate why processes were inefficient.
