Data Virtualization: An Idea Worth Your Time

If you had to characterize your organization's relationship with data, would you say it's one of trust or fear? That's a critical question for these times, framed as they are by the imperative to establish data-driven decision making as the business norm.

More critical yet is the same question viewed from an end-to-end perspective. It's not enough that the marketing organization trusts its data and sales has confidence in its data. That will only get you so far... and not very, at that. Rather, each department and business unit needs to accept the trustworthiness of any and all corporate data.

I'm not telling anybody anything new here (or, at least, I sure hope not). The smart, modern company encourages information sharing through open communication and collaboration, understands that data is an asset and that integrating it adds value, and knows the importance of making decisions in real time based on data, not gut. That calls for a lot of trust.

But all too often, the opposite rules the day. Fear of one another's bad data hinders interdepartmental collaboration and information sharing. Nobody in their right mind wants to send their good, clean data into the wild, only to see it lost or corrupted. After all, cleaning up data and getting it in shape for business reporting is no small effort.

Traditional IT infrastructure and processes offer little comfort, either. Can you trust IT to deliver timely data -- as in, at the moment of customer contact, if required? Chances are pretty high that you can't, considering the data enters IT's domain from any number of autonomous systems, sits in a legacy enterprise data warehouse, and gets pulled together for your report through batch processes that run on... and on.

Maybe things aren't quite so bad at your company. Maybe data sharing among disparate business users is common. Fantastic. But is the reporting cohesive? If sales does its thing with marketing's data -- aggregating and analyzing it and presenting it in reports -- those results might not look quite the same as those produced when marketing aggregates, analyzes, and presents the data itself. Examples of successful data integration are few and far between.

Data challenges of this sort may seem insurmountable -- easier to keep working around than to address head on. But if you can build a centralized, controlled data layer, that doesn't have to be the case. Lest you think that's not possible, consider the story spelled out in a new case-study e-book, "Federated Data Checks In," from SAS, this site's sponsor (registration required).

I could leave off with a cliffhanger here, but I won't. The e-book tells how Westwood Vacations, a fictitious hospitality company that could easily be any number of actual businesses, used an emerging technology called data virtualization to solve some of its biggest data challenges. As depicted below, data virtualization software establishes a controlled data layer between batch and real-time data sources and consuming applications. This ensures "that all business units are accessing and processing the same data with consistent formats and processes."

How Data Virtualization Works
(Source: SAS e-book, 'Federated Data Checks In')

You could consider it a "data-as-a-service" approach, as SAS does with its SAS Federation Server. Version 4.1, shipping this quarter, provides access to big-data resources like Hadoop and SAP HANA, traditional databases like Oracle and DB2, and other data sources, as the company announced last month. The aim is to provide "easily consumable access to shared, secure enterprise data to speed and simplify data preparation." Toward that end, SAS also said it has enhanced Federation Server's security, data masking, and data governance capabilities to "ensure proper policies, access and restrictions for sensitive data."

I'll leave the nuts and bolts of how data virtualization works for another post. For now, I'd rather circle back to that modern enterprise and how the technology works for it. With data virtualization, this business:

  • Shares information, with disparate data marts provided through virtual views and reports prepared using centrally defined business terms -- without duplication
  • Collaborates, without fear of losing or corrupting data, since it's accessed through the virtualization layer and not from the source system directly
  • Is data-driven, with trust established via the virtualization layer
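To make the idea of a virtual view a little more concrete, here is a minimal sketch in Python. It is not how SAS Federation Server or any other product works; the class name `VirtualView`, the two sample source systems, and the business terms `guest_id` and `nights` are all hypothetical, chosen only to illustrate the pattern. The view stores no data of its own: it reads each source when asked, renames fields to the centrally defined terms, and hands consumers fresh copies so the source rows can't be altered through the view.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, object]
Fetch = Callable[[], Iterable[Row]]


class VirtualView:
    """Hypothetical virtual view: reads sources on demand, stores nothing."""

    def __init__(self) -> None:
        # Each entry pairs a fetch function with a map from that source's
        # field names to the centrally defined business terms.
        self._sources: List[Tuple[Fetch, Dict[str, str]]] = []

    def add_source(self, fetch: Fetch, terms: Dict[str, str]) -> None:
        self._sources.append((fetch, terms))

    def rows(self) -> Iterator[Row]:
        for fetch, terms in self._sources:
            for raw in fetch():
                # Build a new dict so consumers never touch the source row.
                yield {terms[k]: v for k, v in raw.items() if k in terms}


# Stand-ins for two departmental systems with different field names (assumed data).
RESORT_ROWS: List[Row] = [{"guest": "G1", "nights": 3}, {"guest": "G2", "nights": 1}]
LOYALTY_ROWS: List[Row] = [{"guest_id": "G1", "stay_nights": 2}]

view = VirtualView()
view.add_source(lambda: RESORT_ROWS, {"guest": "guest_id", "nights": "nights"})
view.add_source(lambda: LOYALTY_ROWS, {"guest_id": "guest_id", "stay_nights": "nights"})


def total_nights(v: VirtualView) -> Dict[str, int]:
    """One report definition that every department runs against the same view."""
    totals: Dict[str, int] = defaultdict(int)
    for row in v.rows():
        totals[str(row["guest_id"])] += int(row["nights"])
    return dict(totals)


print(total_nights(view))
```

Because every report goes through the same view and the same term mapping, two departments running `total_nights` get the same answer, and nothing a consumer does to the rows it receives reaches back into the source lists.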

Were I at a company struggling with trust issues and other data management challenges, data virtualization is a technology I'd like to check out. What about you?

-- Beth Schultz, Editor in Chief

Beth Schultz, Editor in Chief

Beth Schultz has more than two decades of experience as an IT writer and editor. Most recently, she brought her expertise to bear writing thought-provoking editorial and marketing materials on a variety of technology topics for leading IT publications and industry players. Previously, she oversaw multimedia content development, writing and editing for special feature packages at Network World. In particular, she focused on advanced IT technology and its impact on business users and in so doing became a thought leader on the revolutionary changes remaking the corporate datacenter and enterprise IT architecture. Beth has a keen ability to identify business and technology trends, developing expertise through in-depth analysis and early adopter case studies. Over the years, she has earned more than a dozen national and regional editorial excellence awards for special issues from American Business Media, American Society of Business Press Editors, and others.

Re: Creating a Framework
  • 5/6/2014 10:32:16 PM

Thanks, Beth. I strongly encourage everyone to read the article Beth linked us to, "Federated Data Checks In." It is very interesting and well written. (Though I hope they changed the names, because they kind of kick around the old CFO.) The report makes so many good points that it is difficult to pick out just one.

Re: Creating a Framework
  • 5/6/2014 6:02:12 PM

We did not talk about data virtualization during that session, but at Avalon, we view it as one of the biggest challenges facing a company attempting to become an information-driven enterprise. We have worked with tools like MarkLogic, Hadoop, and other NoSQL databases, as well as applying practices like semantics, metadata management, and data governance, in the area of data virtualization.

Re: Creating a Framework
  • 5/6/2014 5:20:03 PM

Excellent advice, Wayne -- as you of course shared during the recent All Analytics Academy session you oversaw, Mastering Big-Data Management (available to listen to on-demand for anybody who might have missed a great lecture!). I recall you addressing the importance of defining business terms -- a "rule of one," right? But I don't recall you talking about data virtualization at all. Do you guys work with data virtualization at this point at all? 

Creating a Framework
  • 5/6/2014 5:05:56 PM

Hi Beth,

Enjoyed your article. It makes some excellent points. To add to what you have already said, there are some basic things an organization can do to make the process easier.

The first thing is to understand what information a business function most needs, or wishes it had, for decision making.

Next, look for where they are getting that data. Then look across the whole organization and see who else is requesting data that they call by the same name, and where they are getting it.

Notice that at this stage of the process I haven't gotten into the data definitions. I am only identifying the most important data and the groups that have a vested interest in using it across the enterprise.

You can do this very quickly, and start to get the parties with a vested interest in a particular data set into the same room and drive to agreement on definitions (or at least agree to call similar things by distinct names).

By doing this like-to-like comparison and data governance, a data virtualization project will have a greater chance of success.

Avalon Consulting, LLC developed its Analytics Framework offer to speed organizations through this process.