It Sure Is Noisy Around Here

In case you are wondering, yes, I am reusing the same graphic from last week, but this time to make a contrasting point. Last time, in The Skeptical CFO, I was focused on the variability in the data and the forecast.

This week, we're headed in the opposite direction, looking instead for the commonalities, the consistencies, and the patterns. I introduced the concept of How Much, How Soon, How Certain some time ago, and since then I've focused primarily on the most neglected component of that trio, "How Certain," or put more straightforwardly -- risk.

"How Soon," which comprises NPV, IRR, and the time value of money, is better covered in other forums. But sometimes you just want to focus on the deceptively simple element of "How Much" -- What is it, and how big is it?

Consider once again the graphic to the left. Last time, we were concerned with the variability of the several ovals -- what shape, what angle, how wide, how dense. If I used the center of each as my "best guess" or the major axis as a trend line, how far off might my forecast or business decision be? How bad could it get?

Today, however, I want to consider the fact that there are ovals at all. Where did they come from? And the colors? Even the axes -- they aren't arbitrary. Imagine this graphic without the ovals and without the coloration -- just a plot of black dots on a white grid.

Noisy. Tough to see the forest for the trees. It's just a lot of stuff, isn't it? Kind of a tilted "T" shape to the stuff, but does that even mean anything? Now, put the colors and the ovals back in -- Wow! There's a pattern! Several, in fact. Clusters. Add some axis labels and a legend, and you are in business.

Even here, though, I cheated a bit -- we started with a graphic/plot. Initially this was just data in a database, or equally likely, rows and columns in a spreadsheet. There's what, 150 or so data points here, each with perhaps a magnitude, an X and a Y coordinate, and some related attribute that eventually got translated into "color." Four columns and 150 rows in a spreadsheet.

So you do the obvious and sort by the attribute in column A. Now what? Do you see clusters of roughly 25 each in some sort of two-dimensional relationship with each other, let alone the "T" pattern? Even with Ted Williams' 20/15 vision, you're not going to get much more insight out of that worksheet.
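To make the point concrete, here is a minimal sketch of that worksheet in Python. Everything in it is made up for illustration: the six group labels, their centers, and the random values are assumptions, not the data behind the graphic. Sorting the rows by attribute still leaves you staring at 150 lines of numbers, while a three-column summary per group starts to show where the clusters sit.

```python
import random
from collections import defaultdict
from statistics import mean

# Hypothetical stand-in for the worksheet: 6 made-up groups of 25 rows each,
# with columns (attribute, x, y, magnitude). None of this is the real data.
rng = random.Random(0)
group_centers = {"A": (2, 8), "B": (5, 8), "C": (8, 8),
                 "D": (5, 6), "E": (5, 4), "F": (5, 2)}
rows = [
    (attribute, rng.gauss(cx, 0.5), rng.gauss(cy, 0.5), rng.uniform(1, 10))
    for attribute, (cx, cy) in group_centers.items()
    for _ in range(25)
]
rng.shuffle(rows)

# "The obvious": sort by the attribute in the first column.
rows.sort(key=lambda row: row[0])


def summarize(sorted_rows):
    """Return {attribute: (row count, mean x, mean y)} for each group."""
    groups = defaultdict(list)
    for attribute, x, y, _magnitude in sorted_rows:
        groups[attribute].append((x, y))
    return {
        attribute: (len(pts), mean(p[0] for p in pts), mean(p[1] for p in pts))
        for attribute, pts in groups.items()
    }


summary = summarize(rows)
for attribute, (count, mean_x, mean_y) in sorted(summary.items()):
    print(f"{attribute}: n={count}, center=({mean_x:.1f}, {mean_y:.1f})")
```

Even this small summary is easier to read than the sorted sheet, but it only works because the attribute column already tells you which rows belong together.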

But when you can apply some analytics to the data, and then consume it with some helpful visual clues such as color, size, shape, shades, labels, axes, legends, and so forth, insight just jumps out at you. Not sure quite how to do that?

No problem -- that's what SAS Visual Analytics has been designed for. It creates a virtual data sandbox that you can query and play around in to determine the appropriate visualization based on the underlying raw data -- a way to reduce the noise. (Visualization? Noise? Sometimes a mixed metaphor actually works out!) Its Autocharting feature displays the most appropriate visual when you drag and drop any combination of categories and measures onto the visualization pane.

Getting back to the matter at hand, the "How Much" question, sometimes you need to discard the outliers and look past the exceptions that give you all that risk trouble and just concentrate on the big picture. Sometimes you need to filter out the noise and get down to fundamentals. Some basic analytical techniques are all you need to turn an otherwise random-looking spreadsheet into tangible and actionable information:

  • Clustering (as described above). Identify common attributes, and discover areas of critical mass in your production, supplier, or customer data. Hierarchical clustering can generate insight into important subsets of your main groupings that you might have otherwise overlooked.
  • Market basket analysis. Similar to clustering and familiar to most B2C marketing functions, this tells you what products should be bundled or which products sell best to particular customer types.
  • Sequence and path analysis. Which shared resource costs or operational activities make up most of the cost of your best-selling products? How do your customers navigate your website, where do they get lost, and when are they most likely to end with a purchase?
  • Mapping. Often, simply representing the data spatially can lead to great insights, such as correlations between specific attributes and physical, political, or cultural geographies. Modern epidemiology was born with Dr. Snow's mapping of cholera outbreaks (right) correlated with contaminated community water sources (neighborhood wells/pumps), a profound insight from a simple technique.
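As a rough illustration of the first technique in the list, here is a minimal k-means sketch in plain Python. It is only one simple way to cluster, not the hierarchical method mentioned above and not a description of how SAS Visual Analytics works; the function name `kmeans` and the eight sample points are hypothetical.

```python
import random
from math import dist


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: returns (centroids, clusters) for a list of (x, y) points."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k randomly chosen points
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            tuple(sum(coord) / len(cluster) for coord in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up points: two loose groups, with no label column at all.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (1.0, 2.5),
          (8.0, 8.0), (8.5, 9.0), (9.0, 8.0), (9.5, 9.5)]

centroids, clusters = kmeans(points, k=2)
for center, members in zip(centroids, clusters):
    print(f"center=({center[0]:.2f}, {center[1]:.2f}), size={len(members)}")
```

The difference from sorting a spreadsheet is that nothing tells the code which rows belong together; the groups come out of the X and Y values themselves. The number of clusters, `k`, is still your call.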

Often just being able to redisplay the data in different graphical formats can be extremely helpful, and even more so when the different formats can be displayed side-by-side simultaneously. What was completely obscured in a spreadsheet may become slightly more evident in a bar or pie graph but smack-me-upside-the-head-duh when properly segmented or displayed as a heat map. The basic differences between mean, median, and mode may seem merely academic if the data is presented in nothing more than a table, but could significantly change the decision outcome when the team gets to internalize the graph and what it implies visually.
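The mean, median, and mode point is easy to check with a few lines of Python. The deal sizes below are invented for illustration, with one large outlier thrown in; only the standard `statistics` module is used.

```python
from statistics import mean, median, mode

# Hypothetical deal sizes (in thousands); the last value is a deliberate outlier.
deal_sizes = [10, 10, 10, 12, 14, 15, 120]

average = mean(deal_sizes)   # pulled upward by the single large deal
middle = median(deal_sizes)  # the middle value once the list is sorted
typical = mode(deal_sizes)   # the most frequent value

print(f"mean={average:.1f}, median={middle}, mode={typical}")
```

In a table, these are three numbers in three cells. On a chart, the one big deal sits visibly apart from the rest, and it becomes obvious why the "average" deal is not the deal you usually see.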

This is what the combination of analytics and visualization does best -- together they filter out the noise so that you are left with the core concerns. Decision making under uncertainty is tough enough -- no sense wasting time and effort striving for precision and accuracy around the wrong variables or issues. No matter how you choose to mix your metaphors, data visualization turns down the noise so that you can hear yourself think.

Is it just me or did it suddenly get quieter? Did you hear that insight? Did you see that insight?

[If you are interested in further exploring the topic of noise in the decision making process, have some fun with this TED Talk by Daniel Wolpert, "The real reason for brains."]

This originally appeared in the SAS blog Value Alley.

Leo Sadovy, Performance Management Marketing, SAS

Leo Sadovy handles marketing for Performance Management at SAS, which includes the areas of budgeting, planning and forecasting, activity-based management, strategy management, and workforce analytics. He advocates bringing SAS’s best-in-class analytics capabilities into finance offices across all industry sectors. Before joining SAS, he spent seven years as Vice President of Finance for Business Operations for a North American division of Fujitsu, managing a team focused on commercial operations, customer and alliance partnerships, strategic planning, process management, and continuous improvement. During his 13-year tenure at Fujitsu, he developed and implemented the ROI model and processes used in all internal investment decisions, and also held senior management positions in finance and marketing.

Prior to Fujitsu, Sadovy was with Digital Equipment Corp. for eight years in sales and financial management. He started his management career in laser optics fabrication for Spectra-Physics and later moved into a finance position at the General Dynamics F-16 fighter plant in Fort Worth, Texas. He has an MBA in Finance and a Bachelor’s degree in Marketing. He and his wife Ellen live in North Carolina with their three college-age children, and among his unique life experiences he can count a run for US Congress and two singing performances at Carnegie Hall.
