If you want to make a name for yourself in the sexy data science field, you'd best get ready to ask a lot of questions. Without them, your modeling will be lackluster at best.
Here's how Mike Swinson, executive vice president at TrueCar, an online car-buying platform, put it during a recent presentation at IE's Predictive Analytics Innovation Summit in Chicago: "The role of data scientists and predictive modelers is to ask the insightful questions, to really hone in on the problem structures and leverage scalable data architectures to the greatest effect." Failing to take the business of asking questions seriously will make the task of extracting useful intelligence from raw data rather difficult, he added.
Let's set the science of data aside for a moment and think instead about the art of it. "This process of asking the right questions and really driving into and framing a problem structure is really what the art of modeling comes down to," Swinson suggested.
And yet, so many analytics teams -- across company and industry type -- shortchange the process, Swinson said. They'll take raw data, pump it into their predictive models, and expect great intelligence to pop out the other end. Voilà!
Far better, as Swinson has shown at TrueCar, is following a four-step process that moves from raw data to contextual information, then to actionable information, and finally to strategy.
Moving from one stage to the next takes asking the right questions, he said.
"To turn that raw data into useful, contextual information that really frames the problem structure takes asking the right questions. To take that information and move to actionable information again takes asking the right questions so you can structure this in terms of a predictive model that you can actually use to drive results. And then to go from that third stage to an actual strategy again takes actual asking of questions to be able to structure your strategies and make use of your various models... for effective use in your business decisions."
Swinson provided a couple of examples of how to ask the right questions, including how this idea comes into play for use with TrueCar's own dealer scoring algorithm. Via the algorithm, TrueCar aims to present the best dealers, in the optimum order, to consumers when they input their car-buying criteria. "It's a problem not too dissimilar to what you'd face, say, if you're at Google and you're in charge of designing the search engine algorithm."
Starting with Step 1, TrueCar has its raw data to consider -- bits of user-entered information such as location and vehicle specifications, prices and other information from dealers, and third-party consumer data, for example. To move through the process, TrueCar has to ask itself about the consumer's behavior so it can better understand from which dealers they'd most want to buy. Pricing, location, and selection are primary factors, but each needs digging into for transitioning raw data into intelligent information.
On pricing, for example, TrueCar might also have to factor in pricing relative to other dealers and to the manufacturer's suggested retail price. On location, radial distance or proximity between a buyer and dealer, based on ZIP code, probably isn't telling enough. More useful might be driving distance or drive time, and even those could need refinement. "A two-hour drive time in Chicago has a different meaning than a two-hour drive time in Billings, Montana," noted Swinson, adding, "All these things need to be factored into the way we're constructing that variable."
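One way variables like these might be constructed is sketched below in Python. The field names, the median-based market comparison, and the `typical_local_minutes` baseline are all assumptions made for illustration; the article does not describe TrueCar's actual formulas.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Offer:
    dealer: str
    price: float          # dealer's quoted price
    msrp: float           # manufacturer's suggested retail price
    drive_minutes: float  # estimated drive time from the buyer


def price_features(offer: Offer, all_offers: list[Offer]) -> dict[str, float]:
    """Express a price relative to MSRP and to the other dealers' prices."""
    market_median = median(o.price for o in all_offers)
    return {
        # Fraction of MSRP the buyer saves (0.10 means 10% below MSRP).
        "discount_vs_msrp": (offer.msrp - offer.price) / offer.msrp,
        # Below 1.0 means cheaper than the median competing offer.
        "price_vs_market": offer.price / market_median,
    }


def relative_drive_time(drive_minutes: float, typical_local_minutes: float) -> float:
    """Scale drive time by a hypothetical 'typical trip' baseline for the buyer's area."""
    return drive_minutes / typical_local_minutes


# Hypothetical offers for the same vehicle specification.
offers = [
    Offer("Dealer A", price=27000, msrp=30000, drive_minutes=120),
    Offer("Dealer B", price=28500, msrp=30000, drive_minutes=30),
    Offer("Dealer C", price=30000, msrp=30000, drive_minutes=45),
]
features = price_features(offers[0], offers)

# The same two-hour drive looks different against different local baselines.
two_hours_area_a = relative_drive_time(120, typical_local_minutes=30)
two_hours_area_b = relative_drive_time(120, typical_local_minutes=120)
```

The point of the sketch is only that the raw values (a price, a drive time) become more informative once each is expressed relative to a sensible reference.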
But that's not enough if the model is to zero in on each buyer as an individual and not as a composite. "DG might be product sensitive and needs to have that silver car with heated leather seats. Roy, on the other hand, might be very price sensitive. He doesn't really care about the specific product. He doesn't mind driving a little bit further as long as he can get the cheapest, best deal he can get with similar specifications."
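The difference between a product-sensitive buyer and a price-sensitive one can be illustrated with a simple weighted score. This is a minimal sketch, not TrueCar's dealer scoring algorithm: the 0-to-1 feature scales, the weights, and the linear scoring rule are all assumed for the example.

```python
from dataclasses import dataclass


@dataclass
class DealerOffer:
    dealer: str
    price_score: float    # 0..1, higher means cheaper relative to the market
    product_match: float  # 0..1, higher means closer to the requested spec
    convenience: float    # 0..1, higher means a shorter effective drive


@dataclass
class BuyerProfile:
    name: str
    w_price: float
    w_product: float
    w_convenience: float


def score(offer: DealerOffer, buyer: BuyerProfile) -> float:
    """Weighted sum of offer features using this buyer's (assumed) sensitivities."""
    return (
        buyer.w_price * offer.price_score
        + buyer.w_product * offer.product_match
        + buyer.w_convenience * offer.convenience
    )


def rank_dealers(offers: list[DealerOffer], buyer: BuyerProfile) -> list[str]:
    """Return dealer names ordered from best to worst for this buyer."""
    ranked = sorted(offers, key=lambda o: score(o, buyer), reverse=True)
    return [o.dealer for o in ranked]


# Hypothetical offers: an exact spec match nearby vs. a cheaper car farther away.
offers = [
    DealerOffer("Nearby exact match", price_score=0.4, product_match=1.0, convenience=0.9),
    DealerOffer("Distant bargain", price_score=1.0, product_match=0.6, convenience=0.3),
]

# Hypothetical weights for the two buyer types described above.
dg = BuyerProfile("DG", w_price=0.2, w_product=0.6, w_convenience=0.2)
roy = BuyerProfile("Roy", w_price=0.7, w_product=0.2, w_convenience=0.1)

dg_ranking = rank_dealers(offers, dg)
roy_ranking = rank_dealers(offers, roy)
```

With these made-up numbers, the same two offers come back in opposite orders for the two buyers, which is the behavior a composite "average buyer" model cannot produce.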
As TrueCar peels back the layers, and leverages all these myriad factors in its models, it's been able to "achieve massive increases in profitability and customer satisfaction," Swinson said. As an example, he cited TrueCar's Net Promoter Score, which, at greater than 70 percent, is one of the highest among Internet companies.
It all comes down to this basic reality: Models themselves aren't inherently intelligent.
"They're really searching for correlations. They have no knowledge whatsoever of causality, and so this is where we as data scientists and predictive modelers can really inform the decision-making process of the model. By asking the right questions, by structuring not only the individual variables but structuring the model itself, we can really drive to an understanding of causality and really isolate the factors that specifically lead to the effects we're looking for."
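A small numeric illustration shows why a correlation alone can mislead. The sketch below uses made-up numbers (not TrueCar data) and a hypothetical `segment` label standing in for a factor the modeler thought to ask about. Pooled together, the two variables look strongly positively related; within each segment, the relationship runs the other way.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def correlation_by_segment(rows: list[tuple[str, float, float]]) -> dict[str, float]:
    """Compute the x-y correlation separately inside each segment."""
    segments: dict[str, tuple[list[float], list[float]]] = {}
    for segment, x, y in rows:
        xs, ys = segments.setdefault(segment, ([], []))
        xs.append(x)
        ys.append(y)
    return {segment: pearson(xs, ys) for segment, (xs, ys) in segments.items()}


# Hypothetical observations: (segment, x, y).
rows = [
    ("A", 1, 3), ("A", 2, 2), ("A", 3, 1),
    ("B", 6, 8), ("B", 7, 7), ("B", 8, 6),
]

# Ignoring the segment, x and y appear to rise together.
pooled = pearson([x for _, x, _ in rows], [y for _, _, y in rows])

# Conditioning on the segment reverses the apparent relationship.
by_segment = correlation_by_segment(rows)
```

A model fed only `x` and `y` would learn the pooled pattern. Knowing to include the segment is the kind of structure that comes from the questions asked before modeling, not from the model itself.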
Do you ask enough questions during your modeling process? Share below.
@Lyndon_Henry I agree, results should help to narrow ideas and keep them from being "too broad." Numerous issues can be identified through these surveys. Work with the most problematic issues, such as security challenges, internal problems, etc. It is important to be able to prioritize and act on survey results with needed improvements.
Beth asks: "...so are you suggesting surveying business users/managers about what business problem they're hoping the analytics will help them address? Or are you speaking more generally of surveys, for customer satisfaction, etc.?"
Well, certainly, surveying the potential users (managers, administrators, planners, etc.) of the results would be a terrific idea.
But I had in general public surveys more in mind — I've had more experience with those.
In either case, you need to try to define the key issues as narrowly as possible. Otherwise, I think you get survey results that are so broad and mushy they are close to useless (except lots of planners and decision-makers like to get results that are so broad and mushy they can interpret them in ways to bolster their own preferred goals).
And if this is supposed to feed into the development of a model, you could really end up with some junk and perhaps a mess.
@Lyndon_Henry -- so are you suggesting surveying business users/managers about what business problem they're hoping the analytics will help them address? Or are you speaking more generally of surveys, for customer satisfaction, etc.?
Beth's article asks, "Do you ask enough questions during your modeling process?"
Where I've found asking (formulating) the right questions to be important has been in surveys. Not that they've all led to analytical models, but they have influenced decision-making, and some have fed into the modeling process. Learning to ask the right questions is also helpful in learning how to identify the critical variables that are needed to make good decisions and build more effective models.
Callmebob, very true. I like how this seems to really force introspection into the value of what is being modeled. Modeling data that may take too long of an arc to answer relative to the business need could benefit from this structure.
Here's where I think that the data wizards need to have a bit of a marketing mind. In fact, they should work in concert with marketing to create models that frame marketing's questions and address its tasks: identifying target markets, market research, product development, marketing mix, and monitoring results.