5 Tips for Predictive Modeling Success

Whether you consider yourself a business intelligence expert, a data miner, or a predictive modeler, you have to be smart about how you think about your discipline, treat your data, and work with the business. You might not hear it expressed as such, but common sense is a must.

Set aside your common sense when working with data, and watch out. "It can really bite you," said Dean Abbott, internationally recognized predictive analytics expert and author of the new book Applied Predictive Analytics: Principles and Techniques for the Professional Data Analyst. "The great thing about predictive modeling and algorithms and machine learning is the models are induced from the data. Of course... the biggest weakness of machine learning techniques is that the models are induced from the data," he told listeners of this week's A2 Radio episode, "The Art of Predictive Modeling" (register now and listen on demand).

Dean Abbott

This is problematic if the data has problems or is biased in some way, he added. That's because "the model will gleefully go and build whatever it finds in the data, which may not be what you expect or what you really want."

During the radio broadcast, Abbott shared a wealth of sensible advice for predictive modelers. Here are five of his tips:

  1. Know your science, but embrace the art of predictive modeling. As a predictive modeler, you're going to have to have algorithm chops. Your linear regressions, K-means clusters, and neural networks aren't going to build themselves, after all. But how you do your analysis may be quite different from the way somebody else does -- and that's OK, Abbott said. "People attack problems in different ways," which gets to the art of the discipline, he said. And, unless you're doing leading-edge work, the science is only going to take you and your model so far. "You [might] build the coolest random forest ever, only to find out it's completely useless because it's not addressing the business objective." In other words, aligning a model to the business objective is an art.
  2. Don't dis domain expertise. Along with big data's arrival on the analytics scene is the belief of some in the data community that "just as long as we have enough data we don't need domain experts any more." That's just not true, Abbott said. Data can be a tricky devil. "We don't understand all the ways that data can deceive us, and if we just rely on the data alone, we can be fooled into thinking we have something that's good when we really don't."
  3. Get IT buy-in from the get-go. Just as much as you need to partner with those keepers of the business knowledge mentioned above, you need to involve IT -- or whoever is holding the data -- in your predictive modeling projects. "You need to know where the data is stored, how one can access that data, and what the data means. Without IT buy-in, many predictive models fail," Abbott said. It happens, he added. "You discover all the data you need to build the models, but the IT fiefdom has erected a wall so high that you can't climb over it and you have to change what you do with predictive modeling or you'll never be able to deploy the model."
  4. Don't be married to your model. Abbott said he smiles when he's asked about iterative modeling because it gets at exactly what he finds most useful when working with predictive models. "And that is getting the model out the door for assessment as quickly as possible, even if it doesn't have all the bells and whistles you'd like in it." Building the model oftentimes isn't particularly difficult, after all. "So if you can get to the end and show this is what the model is seeing and get some quick feedback, you have time to correct all the misconceptions along the way. It's almost always an iterative process, where the first thing that comes out of a decision tree or neural net or whatever the algorithm is isn't what you'll be using ultimately." So, build the model, show stakeholders what it's finding, answer all their questions… and make your adjustments, he advised.
  5. Remember the back-end. As you work on your model, make sure you know how it's going to be deployed. Will it be deployed in software, or will you need to do an ad hoc deployment? Will the model need to fit into some operational system? If the latter is the case, then you need to know what form the model needs to take, Abbott said. Your modeling algorithm may be easy enough to encode using C, Java, SQL, or the Predictive Model Markup Language (PMML), but not so with all the data prep you've done, he warned. So, if you've monkeyed around with the data -- filling in missing values or creating derived attributes, for example -- you're going to have to redo all those computations at scoring time. So it behooves you to keep good notes on that data prep. "So if you're pulling data from a data warehouse and then bringing it into your data mining environment and doing all your data prep, keep note of what you're doing and then as much as possible push that back up to the database so that when you're scoring models later you're pulling data from a modeling table that's already got this data prep built in and then push that out to your scoring, or whatever."
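Tip #5 can be sketched in a few lines of Python. This is a hypothetical illustration, not Abbott's code: the field names (`income`, `debt`) and the specific prep steps (mean imputation, a derived debt-to-income ratio) are assumptions chosen for the example. The point is simply that when the prep lives in one reusable routine, the same computations run at training time and at scoring time instead of being redone by hand.

```python
# Keep data prep in one place so training and scoring use identical computations.
# Field names and prep choices here are hypothetical, for illustration only.

def fit_prep(rows):
    """Learn prep parameters from training data -- here, the mean of
    'income', used later to fill in missing values."""
    incomes = [r["income"] for r in rows if r["income"] is not None]
    return {"income_mean": sum(incomes) / len(incomes)}

def apply_prep(row, params):
    """Apply the same prep everywhere: impute missing income and add a
    derived attribute (debt-to-income ratio)."""
    out = dict(row)
    if out["income"] is None:
        out["income"] = params["income_mean"]
    out["debt_to_income"] = out["debt"] / out["income"]
    return out

# Training-time prep...
train = [
    {"income": 50000.0, "debt": 10000.0},
    {"income": None,    "debt": 5000.0},
    {"income": 70000.0, "debt": 21000.0},
]
params = fit_prep(train)
prepped_train = [apply_prep(r, params) for r in train]

# ...and the identical prep at scoring time, so the deployed model sees
# features computed exactly the way it was trained on them.
new_row = apply_prep({"income": None, "debt": 12000.0}, params)
```

In practice, as Abbott suggests, this same logic can be pushed back into the database as a modeling table, so scoring pulls data with the prep already built in.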

For more of Abbott's advice, tune into the show on demand and read through the Q&A on our message board below the player. And share your own common sense advice for working with predictive models below.

— Beth Schultz, Editor in Chief, AllAnalytics.com


Re: Yay for #4
  • 8/31/2014 12:05:08 AM

What a coincidence! I'll read that shortly. The iterative process may apply not only to modeling for customers; I can see it working for competitors, vendors, and even internal production processes, where occasional changes will dramatically affect the previous models.

Re: Yay for #4
  • 8/29/2014 1:16:43 PM

@magneticnorth, funny you should mention the value of using iterative modeling in finance. We just posted on that very topic! If you haven't yet had a chance to read the post, you might like to! See: Model & Remodel to Find Profitable Customers

Yay for #4
  • 8/29/2014 3:20:31 AM

I like #4! It's like agile modeling. It totally makes sense in many contexts, especially in Finance. An iterative process will accommodate companies' gradual growth.

Re: Covering the bases
  • 8/25/2014 9:05:59 AM

The tips brought home the idea of getting IT aboard the project. As stated, the high walls IT fiefdoms sometimes erect can bring the process to a halt. Getting all those folks aboard will certainly smooth the way for building models and implementing them successfully.

Re: Covering the bases
  • 8/22/2014 10:52:48 AM

Right, @Jim. Abbott clearly knows his stuff, but it's refreshing to hear how much he values the opportunity to learn from others, too -- looking over their shoulders and watching how they'd tackle a problem versus how he would himself. Not everybody at his level is so willing to set egos aside and open themselves to new approaches and ideas.

Covering the bases
  • 8/22/2014 8:53:50 AM

One of the things that Abbott brought to the radio show was the ability to look at the variety of factors (and people) that play a role in predictive analytics. On one hand, he could delve into the technical aspects of building a model -- and the technical chops required on the team -- and on the other, he recognized the value of domain expertise.