Nobody wants to be in the position of trading away accuracy in analytical work for speed, but sometimes you don't have much choice. The business has questions it wants answered, and you need to deliver -- the sooner the better.
Problem is, "as you're enabling more and more data to become usable, there are more and more questions being asked," Chris Gifford, senior vice president of customer analytics at JPMorgan Chase, recounted during yesterday's "The Big Reveal: What's Your Data Telling You?" session at SAS Global Forum Executive Conference 2013. Maybe the organization no longer wants to know about customer behavior in aggregate, for example, but on an individualized basis -- taking into account all touch points.
And so, you may well find yourself waiting... and waiting... and waiting for your answers as your models chug through all those bits and bytes. And then you may find yourself doing things that go against best-practices wisdom.
You might, as did the Chase customer analytics group Gifford heads, decide to model only some of the rows in your database rather than all of them. Or you might look at only some attributes and leave out others, or settle on one algorithm rather than test several to find the best solution to a problem. And maybe you wouldn't test your model quite as much as you really ought to, explained Gifford, who also told some of the Chase story during Sunday night's combined opening session for SAS Global Forum 2013 and the executive conference, both taking place in San Francisco this week (the session is available to watch on demand).
At Chase, Gifford said he knew he needed to figure out a way that would let his analytics team stop cutting corners.
With a goal of increasing the speed and accuracy of the customer analytics models, Gifford said he began investigating high-performance computing solutions, like SAS High-Performance Analytics (HPA), that would cut the time required for modeling runs. On first seeing SAS HPA in action, Gifford admitted, he was a bit skeptical. "Could it really run fast with our models and our data?" he recalled thinking.
The proof-of-concept testing answered with a definitive, "Yes."
During that testing, his team ran a number of its traditional models. One, a risk model for its mortgage business, had been taking about 160 hours -- nearly one full week -- to complete. HPA testing showed the same model, with the same data, running in about 84 seconds. A credit risk model, which took 14 hours using the traditional SAS approach, ran in 180 seconds on HPA, Gifford said.
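To put those run times on a common footing, here is a quick back-of-the-envelope sketch in Python. The function name `speedup` and the variable names are illustrative, not anything from Chase or SAS; the only inputs are the four run times quoted above.

```python
SECONDS_PER_HOUR = 3600


def speedup(before_hours: float, after_seconds: float) -> float:
    """Return how many times faster the new run is than the old one."""
    return before_hours * SECONDS_PER_HOUR / after_seconds


# Run times as quoted in the article (hours before, seconds after).
mortgage_speedup = speedup(before_hours=160, after_seconds=84)
credit_speedup = speedup(before_hours=14, after_seconds=180)

print(f"Mortgage risk model: about {mortgage_speedup:,.0f}x faster")
print(f"Credit risk model:   about {credit_speedup:,.0f}x faster")
```

Taken at face value, the quoted figures work out to a speedup in the thousands for the mortgage risk model and in the hundreds for the credit risk model.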
Needless to say -- although he did -- "those kinds of improvements are fantastic." And with them, we suddenly have options again, he added.
You can continue to chase the efficiencies and do more models per statistician per day or week. Or you can improve your accuracy if you're willing to trade off some of that newfound speed. You can test additional algorithms, increase the sample counts -- more rows, more columns -- or perform more tests.
HPA has significantly improved the speed -- roughly 6,900 times and 280 times faster, going by the run times Gifford cited for the two examples -- while improving accuracy. That's of no small significance for a financial institution the size of Chase. As he noted, "as we reduce the type one and type two errors for this kind of use case, it turns us into being able to say yes to more of the good guys and no to more of the bad guys -- so, yes, there are very big impacts."
At the end of the day, bringing data down to all these servers gives modelers the chance to rethink how they do things, too, he added. "The speed gets them ahead."
In fact, traditional business analytics speeds aren't going to cut it in the future. "As you keep moving more and more data to the edge of real-time decision making, you have to engage in this process."
Are you ready for HPA? Share below.