One of the great things about my work is that I get to travel and meet really smart people the world over. On Election Day, for example, I happened to be attending an analytics event and had the opportunity to do plenty of offline conversing on who would win.
Just imagine a bunch of quants sitting around a lunch table talking about all the various possible election outcomes, their related probabilities, the evils of the Electoral College, the possibility of an electoral tie, and on and on. Aside from the obvious danger of waking up nose-deep in a garden salad, that's a truly fascinating conversation, right? Well, for me, not so much.
Why? Because who would win had been obvious to me for quite some time. The data were pretty clear.
Unfortunately for me, I hadn't quite gotten around to starting that blog I've been threatening to write for the last couple of years, and so I hadn't published my erudite, quantitatively sound, and ultimately correct prediction. Some guy named Nate Silver did.
That was my reaction, anyway. Apparently I'd somehow managed to avoid hearing about the predictive wunderkind who took the political prognostication realm by storm back in 2008. Fortunately, the event's host had the foresight to give each attendee a copy of Silver's new book, The Signal and the Noise: Why So Many Predictions Fail -- But Some Don't. It had been waiting for me on my place setting that very morning. That was great, because I love books, but too bad, because I just don't read them anymore. Don't get me wrong -- I'm a voracious reader -- but the only time I read something that's printed on actual paper is during that stretch of a flight when I have to turn off my e-reader between the ground and 10,000 feet. The only notable exception is the output of one tree-killing colleague of mine who insists on printing absolutely everything. You know who you are.
Anyway, not wanting to spring for the e-reader version with a perfectly good -- if somewhat anachronistic -- actual book in my hand, I decided to wade into the pulp. It turns out that Silver worked as an economic consultant for a big accounting firm, got bored with that (unimaginable, I know), wrote some software to predict baseball player performance à la sabermetrics, and then founded a political blog, now on The New York Times, called FiveThirtyEight. It was on that blog that he correctly predicted the outcome of the 2012 presidential election in all 50 states. Four years ago, he only got 49 out of 50 correct (darn that pesky Indiana). It might be easy to argue that 2008 was a fluke, but repeated success is somewhat more difficult to discount.
Because of that success, Silver has received a lot of attention. The question I ask is not how he did it, but rather why his success seems to be such a singularity (setting aside the fact that I wasn't in the game yet). Why aren't lots of other people coming up with similar predictions and similar rates of accuracy? Perhaps others are, but there are certainly a lot of people trying who aren't, and most of them, sadly for us, seem to have found employment as TV talking heads. I believe one possible answer to my question is foundational to the big-data challenge, and it's embodied in my new favorite quote, from the introduction to Silver's book:
We face danger whenever information growth outpaces our understanding of how to process it.
Does this problem sound familiar to anyone?