Pass the Salt Along With That 'Data Science'
Re: But George Bernard Shaw said ...
  • 7/30/2013 9:35:46 PM

In my PhD qualifying exam one of the issues was the critical need not to confuse correlation with causation. If academia practiced that, 80% or more of what passes for research would be trashed.

Re: TLA, the three letter acronym for " three letter acronym"
  • 7/30/2013 9:32:06 PM

What you are describing are two challenges that reality poses to science: complexity and limited controllability. The former is partially addressed with simplified models of the reality being studied, the latter with statistical controls. But both methods compromise the requirements of science, and one must interpret any results with extreme care.

I doubt that what now goes by the label "data science" appreciates that.

But George Bernard Shaw said ...
  • 7/30/2013 8:19:45 PM

George Bernard Shaw was quoted as saying "There are three kinds of Lies: Lies, Damned Lies, and Statistics." Example: Wisconsin has the highest per capita consumption of cheese and also has the highest incidence of rectal cancer; therefore Cheese causes Rectal Cancer.
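
The cheese example can be reproduced with made-up numbers. The sketch below is only an illustration on synthetic data, not real Wisconsin figures: it assumes a hypothetical hidden factor that drives both "cheese" and "cancer", which have no direct link to each other. The variable names and the helper functions `pearson` and `residuals` are invented for this example.

```python
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def residuals(ys, zs):
    """What is left of ys after removing its linear dependence on zs."""
    n = len(ys)
    my, mz = sum(ys) / n, sum(zs) / n
    slope = (sum((z - mz) * (y - my) for y, z in zip(ys, zs))
             / sum((z - mz) ** 2 for z in zs))
    return [(y - my) - slope * (z - mz) for y, z in zip(ys, zs)]

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical hidden factor; "cheese" and "cancer" each depend on it,
# but neither depends on the other.
hidden = [rng.gauss(0, 1) for _ in range(5000)]
cheese = [2 * h + rng.gauss(0, 1) for h in hidden]
cancer = [2 * h + rng.gauss(0, 1) for h in hidden]

raw = pearson(cheese, cancer)
controlled = pearson(residuals(cheese, hidden), residuals(cancer, hidden))

print(f"raw correlation:            {raw:.2f}")
print(f"after controlling for hidden: {controlled:.2f}")
```

The raw correlation comes out strong even though nothing in the simulation makes cheese cause anything; once the hidden factor is controlled for, the correlation all but disappears. In real data the hidden factor is rarely known or measured, which is the whole problem.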

TLA, the three letter acronym for " three letter acronym"
  • 7/30/2013 8:07:13 PM

My father, a noted Reliability Engineer, took umbrage at IEEE starting a section on Software Reliability: "software is a logical mathematical construct and either works or it doesn't". However, systems can be so complex that they are better described empirically than heuristically; "Data (Pseudo)Science" purports to do the opposite. I think of the trillions of data points that Boeing generates testing wings, processed through supercomputers to a convergent solution, contrasted with long-term weather prediction, where the huge number of interacting variables is in a constant state of flux. Perhaps a simpler system to understand (and one critical to our future survival, as its state is "reasonably static") is the orbits of objects in the Asteroid Belt: known to a fairly accurate degree, yet subject to random variations of Solar Wind (Solar Flares are unpredictable), accretion and disintegration of objects, new external matter, planetary positions, the action of resultant asteroid motions on planetary positions, etc. Similarly (cf. the Heisenberg uncertainty principle), Humint (Human intelligence; people data) can be recursively perturbed simply by asking survey questions.

Re: Blame the media!
  • 7/29/2013 12:52:01 PM


You can call it research, but it's not scientific research, which is what your father meant. That's why I refer to it as systematic study, but even that might be too much because it's not that systematic.

I would not be surprised. Anybody who was then on CS won't forget me :).



Re: Blame the media!
  • 7/29/2013 10:50:20 AM

Ha -- yes, this reminds me of when my Dad (eminent materials scientist, responsible for such things as chromium dioxide and gallium arsenide) used to make fun of my Mom, who worked for a state agency as a "Research Analyst," keeping track of how the agency's services were utilized by their clients (blind people). "That isn't research!" he'd say.

Fabian, your name is familiar.  Do I remember you from... CompuServe, back in the day?

Re: Blame the media!
  • 7/22/2013 12:30:38 AM


I thought my piece made very clear what the incentives and opportunities for "data science" are.

Statisticians are methodologists. They can find a correlation between some medical variables, but that must be interpreted: what is the physiologic mechanism that gives rise to the correlation, is it maybe spurious, etc. Medical knowledge is also required to assess whether the variables measure the right thing, whether the right controls were in place, etc. That's why in the life sciences you see long lists of authors, some health specialists and some statisticians, mathematicians or computer analysts.

It's the same in business or any other substantive field.

We had an excellent recent example: the Black-Scholes model of pricing options. It was used as if it were a law of nature right at my alma mater, the University of Chicago, by quants who crunched "big data". Got a Nobel for it too. Then the market crashes invalidated it by exposing one basic assumption: that market conditions don't change. A similar and related case was the collapse of LTCM (Long-Term Capital Management, if I recall the name correctly).

People get enamored of math and stats and data and computers, call it science and wish the world to behave accordingly and predictably.
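
To make the Black-Scholes point concrete, here is a minimal sketch of the textbook formula for a European call option. The formula takes volatility and the interest rate as fixed constants over the life of the option, which is one way of reading the "market conditions don't change" assumption. The function names and the input numbers are illustrative choices, not taken from any real trade.

```python
import math

def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_scholes_call(spot, strike, rate, vol, years):
    """Textbook Black-Scholes price of a European call.

    vol and rate are single constants: the model assumes they do not
    change between now and expiry.
    """
    d1 = ((math.log(spot / strike) + (rate + 0.5 * vol ** 2) * years)
          / (vol * math.sqrt(years)))
    d2 = d1 - vol * math.sqrt(years)
    return spot * norm_cdf(d1) - strike * math.exp(-rate * years) * norm_cdf(d2)

# Same option priced under two volatility assumptions (illustrative inputs).
calm = black_scholes_call(spot=100, strike=100, rate=0.05, vol=0.20, years=1.0)
stressed = black_scholes_call(spot=100, strike=100, rate=0.05, vol=0.40, years=1.0)

print(f"price assuming 20% volatility: {calm:.2f}")
print(f"price assuming 40% volatility: {stressed:.2f}")
```

The formula itself is just arithmetic. The price it returns is only as good as the single volatility number fed into it, and if that number shifts during a crash, the "law of nature" gives a very different answer.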


Re: Blame the media!
  • 7/21/2013 11:15:16 PM

"Analysing data is a straightforward matter for statisticians or engineers, but not for many executives. As a result there is demand for more commercially savvy IT professionals, known as data scientists, and business schools are responding to this call." There's the exact quote from the article I referenced from FT. You can't make this stuff up!

A couple of items
  • 7/21/2013 10:33:24 PM

Came across these two items that I think have relevance for this discussion.

See if you can figure out why.


Statin Nation (VIDEO)


Google Surveys Can Make Anyone A Professional Pollster TechCrunch



Re: Blame the media!
  • 7/21/2013 1:22:47 AM


That may be the case for natural science.

I had a natural science education (lots of math, physics, chemistry, stats) and when I ended up in social science I was quite disappointed and left academia.

Independent, critical thinking, logic and reasoning are some of the most important components of a science education; another is the history of the chosen field. Not much of that remains--it's "theory and therefore not practical". It's mostly tool training these days.

