Comments
Analyzing Big-Data Takes Statistics Galore
Re: Big Data Takes Statistics Galore
  • 10/22/2013 10:16:56 PM

@SRS1

RE: could be worse than one might expect

The expertise to address missing data, measurement error, and misclassification (wrong x) is the same expertise needed to identify these problems.  When we talk to people without this training, they say everything is fine.  We do not use statistical validation enough and we do not review results often enough.  Had AIG checked their models, they would have noticed and possibly avoided their problem years earlier.

Big Data Takes Statistics Galore
  • 10/21/2013 11:57:15 PM

As mentioned in the post, lost data leads to incomplete data, which could be worse than one might expect. The question might be how much data was lost and over what time period. Large losses of data could lead to much more than just poor decision making.

Re: Agreed
  • 10/18/2013 7:51:18 PM

@CandidoNick,

Exactly, and when is information complete?  Our universe is not deterministic; it is stochastic.  We have a cultural craze for pretending that we know everything.  We 'know' that E = MC^2, yet it does not.  This is an approximation based upon data, E = MC^2 + error.  The only deterministic equations come from deduction or definitions.  Everything else is subject to uncertainty.

Re: Agreed
  • 10/15/2013 1:13:44 AM

'Complete' is a tough word in the data world because there can always be more to discover. More angles to take data from, more means of analyzing, and more conclusions to be drawn. When making Big Data aphoristic, you relinquish even more.

Re: Agreed
  • 10/14/2013 7:15:14 PM

I relate.  Sometimes, I get rounded data from the 'black box.'  E.g., someone is 'helping us' by rounding the data to the significant digits and there goes our estimate of the variability.  We might end up with two values, e.g. 0, .1.  Another problem is data-entry software that prevents logical values.  I worked on a litigation where the data-entry software blocked information that we needed. 
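To make the rounding point concrete, here is a minimal sketch in Python. The measurements are made up for illustration, and rounding to one decimal place is only an assumption about what the 'black box' might do.

```python
import statistics

# Hypothetical raw measurements (invented for illustration only).
raw = [0.02, 0.04, 0.03, 0.07, 0.08, 0.06, 0.04, 0.07]

# Assumed upstream step: someone "helpfully" rounds to one decimal place.
rounded = [round(x, 1) for x in raw]

# Sample standard deviation before and after rounding.
sd_raw = statistics.stdev(raw)
sd_rounded = statistics.stdev(rounded)

print("distinct rounded values:", sorted(set(rounded)))
print("sd of raw values:       ", sd_raw)
print("sd of rounded values:   ", sd_rounded)
```

With these numbers only two distinct values survive the rounding, and the standard deviation computed from them is far from the one computed on the raw values, so the variability estimate is no longer trustworthy.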

Re: Agreed
  • 10/14/2013 5:48:30 PM

No, I usually get data that has been filtered or cleaned. I always suspect these processes, especially when they can't tell me how and why the data was cleansed.

Re: Agreed
  • 10/14/2013 5:31:27 PM

@tomsg, Jeff, & CandidoNick,

Have you seen this portrayal of Big Data as complete information, and if so, what do you think about that?

Re: Agreed
  • 10/13/2013 8:18:08 PM

I agree. The problem is that most will not realize what they have lost and whether it is important or not.

Re: Agreed
  • 10/13/2013 7:34:23 PM

This is akin to describing something complex in layman's terms. There's a reason it's complex. Normalizing data means meaning is lost in the translation from complex to simple.

Re: Agreed
  • 10/11/2013 12:15:59 PM

Ack.  I see that too.  This article talks so much about incomplete data, and I kept thinking, 'yeah, what about just plain wrong data?'  Then you put stats and algorithms on top of incorrect observations and/or transactions.  Yikes!
