Comments

When will the draw results be published, and where and how will they be communicated?

Prospector

Thanks to all of you.  I hope you enjoyed the chat.  I did!

Prospector

And, of course, thank you Bill for sharing all your great insight!

Blogger

Thanks everybody for participating. As a reminder, I will be drawing for a free copy of the book, which you can also find here!

Blogger

lfu -

Tons of medical possibilities.  Setting privacy aside for a moment, all of the sensor readings, tests, etc. that are tracked these days can yield a treasure trove of insight into disease patterns, treatment effectiveness, and more.  Medical applications could be some of the most beneficial uses of big data over time.

Prospector

My post didn't go through for some reason.  GREAT analysis is:

Guided by a business need

Relevant to the decision makers

Explainable and easy to use for making decisions

Actionable with specific recommendations

Timely and available when decisions will be made

Prospector

Bill, do you have time to answer that one last question, from lfu@hsc ...

Blogger

ritesh -- two separate links but on the same technology. :-)

Blogger

No question, but thanks to our guest Bill for a stimulating discussion.

Blogger

In the context of medical data, we are so used to acquiring, storing, retrieving, and analyzing "coded data".  Big data does not fit too well with such a model.  Do you see any analytics applications in medical data?

Prospector

good day to all :)

Prospector

thank you good info 

Prospector

Thanks Beth and Lyndon for the link.

Prospector


Riteshpatel, I think (hope) this is a link to what I was talking about...

http://www.sas.com/technologies/bi/visual-analytics.html


Blogger

Bill, one last question from me: I recall in the book you used GREAT as a way to explain the need for really great analysis vs., I guess, OK analysis. Can you quickly run down what GREAT stands for?

Blogger

Everybody, we're approaching the top of the hour so if you've got any last questions, get 'em now! 

Blogger

Example from my field (urban mass transit) ... Our transit authority (which I helped create) bought video cams for all its buses. This is a powerful data-gathering tool. However, because of cost, it has so far activated only a fraction of these. 

I might point out that a lot of passengers think they're being recorded and their conversations monitored, whereas only a minority of buses actually have active cameras. Must be some kind of moral there somewhere...

Blogger

Bill, ah, OK. I would definitely expect the etailers in there.

Blogger

riteshpatel - ETL gets to the heart of the challenge with big data.  The MapReduce framework is perhaps at its best in doing the complex ETL required to extract valuable, structured information from big data.  It helps scale that process so you can extract the pieces you need to flow into traditional reports and analysis

Prospector
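Bill's point about MapReduce-style ETL can be sketched minimally in Python. The log lines, field layout, and page-count aggregation below are all hypothetical, chosen just to illustrate the two phases that frameworks like Hadoop distribute across machines: map (parse each raw line into key/value pairs) and reduce (aggregate by key).

```python
from collections import defaultdict

# Hypothetical raw log lines in the form "timestamp user_id page".
raw_logs = [
    "2012-05-01T10:00:01 u1 /home",
    "2012-05-01T10:00:05 u2 /products",
    "2012-05-01T10:00:09 u1 /products",
]

def map_line(line):
    """Map step: parse one semi-structured line into a (key, value) pair."""
    _, user, page = line.split()
    return (page, 1)

def reduce_pairs(pairs):
    """Reduce step: aggregate the mapped pairs by key."""
    counts = defaultdict(int)
    for key, value in pairs:
        counts[key] += value
    return dict(counts)

page_views = reduce_pairs(map(map_line, raw_logs))
print(page_views)  # {'/home': 1, '/products': 2}
```

In a real cluster the map and reduce functions run in parallel over many data partitions; the structured output (here, page counts) is the piece that flows on into traditional reports and analysis.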

Beth,

I would rate e-commerce and online companies as the clear leaders.  After all, Google, eBay, Yahoo, and others almost defined the whole concept of big data.

Telecommunications companies are also very good.  Some of the work they are doing with detailed information about tower transmission of calls and such is fascinating.

Prospector

Lyndon_Henry, can you please share the link, if you have it handy.

Prospector

An issue I have faced with different tools is that I need to spend a huge amount of time preprocessing data before the tools can use it and deliver results at a reasonable speed.

This preprocessing ETL in turn has limitations, as it resides in the database layer.

Prospector

There was a blog on this forum recently about the SAS visualization package, able to gulp down and process amazing amounts of data really fast.

Blogger

Lyndon - you hit on a big one...privacy.  It is a huge issue with big data.  There have been news stories almost every week about some new privacy issue that has arisen.  This will be a very important topic as big data evolves.  Security of data will only become more important.

Prospector

Fair enough, Bill. But I am surprised that retail may be behind in big-data. Why is that? And if it's behind what industry is ahead?

Blogger

Rka, I haven't tried it, but I'm planning a demo soon.

Prospector

Speaking of the sampling issue and Big Data ... The technological capability seems to be there to gather vast amounts of useful data, but implementing this technology can get expensive. What seems to drive a lot of the investment is marketing needs (private sector) and security (public and private). These provide a rationale for allocating heavy investment into the resources needed to collect, digest, and analyze the data.

This all then triggers the privacy etc. issues that have a lot of the public spooked...

Blogger

Beth

The reason for my retail examples is much less meaningful than it might seem.  I have a soft spot for retail and have done more work in that industry over time than in others.  So, it's just top of mind.  In fact, I would classify retail as somewhat behind the curve with big data in most cases.

Prospector

Ritesh, have you explored the SAS visualization package?  I don't recall what size of data it is able to handle, but 5TB sounds right.  We attended a demo recently and I found it very powerful.

Prospector

Bill, retailers of varied sorts seem to crop up in your answers a lot! Is that because retail is more advanced in use of big-data and big-data analytics than other industries?

Blogger

Beth -

Yes, I have seen innovation centers in practice.  In many ways, the "data scientists" in many of the technology businesses are effectively running innovation centers, though they don't call it that.  They are tasked with finding new and interesting ways to use data to help the business.

I have also seen "old school" businesses adopt it.  One grocery store chain set up a team to try all sorts of new customer, assortment, and forecasting analytics.

Prospector

riteshpatel - I think you've hit on a key point.  Most of the visualization tools only scale so far as of today.  You can't necessarily visualize all 5TB of your data.  We're probably back to our other discussion thread...you may have to do some sampling to use some of the tools today.  For many questions, this will be just fine.

Prospector

Bill, have you seen such innovation centers in practice? If so, any examples to share (if not by company name then by the type of innovative tasks undertaken and knowledge gained?).

Blogger

Can you give an example of such a visualization tool?  As a team we have tried OBIEE, Tableau, MicroStrategy, and an in-house built application, but everything fails after a certain point.  The reason is exactly what you mentioned - the inflow is so high that the application becomes obsolete before we realize the value of the existing solution.

Prospector

Beth - an innovation center is a mix of people and tools with some expedited processes.  The way it usually works is that some system resources are dedicated to the innovative tasks.  And, if not a few full time people, some set percentage of some people's time.  The key is to put some formal priority on the experimentation and innovation in analytics.  Many companies have R&D departments for their products, why not also for analytics?

Prospector

"not sampling just causes a lot of extra processing for no extra benefit" - YES!!!

Increasing the granularity of the data (I had to avoid the word sampling!) just because of the availability of hardware and storage doesn't yield any greater insight

Prospector

Bill, to tag team on Lyndon's question, does big-data analytics require a different sort of analytics professional than we've known to date, in terms of technical skills and, perhaps, business knowledge?

Blogger

BI/DW projects start and fly off great with sample/demo data, but when they go live...boom.  Data quality, performance, scalability, and long life cycles hit the value.

Prospector

riteshpatel.  There are numerous visualization tools out there and they are getting better every day.  Many have started allowing in-memory analytics as well.  I think business people don't much care about cloud vs server vs pc tools.  They just want something that meets their needs.  There are tools out there that are doing some neat things with visualization

Prospector

Bill, you noted 

>>Another approach is what I call an "innovation center".  This is where you have people tasked with exploring data proactively and experimenting to see what analytics it can drive.  The idea is to let the experimentation drive the requirements.<<

Any examples of how this is implemented in actual practice? Where is such an innovation center in operation in an actual organization? Personnel dedicated to it, or just an adjunct to other tasks?

Blogger

I am really wondering if there is any tool in industry that can allow to do powerful visualization on any kind of raw data. Be it sales data, or web traffic or spending. Different business group wants to view it differently.

Best way is to have visualization tool that can run securely in the cloud and allow capability of desktop tool.

Prospector

RKA - good question on sampling.  Sampling can still be relevant in the world of big data.  As always, it depends on the scope of your analysis.  If you just want to know what percent of customers visit a certain part of a website, a random sample of customer sessions can work just fine.

In fact, I think there are times when not sampling just causes a lot of extra processing for no extra benefit.

Prospector
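Bill's sampling point can be illustrated with a quick sketch. The session data below is simulated (a hypothetical website with a roughly 20% rate of visiting a certain page), but it shows how a 10,000-session random sample estimates the proportion almost as well as scanning all one million records.

```python
import random

random.seed(42)  # fixed seed so the run is repeatable

# Simulated data: 1,000,000 customer sessions; each flag marks whether
# the session visited the page of interest (true rate ~20%).
sessions = [random.random() < 0.20 for _ in range(1_000_000)]

# Full scan: the exact proportion, but it touches every record.
exact = sum(sessions) / len(sessions)

# Random sample of 10,000 sessions: a fraction of the processing.
sample = random.sample(sessions, 10_000)
estimate = sum(sample) / len(sample)

print(f"exact={exact:.4f}  estimate={estimate:.4f}")
```

With n = 10,000, the standard error of the estimate is about sqrt(0.2 × 0.8 / 10,000) ≈ 0.004, so the sampled answer typically lands within half a percentage point of the full scan; for a question like "what percent of customers visit this page," the extra processing of the full scan buys essentially nothing.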

With computers getting faster, and with huge storage and memory, a lot of what used to be considered a large dataset can be handled very easily internally.  Excel used to have a limit of 256 columns, but not anymore - more like 16,000 now.

So, one probably has to clearly cross the terabyte level (at least) before claiming to be "Big".

The other day I heard someone refer to zettabytes.  That must be a billion terabytes.

Prospector

pkrishna brought up a good point.  And my question is: what happens to data sampling, and to the way analysis was done in the past via extensive sampling?  Are the past analyses obsolete?

Prospector

Beth, yes it is possible for projects to be small in scope, yet require a lot of data. 

I know of a retailer who simply identified people who browsed products but didn't buy.  That required processing through a lot of data, but the actual analytics and mechanics were simple.  They got a huge ROI.

Starting small makes a lot of sense

Prospector

Bill, you responded to bkbeverly by saying that big-data might require analysts to think differently. But does big-data necessarily mean big project? I fear sometimes that that's the inference when it doesn't need to be. So I guess another way of asking this is, Can big-data projects be small in scope?"

Blogger

Certainly what is "big" for Amazon or e-bay is not nearly the same as what is big for a small online retailer.

Prospector

pkrishna

The definition of big data is something that is constantly debated.  The most widely accepted definitions are that big data is something bigger than your current/traditional tools can handle well today.  So, you either have to upgrade to more of the same and/or add new tools to the mix.  So, what is big data in one industry or for one company may not be big for another

Prospector

bkbeverly

Certainly some of these sources are much more granular than in the past.  Take that sensor data I discussed.  We are talking data at millisecond or finer levels.  Far different from weekly break/fix reports.  So, I do think some of this data requires thinking differently.

I think the key to resolving conflicts is to think through how the various data sources can be connected effectively.  Perhaps they can't be in all cases.

Prospector

Pardon me if I am being too elementary.  Just the sheer size of the data doesn't qualify it as "Big Data" - would it?  We've been collecting EEG data with 64 channels, then 128, some with 256, at 10kHz, and after a few hours it becomes huge, but still quite homogeneous.  So it won't be considered "Big Data" - right?

Prospector

Interesting question, bkbeverly

Blogger

Bill, back to my infrastructure question, you say it may be beneficial to get a MapReduce environment -- what are some ways a company can recognize it's at the stage where it needs to start thinking about this?

Blogger

Hi Bill,

From the standpoint of temporal consistency, what assumptions do you make about big data?  Does it all represent the same time period?  The context of the question is that I think with big data, we are trading time for space.  We acquire a lot of rich substance, but I wonder if there are major variations in the time periods that big data represent. If that is the case then you are trying to match synchronic dynamics with diachronic dynamics - basically trying to find relationships between data elements that are temporally out of sync. Thoughts?

Data Doctor

Beth, good question on internal vs external.

Back to my earlier point that being big or being unstructured doesn't inherently say anything about value.  Similarly, internal or external doesn't inherently matter.  If there is a valuable external data source you can get your hands on and it improves your analytics, have at it!

Prospector

Bill, you mention exploring data -- does this include external data sets (more public stockpiles seem to be coming available each day!) or internal only?

Blogger

Beth, yes infrastructures may need updating.  For the initial handling of big data, it may be beneficial to get a MapReduce environment such as Hadoop or AsterData

Prospector

Another approach is what I call an "innovation center".  This is where you have people tasked with exploring data proactively and experimenting to see what analytics it can drive.  The idea is to let the experimentation drive the requirements.  You don't have it figured out up front, so you experiment as a starting point.

Prospector

So it's one thing to talk about this in theory, but what about in terms of actual implementation. If you've got all this great new information/data at your disposal does that mean you need to rethink what you have in place for analyzing it? In other words, won't big-data analytics tax most existing infrastructures?

Blogger

Good question riteshpatel. 

It should start with identifying some business problems.  Then, brainstorm on if and how a given data source can help that problem.  If you find a match, do some experimentation.  Another approach...

Prospector

My question is: how do we identify the potential of data without a lot of requirements exchange with the business?

Prospector

In the past, they would only have seen when things break and then tried to see if they could figure out why.  The sensor data allows them to see much more cleanly.  And, to identify early warning indicators to be more proactive.  That can change everything

Prospector

Back to the point of having a new set of information that you didn't have in the past.  That can only improve your forecasting and planning processes.  Just imagine how much better manufacturers of machinery and engines can assess the lifecycles of their engines with the masses of sensor data on pressure, temperature, etc throughout the lifecycle.  In the past...

Prospector

I'm curious in particular about planning and forecasting best practices myself, too!

Blogger

A few of my points might apply to WaltDitto's question.  But let me add something to further address it as I did miss it...

Prospector

Bill, I wanted to be sure you'd seen the question from WaltDitto: Bill, In what ways are you seeing big data expand the frontiers of what organizations are capable of doing? In particular, how is big data pushing back the horizons of planning and forecasting best practices?

Blogger

What is really important is what you will do with big data to drive value.  That's true of any data.  The fact that it is big or unstructured really doesn't matter when it comes to deciding if you need to use the data and what value it will drive.  It only matters to the extent that it impacts what tools and techniques you may have to use.  But the important decision is whether the data has value or not.

Prospector

Yes.  Big data isn't just unstructured data.  Much of it fits that definition, but not all of it.  And, not all unstructured data is big data.  In fact, I think too much emphasis is being put on the definition of big data lately.  Let me explain...

Prospector

So Bill, a lot of people associate big-data with unstructured data. Is that too simplistic a view?

Blogger

Looking at browsing history, for example, takes us beyond just what a customer bought and what offers they replied to.  We can now see how they shop and what they are thinking of buying.  That makes analytics much more powerful and predictive.  Similarly, a lot of the sensor data is new information. 

Prospector

ah, I was just going to ask you, Bill, what's different about big-data as opposed to any other sort of data you might have had to add into the mix? But seems you're planning on answering that anyways!

Blogger

One of the biggest benefits of many big data sources is that they are often completely new information that is not redundant to other data you already have.  I will explain...

Prospector

Bill, In what ways are you seeing big data expand the frontiers of what organizations are capable of doing? In particular, how is big data pushing back the horizons of planning and forecasting best practices?

Prospector

As an analytical professional, I have always wanted to get all the data I could in order to address a given problem.  I now have to add big data to the mix.  It may require some extra work in some cases, but the goal is still to extract meaningful insights from it. 

Prospector

Well, certainly big data does provide some challenges.  Some new tools and approaches are required to handle it.  But, many of the same underlying analytics principles still apply fully to big data.  For example...

Prospector

Bill, so nothing to fear, in essence? Or at least not if you're used to handling data and analyzing it?

Blogger

Yes.  As I saw the hype building around big data, I thought it was important to provide a more grounded perspective that focused on actually making big data work.

One major commonality is that we've always struggled with the data we have.  It is always a bit too big and tough to analyze.  From that perspective, big data isn't much different, just the next wave

Prospector

Bill, commonalities such as?

Blogger

And you wanted to write this book on big-data? 

Blogger

The last decade or so, I've been focused on very large companies and how they do analytics.  Big data is another wave of challenges for companies.  So, I wanted to tie the big data trend into some of the general analytics trends we've had the past decade or two.  There really are some commonalities

Prospector

Sure.  I am an analytics person by training.  Master's in statistics, etc.  I've been working with companies to help them analyze their data my whole career

Prospector

As we wait for everybody to join in, why don't you give us a quick background on yourself and the motivation for writing this book?

Blogger

Hi Bill -- sorry about that. My posts seem to be a bit delayed! Welcome everybody.

Blogger

Hi Bill. Are you there?

Blogger

Hi everyone.  I am here.

Prospector

could not wait :)

Prospector

Hi batye (and anybody else out there!). We'll be starting today's e-chat at the top of the hour. 

Blogger

Hello to all

Prospector

Looking forward to the discussion!

Prospector

Here's where we'll chat with Bill Franks, chief analytics officer at Teradata and author of the recently released book, Taming the Big Data Tidal Wave: Finding Opportunities in Huge Data Streams with Advanced Analytics. 

Blogger

