Some plugs: Find me @doug_laney on Twitter. Join our #GartnerChat tweet-up on analytics, big data, collaborative BI tomorrow at 1pm ET. And don't forget about our BI/Analytics summit in LA Apr 2-4! See you there! Cheers!!
Yes, the "data scientist" title may be in vogue and upset some folks' sensibilities. But the reality is that companies actually are hiring and deploying talent to take them beyond basic BI. Call it what you will.
Doug - I can understand the question but field knowledge and company knowledge help me understand the data. I work a lot with federal data sets on science education and the science workforce, and you have to understand how the data was collected to know the strengths/weaknesses of a variable, even over time. It takes time and many conversations to get there.
In general, the best answer for how to define and develop the role of the data scientist is - 'it depends'. Whether the organization is private sector, public sector or academia; large firm or small firm, long time in the industry or a brand new competitor; whether the investigative results will impact profits or public policy, etc. - all these factors impact how the DS role is defined and deployed.
As to where to sit a DS-- If working on optimizing a particular business problem, then very close to the business (with access to IT/data). And if working more on broad innovation/discovery, then perhaps in a specialized analytics group. My first summer job in IT was as a summer SAS programmer at Abbott Labs in the Biometrics Dept--servicing the deep analytic needs of the entire company. It's a good org model to think about for any kind of business today.
That is where training in methods comes in - whether to approach an investigative opportunity analytically (disaggregating the data) or synthetically (aggregating the data). It is the difference between hypothesis testing and data mining, but it all relates to the approach and which statistical tools are best to do the job. Subject matter expertise is great, but if you are just starting out, you can pick up the subject matter knowledge as you go along.
Louis - I think you are correct and that then brings up the issue of knowledge of the business/field. Would I be effective if I was not familiar with local customs - standard data pieces, views, etc.? But I think what makes a good DS (and BA in that case) is learning those quickly and adapting.
Isn't it also part of the data scientist's job to advise on data to collect? In program evaluation, it is the evaluability assessment step -- are you ready to be evaluated - are the correct data points in place? If not, do we step in and tell the business folks what to do? So far, this all sounds post-hoc to me.
There's some thinking among my Gartner colleagues embroiled in this debate that the DS perhaps focuses more on causation and the BI analyst more on correlation. I think that's a fair simplified argument.
Good Q Shawn. How industry/function knowledgeable does a DS have to be? Are the principles and therefore skills transferable across domains? To some degree, yes, I think. A good DS could probably be a good DS in any industry and with almost any problem.
Beth, briefly, data modeling is about logically organizing and integrating data to optimize access. Business modeling is about understanding the way processes work and are affected/enabled by data, in order to optimize performance. But this crowd probably already gets that.
Doug, here on AllAnalytics we often circle around the data vs. intuition aspect of building a fact-based decision-making business culture -- and I think we're at it again when you talk CFO vs. marketing. The message being, you can have data scientists galore, but we still need the business savvy to succeed.
Good point Pierre, universities often focus more on theory; real-world application has to be stressed, but yes, the skills they garner will for the most part be outdated by the time they enter the field. But I guess it is a start and we have to start somewhere.
Gary, Yes my friend and former colleague Tom Davenport is right on the...er...money. CFOs have typically led when it comes to having influence and certainly basic quant skills. But I think those that come from marketing can be (and often are) sharper quantitatively. The issues are more complex there.
Doug, any thoughts on Gary's question, "Doug ... my question about if the CFO function might drive the adoption rate of analytics is due in part because Tom Davenport gave a keynote presentation at a major CFO conference speculating that the CFO function has both influence plus quantitative skills. Your thoughts?"
I agree with Beverly. I am a nuclear physicist by training but worked on the computing analysis side. I needed a little formal computer science training but that comes easier than training the science/analysis/statistics skills.
Unfortunately, I don't think there's a silver bullet for cultivating the skills. Certain BI analysts will rise to develop real stats & business modeling skills. And certain business people will rise to develop the data-management part of the equation. But implementing a corp culture that values data, facts, analysis is a fertile ground.
Doug ... my question about if the CFO function might drive the adoption rate of analytics is due in part because Tom Davenport gave a keynote presentation at a major CFO conference speculating that the CFO function has both influence plus quantitative skills. Your thoughts?
Doug, I liked that you framed "business model problems" instead of the data focus - it raises the question of how academia keeps up with changing technology - when one person enters a program, the needs may change by graduation
It would be nice to translate this discussion into organizational design - the definitions are interesting, but what does it mean from a practical standpoint? Any best practice in terms of how organizations are structured to most effectively address everything that has to happen for successful analytics?
The professional skills, knowledge and abilities are often derivatives of college training in social, behavioral and physical sciences. The college environment provides a foundation for systematically deploying intellectual curiosity in terms of methods and statistics. From there, industry-specific experience, mentoring and post-graduate studies do the rest. Folks of this ilk are just like camera film - they just need time and exposure to develop.
Trainer = Analyst, knows how to put the subject to work.
Biologist = Data scientist, knows the ins and outs of the subject. The other two professions use the scientist's expertise to accomplish their missions. "Vets" don't need to know how the animal's organs developed or what other uses there may be for them, they just need to know how to keep them working. "Trainers" care more about HOW a subject will behave given a certain stimulus than WHY it does.
So I ran a little "data scientist" experiment myself. I sucked several dozen job descriptions of "data scientist" and "bi analyst" into a wordcloud generator and did some analysis. Have a look: https://docs.google.com/present/edit?id=0Aa7E6TDaLOQNZGRzejJ6ZHhfMTg1ZGo3N3YzY2M
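For anyone who wants to try something similar without a wordcloud generator, here is a minimal Python sketch of the same idea: count the words in two sets of job descriptions and see which terms are distinctive to each. This is not the method behind the slides linked above. The function names, the stop-word list and the sample text are all assumptions made up for illustration, and the sample strings are placeholders, not real job postings.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "with", "on", "or"}


def word_counts(descriptions):
    """Count non-stop-word tokens across a list of job-description strings."""
    counts = Counter()
    for text in descriptions:
        tokens = re.findall(r"[a-z]+", text.lower())
        counts.update(t for t in tokens if t not in STOP_WORDS)
    return counts


def distinctive_terms(counts_a, counts_b, top_n=5):
    """Return terms whose count in counts_a most exceeds their count in counts_b."""
    diff = {word: count - counts_b.get(word, 0) for word, count in counts_a.items()}
    # Sort by largest difference first, then alphabetically for stable ties.
    ranked = sorted(diff.items(), key=lambda kv: (-kv[1], kv[0]))
    return [word for word, delta in ranked[:top_n] if delta > 0]


# Placeholder text invented for illustration only (not real job postings).
ds_ads = [
    "Build statistical models and test hypotheses in R",
    "Develop machine learning models on large datasets",
]
bi_ads = [
    "Build dashboards and reports for business users",
    "Maintain reports and dashboards in the BI tool",
]

ds_counts = word_counts(ds_ads)
bi_counts = word_counts(bi_ads)
print("More typical of data scientist ads:", distinctive_terms(ds_counts, bi_counts))
print("More typical of BI analyst ads:", distinctive_terms(bi_counts, ds_counts))
```

Raw count differences are a crude measure. With several dozen real descriptions per title you would probably want to normalize by the number of postings before comparing.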
Anyone who can model business problems, gather and prep data, develop and test hypotheses, drive SAS, SPSS, R, etc. and persuasively articulate findings should feel free to call him/herself a data scientist.
smkinoshita -- I'm on the same page with you. But I also agree with those who call for more of a collaborative view of data science. Yes, have your formal -- i.e., titled -- data scientist, but that person should be working closely with the data managers, IT, the business, etc.
Here we go. Some Gartner role defs. Feel free to use them (or not).
Information/Data Scientist: Responsible for mining, modeling, interpreting, blending and extracting information from large datasets and then presenting something of use to non-data experts. These experts combine expertise in mathematics-based semantics in computer science with knowledge of the physics of digital systems.
Business Analyst: Interpreter of business processes, workflows and requirements into functional specifications for development or application evaluation that will be used by various development teams, process owners and practice areas (such as business intelligence, business applications and information management).
For me, the role of a Data Scientist goes beyond the typical role of a BI analyst. A Data Scientist must not only know the common data mining methods but also be capable of creating new ones to tackle new kinds of data, whether they come from the business area or some other radical area.
I think there is a statistical analysis component to it that transcends the tech knowledge. You have to know about data analysis techniques and statistical rules to interpret data correctly. No data set is complete.
From an epistemological perspective, a scientist is a knowledge seeker who uses methods and tools to make an objective assessment. Is this consistent with your concept of a data scientist? And if so, do you see the data scientist as being a describer of data only or should s/he seek to establish normative expectations -what the data should be?
Doug, my question is about organizational change management. Which function might drive the adoption rate of business analytics? Is it possible the CFO's function will as they grow stronger in being a strategic advisor ... KPIs, customer profitability analysis, etc.?
I think it's also interesting to consider the other roles that support the data scientist - since data scientist expertise is in short supply, I think we really need to promote the work that IT and business can do to offload the data scientist work - things like IT really understanding the data requirements for analytics so they can help more effectively with the data preparation. What do you guys think about that?
To me, "Data scientist" sounds like someone experienced in measuring and interpreting massive quantities of data as well as knowing how to devise ways to measure things previously thought to be too difficult to measure.
Here's how you described it Doug. "The role of the data scientist -- Defining the role. Is it hype? How is it different than a BI tool jock? Where to find and how to cultivate talent? How industry knowledgeable do they need to be? Where to place them in the org, How to get the org to understand what they can bring? etc."
I think it's funny how people like to create titles which will baffle others into what exactly they do. I mean, from a marketing standpoint it can be helpful if people are interested enough to inquire what it is and thus allow for a pitch, but other people will dismiss the person as a charlatan.
Doug, I don't mean to get too far ahead of the discussion, but I recently saw a YouTube clip on what it took to become a data scientist, and this person, who was working in the field as one, basically lucked into an opportunity to consult for a company in a technical capacity, and it eventually evolved into a Data Science position. Do you see that being the case for many until some formal structure is in place?
Please join us for a conversation with Doug Laney, VP of Research for Business Analytics and Performance Management at Gartner, as we discuss defining the above role. How does this role differ from others in the business intelligence field? Where do companies find such individuals, how much do they need to know about the industry as a whole, and where do they fit into a company's organizational chart? Doug will be guiding our discussion, posing some questions and sharing some research. Hope you can all join us!