Those who can, do; those who can't, get certified.
Regular AllAnalytics.com readers might recognize this phrasing from our current Point/Counterpoint debate blog on the value of analytics-related certification. Scott Larsen, an independent consultant, exhorted readers, "Go and do something valuable instead of studying for a certification exam." Harkening back to his days as a data analyst at Google, he explained:
When I participated in hiring committees at Google, lots of certifications was generally considered a negative signal. Usually this came from a feeling of mal-prioritized time -- is there nothing better the job candidate could have done with his or her time? Why not accomplish something? You learn so much more by actually getting dirty doing things than you do studying for a test -- show us where you got dirty and what you learned and what you contributed.
Larsen's advice smacked me upside the head as I read about two data-mining competitions GE recently launched on Kaggle. Could participation in such competitions end, or at least diminish, reliance on certifications as a measure of knowledge? I like this idea -- a lot.
As we've previously explained, Kaggle is a data science marketplace that brings together companies or organizations with business challenges and folks with the desire to tackle them. These challenges are always about bringing out-of-the-box thinking to bear, whether to solve society's thorniest issues, address major industry gotchas, or just have a bit of fun with numbers.
In one new challenge, for example, GE aims sky high -- literally. In tandem with Alaska Airlines, it launched the Flight Quest challenge to address what it says is a $22 billion-a-year problem airlines face in managing efficiency. As a GE Aviation director explains in the video below, the goal is to develop an algorithm that delivers real-time flight profiles pilots can use for en-route decision-making. When pilots have such insight at their fingertips, they can make flights more efficient and reliably on time, or at least that's the stated intent.
The second of GE's latest quests deals with a more down-to-earth concern: healthcare. In its Health Quest, GE is working in partnership with Ochsner Health System to "promote an improved health care system experience for patient and family." But this challenge is about operational improvement, not medical care. The aim is to figure out ways to reduce the "$100 billion wasted annually in healthcare inefficiencies, distracting facilities from their primary focus -- patient care," GE said on the challenge site.
These are but two of many examples of the data-mining competitions going on right now on Kaggle, not to mention other venues. I call them out for their newness -- GE launched each within the last week -- and not because there's anything especially compelling about putting your mind to work in solving flight or healthcare inefficiencies. Neither is a bad goal, to be sure, but my point is that either could provide a great showcase for your talent. Even if you don't win a competition, being able to play around with the big-data sets available to contestants could be well worth the effort.
Next time you're tempted to sign up for a certification class, perhaps you ought to first take a gander at Kaggle. It'll make a great addition to your résumé -- and, who knows, you just might end up with some prize money, too.
Do you have any experience with data-mining competitions, of any size or scope? Share below.
Kq writes: Such competitions surely can't hurt. Of course, in reality they're a clever marketing avenue for the sponsor. Getting the company name out there for free in press releases is great advertising.
I'd expect that just about anything a big company does these days has marketing in mind, at least partially, to help justify the expense of the public effort. However, given the examples of competitions sponsored by SAS and others mentioned in this thread, I'd presume that the sponsors expect some kind of valuable output from the competition itself.
... Which leads to my next question: I'd wonder if there are examples of actual analytics products now deployed, addressing real-world data challenges, that have been developed through these competitions.
Such competitions surely can't hurt. Of course, in reality they're a clever marketing avenue for the sponsor. Getting the company name out there for free in press releases is great advertising. Whether competitions trump certifications is another matter. Both should be advantageous, it would seem.
That's a good question; I don't know. The Netflix and Heritage Health contests were corporate-sponsored, but open to all interested parties, academic, corporate, or (moonlighting?) individuals. Because of their length, those two marathons probably did not get many "student teams" participating as a formal part of their classwork, as some of the shorter-duration contests do.
@Doug_Dame, your point is well taken. Do you think it'd be fair to call it the largest data mining competition in academia (vs. the corporate world)? I don't know the answer myself, but think there's a distinction worth noting here.
Although the Data Mining Cup is prestigious and senior in tenure, I don't think it's accurate to refer to it as "the world's largest data mining competition."
The Netflix competition lasted almost 3 years, at the 8-month mark had 20,000 teams registered of which 2,000+ had made entries, and paid out prizes in excess of $1 million US.
The ongoing Heritage Health Prize, hosted on Kaggle, is a two-year contest with a maximum payout of $3,230,000 and a minimum of $730,000 if the grand target is not achieved. It has more than 1,400 teams registered and eligible to compete in the current final segment.
@Beth Kaggle does feature most interesting competitions to get smart people involved in solving data problems. I discovered that NASA does it, too. In October it announced the Launch Big Data Challenge Series for U.S. Government Agencies:
The Big Data Challenge series will apply the process of open innovation to conceptualizing new and novel approaches to using "big data" information sets from various U.S. government agencies. This data comes from the fields of health, energy and Earth science. Competitors will be tasked with imagining analytical techniques and software tools that use big data from discrete government information domains. They will need to describe how the data may be shared as universal, cross-agency solutions that transcend the limitations of individual agencies.