Text analytics, my research shows, has become a "have to have" technology for a majority of companies that use it.
So I've learned from the many companies I've talked to as I prepare Hurwitz & Associates' Victory Index for Text Analytics, a tool that assesses not just the technical capability of the technology but its ability to provide tangible value to the business (look for the results of the Victory Index in about a month). However, they also said challenges abound -- and those don't necessarily involve the text analytics software itself.
Here’s a quick look at five challenges users said they most often run into with text analytics.
Data access. Often, companies will want to use more than one source of unstructured data for analysis, but gaining access to this data can be challenging. This is about more than getting hold of the Twitter firehose for customer intelligence analysis. It is about the right to use internal or cross-company data stores, like institutional document repositories, in the face of corporate politics or delays due to operational procedures, like making formal requests for the data from IT.
Managing expectations. In some organizations, text analytics can leave management with the idea that you can simply plug in the software, feed it text data, and have it automatically give you the answers. While you may be able to get some high-level answers this way using tools tuned for social media, the reality is that most of the time you’ll have to interact with the software, especially when it comes to building a taxonomy (see No. 5). Text analytics tends to be more semi-automatic than automatic.
Trusting the data. On the flip side of managing inflated expectations is the need to establish trust in the data. This challenge can manifest itself in terms of data quality and as a cultural issue.
Determining data quality for unstructured data is hard for many reasons, including the fact that words have multiple meanings and unstructured text can be noisy, with typos, colloquialisms, and so on. Often, with text data you’re going to get about 70 percent to 80 percent accuracy. That can be a challenge for some people.
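To make an accuracy figure like that concrete, here is a minimal sketch of how it might be measured: compare the tool's labels against a hand-labeled sample and count the matches. The function name and both label lists are made up for illustration; they are not results from any real tool or from the companies surveyed.

```python
def accuracy(predicted, actual):
    """Share of items where the tool's label matches the human label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("need two non-empty label lists of equal length")
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


# Invented sample: sentiment labels for ten call center notes.
human_labels = ["neg", "neg", "pos", "neu", "pos", "neg", "pos", "neu", "neg", "pos"]
tool_labels = ["neg", "pos", "pos", "neu", "pos", "neg", "pos", "neg", "neg", "pos"]

print(f"{accuracy(tool_labels, human_labels):.0%}")  # two of ten disagree
```

Checking a sample this way gives skeptical colleagues something specific to look at: the individual notes where the tool and the human reviewer disagreed.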
Using text analytics in decision-making also requires a cultural change, which can be difficult. For example, in organizations that are used to classifying content manually, moving to a semi-automated approach can be a big shift and people might not believe the classification schemes. They'll be skeptical -- sometimes because of the way the analysis is presented. For instance, structured data might indicate that people are buying a wireless company's phones. Because sales are up, executives might not believe that the unstructured data in call center notes or on the Web shows negative sentiment about the phones -- that they're buying them only because their choices are limited. You need to be able to tell the story and make people understand the kind of analysis you can do with this new source of data. This can take time.
Building the skills. The skills you’re going to need to analyze text will vary depending on the problem you’re trying to solve. Some people claim that you need to understand your industry. Others say being analytical is enough. If the goal is using a social media analytics tool to do some high-level analysis on brand reputation, you'd likely need only a small amount of training. But if you’re trying to combine structured and unstructured data to increase the lift of a predictive model, then you'll need deeper skills development. Regardless of the issue you’re looking to address, text analytics involves dealing with a new form of data and there is going to be a learning curve involved in knowing what to do and how to apply it to the business. You’ll also have to know how to ask the right kind of questions. This is a learning process.
Taxonomy issues. A taxonomy is a method for organizing information, or sometimes categories, into hierarchical relationships. Because a taxonomy defines the relationships between the terms a company uses, it makes it easier to find and then analyze text. Some organizations hire people skilled in taxonomy development to build it. Some vendors provide out-of-the-box taxonomies for certain industries. Even so, you’re going to have to deal with the vagaries of the terminology in your industry, and there is going to be upfront work to specify this terminology. Many end users feel that the necessary taxonomy development, or refining their categories (if that is the way you’re ultimately building a taxonomy), is difficult. It can take more than one try. Companies need to plan for this.
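As a rough illustration of what a taxonomy looks like in practice, the sketch below stores a two-level hierarchy (category, subcategory, terms) and tags a piece of text with every branch whose terms appear in it. The categories, terms, and the `classify` function are all hypothetical, and simple substring matching stands in for whatever matching a real text analytics product would do.

```python
# Hypothetical taxonomy for a wireless carrier: category -> subcategory -> terms.
TAXONOMY = {
    "device": {
        "battery": ["battery", "charge", "charging"],
        "screen": ["screen", "display", "cracked"],
    },
    "service": {
        "coverage": ["signal", "coverage", "dropped call"],
        "billing": ["bill", "overcharged", "refund"],
    },
}


def classify(text, taxonomy=TAXONOMY):
    """Return sorted (category, subcategory) pairs whose terms appear in text."""
    lowered = text.lower()
    hits = set()
    for category, subcategories in taxonomy.items():
        for subcategory, terms in subcategories.items():
            # Naive substring match: an assumption made to keep the sketch short.
            if any(term in lowered for term in terms):
                hits.add((category, subcategory))
    return sorted(hits)


note = "Customer says the battery dies by noon and she was overcharged on her last bill."
print(classify(note))
```

Note that the naive matching also finds "charge" inside "overcharged", so a note that is only about billing would still be tagged under battery. That kind of false hit is a small example of why taxonomy work usually takes more than one pass.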
So remember, new ways of doing things generally involve challenges, and text analytics is no exception. Overcoming these challenges will require time, training, and persistence.
What are your biggest challenges (or what do you imagine them to be) regarding a text analytics implementation? Share below.
Great post! A timely topic, becoming even timelier. In my view, the taxonomy is the biggest of the five you've listed. There are so many variations on ways to create a taxonomy, because the data are unstructured, so there's little embedded guidance as to how to structure your taxonomy. This, I believe, is why organizations absolutely must commit to documentation, standards, and repeatable measures in their text analytics activities. Otherwise it's going to be very difficult to meet the other challenges (such as believing/trusting the data and results).
Interesting. Text Analytics software deployed for ROI should compete with other methods of solving the problem. Organizations can select Text Analytics if it produces a better result, or solves the problem better, than other analytics methods.
Yes, it's true that turning the soft benefit of text analytics into an ROI is difficult, as Beth mentioned. Or that today, most text analytics projects do not have ROI as Fern says, but in at least 2 of the 5 reasons Fern lists in her post http://www.allanalytics.com/author.asp?section_id=2013&doc_id=242200& (customer service routing or deflection and lead generation), it is pretty common to use ROI models to evaluate the investment in text analytics. Obviously, this is more difficult in voice of the customer or customer experience optimization projects. This is unfortunate because these are probably the initiatives that bring the highest ROI to organizations, as usually they contribute to both revenue growth and cost reduction. However, I think this is more because predictive models accounting for "soft" variables are still in development. I think the analogy with weather forecasting that I wrote about here http://www.expertsystem.net/blog/?p=266 is valid. When these new models are developed, it will be a great day for text analytics because adoption will grow significantly. We'll see if I'm right :). In any case, thanks to both of you for the high quality work you are doing on this site.
Thanks @lscagliarini for your kind words about the post. You bring up a good point about ROI. Interestingly, a large number of the companies I speak to don't have to do an ROI analysis for their text analytics software.
@lscagliarini, first off, thanks for jumping onto the AllAnalytics.com message boards. Welcome, and I'm glad you found value in Fern's post! You raise a great point, as well as a potential difficulty, I'd say. The soft benefits associated with being able to analyze text quickly and efficiently are fantastic. But wouldn't you say they're awfully difficult to work into a formal ROI statement?
Perfect analysis. I would add, from my experience, that ROI calculation for text analytics projects, especially the most strategic ones, often requires accounting for the cost of ignorance (i.e., loss of potential revenue or cost savings) that cannot be easily calculated using traditional data-based financial models. What are the costs of not intercepting a complaint about your product in time, or finding out too late about an existing patent in an area related to your R&D project, or not being able to immediately identify an employee with the right skill set to address an urgent organizational issue? This is not so different from what was happening to supply chains 15-20 years ago, before the right models to deal with their complexity were developed. In any case, this is a great post, one that I'm sure to refer to again.
I guess that would depend on a) what tool was used and how finely tuned the sentiment was and b) who the analyst doing the analysis was and whether they tuned the sentiment. I would be suspicious of anyone making the claim that mentions somehow equate to who won a debate in any event - I'd have to see the other part of the analysis! Goes to show you how someone can use an analysis and say what they want from it.
Alexis, right -- lots of front end work required to make sure you're working with quality data. In a way, that's no different with text analytics than any other sort of analytics, at a basic level, at least.
@Noreen, for some reason I'm having a hard time reconciling the idea that a grocer that lets a store become a "nasty, dirty place" would be bothering with text analytics. That seems a disconnect to me -- why invest in measuring customer sentiment using advanced analytics tools if you can't even bother to pick up a broom and a mop?