Text analytics software is rapidly becoming mainstream. Why? Just think about all of the data your company creates internally and all of the data available to you from external sources. The large majority of this data is unstructured text. More and more, companies are realizing that they need a way to analyze and make use of this data so they can make better decisions and gain competitive advantage.
What is text analytics?
By my definition, text analytics is the process of analyzing unstructured text, extracting relevant information, and transforming it into structured information that can be leveraged in various ways. In general, text analytics solutions use a combination of statistical and natural language processing (NLP) techniques to extract information from unstructured data. NLP, a broad and complex field that has developed over recent decades, generally makes use of linguistic concepts such as grammatical structures and parts of speech to derive meaning from text. You can use text analytics to extract terms (i.e., keywords), entities (persons, places, organizations, and so on), facts, events, concepts (a topic or idea), and even sentiment from unstructured text.
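To make the idea of turning unstructured text into structured information concrete, here is a toy sketch in Python. The lexicons and the `extract` function are invented for illustration only; real text analytics products use trained statistical and NLP models, not hand-built word lists.

```python
import re

# Toy lexicons -- invented for illustration. Real systems use trained
# models and far richer linguistic resources than a hand-made dictionary.
ENTITY_LEXICON = {"Acme": "ORGANIZATION", "Boston": "PLACE", "Smith": "PERSON"}
POSITIVE_WORDS = {"great", "love", "excellent"}
NEGATIVE_WORDS = {"broken", "unhappy", "terrible"}

def extract(text):
    """Turn unstructured text into a small structured record:
    entities found, plus an overall sentiment label."""
    tokens = re.findall(r"[A-Za-z']+", text)
    entities = [(t, ENTITY_LEXICON[t]) for t in tokens if t in ENTITY_LEXICON]
    # Crude sentiment: positive words add one, negative words subtract one.
    score = sum((t.lower() in POSITIVE_WORDS) - (t.lower() in NEGATIVE_WORDS)
                for t in tokens)
    sentiment = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return {"entities": entities, "sentiment": sentiment}

record = extract("Smith in Boston says the Acme blender is broken and terrible.")
```

The output record, with its typed entities and sentiment label, is exactly the kind of structured information that can then be combined with a company's other data.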
A range of industries use text analytics for a variety of purposes. Here are five use case examples:
Voice of the customer (VOC), or customer experience management and social media analytics: This is often the first use case that comes to mind with regard to text analytics. Companies primarily use VOC applications to determine what a customer is saying about, or looking for in, a product or service. Data for this kind of application comes from internal sources such as call center logs, emails, surveys, and other communications with the company, as well as from external sources such as social media (blogs, tweets, forum posts, newsfeeds, and so on). Generally, companies use this kind of analysis to understand problems with their brands and what they could be doing differently, which can help reduce attrition. Companies can also use VOC for competitive analysis.
Fraud: In this use case, companies use claim notes to help identify potentially fraudulent activity. For example, in a worker’s compensation claim, text analytics can help extract information about the potential claimant from the record. This might be information about the kind of worker the person is, whether a prior event happened, and so on. The company would combine this information with structured data to identify patterns of potential fraudulent claims. Then, as new claims come into the insurance company as part of its business process, they get scored. Those with a high probability of fraud flow to a special unit for further investigation.
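The scoring step described above can be sketched as follows. The indicator phrases, weights, and threshold here are invented for illustration; a real insurer would train a model on historical claims rather than hand-pick weights.

```python
# Hypothetical fraud-indicator phrases and weights mined from claim notes --
# invented for illustration; a real model is trained on historical claims.
TEXT_FLAGS = {"prior injury": 2.0, "no witness": 1.5, "cash payment": 1.0}

def score_claim(claim_notes, days_to_report):
    """Combine signals extracted from unstructured claim notes with
    structured data (here, how late the claim was reported)."""
    text_score = sum(weight for phrase, weight in TEXT_FLAGS.items()
                     if phrase in claim_notes.lower())
    structured_score = 1.0 if days_to_report > 30 else 0.0
    return text_score + structured_score

def route_claim(claim_notes, days_to_report, threshold=2.5):
    """Claims scoring above the threshold flow to a special
    investigation unit; the rest proceed through normal processing."""
    if score_claim(claim_notes, days_to_report) >= threshold:
        return "investigate"
    return "process"
```

The key idea is the combination: neither the text-derived flags nor the structured fields alone tell the whole story, but together they surface patterns worth a closer look.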
Manufacturing or warranty analysis: In this use case, companies use text analytics to examine the text from warranty claims, dealer technician lines, repair orders, customer relations notes, and other potential sources, extracting certain entities or concepts (such as an engine or a particular part). They can then analyze how these entities cluster, whether the clusters are increasing in size, and whether that growth is a cause for concern, for example.
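The "is this cluster growing?" check can be illustrated with a minimal sketch. The component names, mention counts, and growth factor are invented for illustration; a real system would cluster extracted entities statistically rather than tally fixed names.

```python
def growing_components(monthly_mentions, factor=2.0):
    """Flag components whose claim mentions grew by `factor` or more
    between the first and last month observed."""
    flagged = []
    for part, counts in monthly_mentions.items():
        if counts[0] > 0 and counts[-1] / counts[0] >= factor:
            flagged.append(part)
    return flagged

# Mentions of each extracted entity per month (toy numbers): the engine
# cluster is growing, while brake pad mentions are flat-to-declining.
mentions = {"engine": [3, 5, 9], "brake pad": [4, 4, 3]}
```

A component whose mention count triples over three months is the kind of early warning signal a manufacturer would want to investigate before it becomes a recall.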
Customer service routing: In this use case, companies use text analytics to route requests to the appropriate customer service representatives. For example, say you've written an email to a company from its website. You might have a question about a product, or you could be an existing customer with a complaint. The company can use text analytics to route that email intelligently to the appropriate person.
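A minimal version of that intelligent routing might look like this. The department names and keyword lists are invented for illustration; production routing systems typically use trained classifiers rather than keyword matching.

```python
# Hypothetical department keyword lists -- invented for illustration.
DEPARTMENTS = {
    "billing": {"invoice", "charge", "refund"},
    "support": {"broken", "error", "crash"},
    "sales": {"pricing", "upgrade", "demo"},
}

def route_email(body, default="general"):
    """Route an email to the department whose keywords match most often;
    fall back to a general queue when nothing matches."""
    words = set(body.lower().split())
    best, best_hits = default, 0
    for dept, keywords in DEPARTMENTS.items():
        hits = len(words & keywords)
        if hits > best_hits:
            best, best_hits = dept, hits
    return best
```

Even this crude version shows the payoff: the email about a duplicate charge never sits in the general queue waiting for a human to triage it.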
Lead generation: In this use case, a person might tweet that he is unhappy with a certain product or service. Text analytics can pick this up and then a sales representative gets alerted (even one from another company) to contact that person to try making a sale.
I could cite myriad other examples, of course. What's interesting and new is that these use cases increasingly involve some form of real-time analysis. I believe we're going to see more examples of real-time analysis of this unstructured big data, a topic I'll be writing about in future blog posts.
Join Fern Halper in an AllAnalytics.com Webinar, "Text Analytics Best-Practices: How to Extract Value From Your Unstructured Data," next Thursday, April 19, at 2:00 p.m. ET. Register now for the chance to qualify for a $10 Starbucks card!
Wow, you had some really interesting words in that post. I will have to research them in my downtime. But my guess is that there are specific engines that specialize in breaking down the English language, both traditional and semantic, to get a better understanding of potential text phrases.
Hi Fern, and thank you for explaining text analytics. Although the concept might seem simple, it might almost be too simple, leading to confusion, so I thank you for helping us all keep the definition at least in mind and hopefully in perspective.
I am still skeptical, though, of how well text analytics works when you consider semantics and tone and all the rest of the little nuances of language, but I like how text analytics is working from a straightforward approach, as is practiced in customer service email routing, for instance.
And I hope to be able to make your Webinar as well! :)
Current trends in text analytics are toward building domain-specific systems based on ontologies and specialized lexicons (or vocabularies). This way, the system can capture a range of commonsense concepts and relations related to the domain. Word sense disambiguation techniques are used to resolve polysemy (abbreviations, lexical ambiguities, and so on). But this is still a work in progress.
I'm currently using GATE, ABNER, MontyLingua, NLTK, Stanford NLP Software, and WEKA in a decision-making system for biomedical research texts (articles). The task consists of building a model to classify sentences or fragments of sentences (clauses, named entities, and so on) into a number of rhetorical categories. The motivation for the study is that most text mining systems use sentences or smaller semantic sequences as independent units for information extraction. The tools listed above are used in the linguistic preprocessing of the text (tokenization, part-of-speech tagging, parsing, stemming, lemmatization, and so on) and in classification (mostly with WEKA, using Naive Bayes, Support Vector Machines, and the like, plus data visualization).
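A stripped-down version of that kind of pipeline, with regex tokenization, crude suffix stemming, and a tiny Naive Bayes classifier, can be sketched in pure Python. The rhetorical categories and training sentences below are invented for illustration; the commenter's actual system uses the dedicated tools listed above.

```python
import math
import re
from collections import Counter, defaultdict

def preprocess(sentence):
    """Tokenize and crudely stem -- a stand-in for the real
    tokenization/stemming/lemmatization steps in an NLP toolkit."""
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return [re.sub(r"(ing|ed|s)$", "", t) for t in tokens]

class NaiveBayes:
    """Multinomial Naive Bayes with add-one smoothing."""

    def fit(self, sentences, labels):
        self.word_counts = defaultdict(Counter)
        self.label_counts = Counter(labels)
        for sent, label in zip(sentences, labels):
            self.word_counts[label].update(preprocess(sent))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, sentence):
        def log_prob(label):
            counts = self.word_counts[label]
            total = sum(counts.values())
            lp = math.log(self.label_counts[label] / sum(self.label_counts.values()))
            for w in preprocess(sentence):
                lp += math.log((counts[w] + 1) / (total + len(self.vocab)))
            return lp
        return max(self.label_counts, key=log_prob)

# Toy rhetorical categories and training sentences (invented).
train = [("We investigated the effect of the drug", "method"),
         ("The results show a significant decrease", "result"),
         ("We conclude the treatment is effective", "conclusion")]
model = NaiveBayes().fit([s for s, _ in train], [l for _, l in train])
```

With real data you would train on thousands of labeled sentences, but the shape of the pipeline, preprocessing followed by a probabilistic classifier, is the same one WEKA and the other tools implement at scale.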