Text Mining Is on the Rise


A recent survey suggests text mining is on the increase among data mining professionals.

In the 4th Annual Data Miner Survey conducted by Rexer Analytics, more than half of data miners have added or are adding text mining to their arsenals.

“About a third of data miners currently incorporate text mining into their analyses, while another third plan to do so,” wrote Karl Rexer, company founder.

In another recent report, Seth Grimes, founder of business analytics strategy consulting firm Alta Plana, indicated text and content analytics grew by 25 percent last year and is expected to grow at a similar rate over the next two years.

Text analytics and text mining are similar, Grimes said in an email.

“Ninety-eight percent of users don't see a significant distinction between text mining and text analytics. Most users now identify the technologies under the ‘text analytics’ label, and it's mostly folks from a data-mining background who use ‘text mining,’ as an extension of data mining to text.”

As a result, the increase in text mining identified by Rexer and the overall growth of text analytics are part of the same trend, Grimes insists:

    So everywhere ‘text mining’ is used in that excerpt, folks other than data miners would have used ‘text analytics’ and would see an affirmation of the growth of text analytics.

    To be clear: the growing incorporation of text into data-mining efforts and the use of data mining techniques for analysis of text is, for 98 percent of the world including me, a statement about text analytics.

And there is more. In his report, Grimes concluded social media analytics were a major driver of the trend -- a conclusion that seems to be borne out in the data mining survey.

In the Rexer report, more than half of the respondents indicated the focus of their data mining efforts was either sentiment analysis, a main focus in social analytics, or social network analysis.

However, Marshall Sponder, author of Social Media Analytics: Effective Tools for Building, Interpreting, and Using Metrics and an AllAnalytics.com blogger, worries that many moving into this field may, at first, be disappointed.

“As 90 percent of social analytics data consists of unstructured information, most text analytic implementations prove to be unsatisfactory in addressing the increasingly sophisticated needs of social marketers,” Sponder says. “With structured data, text analytics are much more useful, and I believe this is what the studies cited here are based on.

“It could be that as more people evolve in their sophistication for analytics it will force the text analytics platforms to improve in their capabilities with unstructured data to match what they are able to achieve more easily with structured information.”

Are you using text analytics? If so, are you doing so for structured, unstructured, or both types of data?

Shawn Hessinger, Community Editor

Shawn Hessinger is a community manager, blogger, social media and tech enthusiast, journalist, and entrepreneur based in Northeastern Pennsylvania. He serves as community manager and blogger for BizSugar.com, a business news and information Website, and contributes regularly to the online business news source, Small Business Trends. He is the founder of PostRanger.com, an online content and media community, and has provided blogging and social media services and consulting for companies all over the world. He researches and writes on a variety of business, Internet-related, and other tech topics including business intelligence and analytics. He is also keenly interested in computer-aided data management as it relates to his various online ventures. A newspaper journalist with more than 11 years of experience as a reporter and then managing editor, Shawn began blogging in 2006 and now provides a variety of consulting and outsourcing services in Search Engine Optimization, Web development, and online marketing to companies large and small. He is a strong advocate for the use of BI and related computer data management in business decision making, whether using software as a service (SaaS), cloud, or other applications, and in the opportunity these technologies provide to transform small startups and larger established businesses alike.


Re: Text Mine Your Friends on Facebook
  • 12/12/2011 11:04:52 PM

Yea. Sounds like Google Analytics... but for your friends. Maybe next, they'll provide the ability to create goals on types of friends, ages, etc.

Re: Text Mine Your Friends on Facebook
  • 12/12/2011 10:43:32 PM

It was very interesting because I could see how many female friends I had vs. male friends, age groups, where I knew them from, etc. If I wanted, I could separate the groups and see trends on careers, etc. It's not totally accurate, but it was interesting.

Re: Text Mine Your Friends on Facebook
  • 12/12/2011 10:29:18 PM

Man, I'd love to get a tutorial on the algorithms behind that app (if it works as well as it's supposed to). And I'm not much of an algo kinda guy...

Text Mine Your Friends on Facebook
  • 12/2/2011 11:02:08 PM

Now you don't have to be a corporation to text mine.  You can now text mine your friends on Facebook using the 'Wisdom' application. 

Wisdom is described as an application that enhances Facebook by helping users make sense of the amount of data available through their social networks. At the MicroStrategy World 2011 user conference in Monte Carlo, Facebook CIO Tim Campos made a presentation discussing the impact of social media and how social is transforming industries today. Source: http://en.wikipedia.org/wiki/Wisdom_(application)

Now, I can even text mine myself and learn what I'm all about.

Re: Text mining
  • 11/19/2011 11:51:59 AM

Text analytics cannot be used for information retrieval in all cases, as available data is not typically organized. I hope text mining will develop soon enough to analyse unstructured datasets.

Re: Text mining
  • 11/17/2011 5:13:51 PM

Maryam,

Online video marketers are already savvy about use of document names, tags and captions as tools to help visitors find relevant video content via search. However, at least on the Web, there is no control over how such content is added, hence unstructured data. Others viewing the content can then add to that classification either by sharing it with their own tags and captions or by adding comments etc. which add to the context of the video. The degree to which text analytics is able to adequately read and correctly interpret this context will be the first serious step toward mining unstructured video data and, in a crude way, it is already taking place.

Text mining
  • 11/17/2011 10:32:45 AM

Shawn, hopefully as inroads are made in analyzing text, it will carry over to other unstructured data in the future, though I think video will ultimately be the hardest to put a tool around.

Re: Text this or text that
  • 11/15/2011 3:42:54 PM

Interesting infographic, Sergei, and a great example of how text analytics can be used to look at trends with added interpretation. I think many users understand the limits of text analytics but the possibilities, as you point out, should not be overlooked. For example, even if I use a tool to discover the number of mentions of my brand, product or trend, surely any tool that can help me understand the conversation going on around me about that topic is useful? Is it a perfect tool doing all the analysis for me and drawing all necessary insight? Certainly not. But it still may help me understand and distinguish the conversations I should be paying attention to.

Re: Text this or text that
  • 11/15/2011 12:23:31 PM

So does that imply we should be spending a lot longer on this stuff than we often are?  

I find the main reason people don't spend the time is that they expect (and have been sold text analytics on the promise of) a "quick answer" or a "cheaper alternative to market research" compared with setting up and paying for focus groups themselves.

Could it be, and I have suggested as much, that the marketing speak of the vendors, who are trying to sell their products and services, plays into the immediate needs of businesses that are clueless, in order to support unrealistic expectations?

Like you, I also noted that traditional market research has its own bias and, in a way, can be almost as "plastic" as social data. A recent conversation with a former worker for http://www.synovate.com/ gave me quite a view of the insides of such research - and from my point of view, it has as much bias as anything in social media - it's all in how you ask your questions.

In fact, depending on how questions are asked, you'll get a different answer - true of social media, true of market research.

So, maybe we do need to slow down, figure this all out, first, then figure out what it realistically costs to do the work and realistically, what it should be priced at, and then look for people mature enough to deliver it to.

At least, that's my current thinking today.

Re: Text this or text that
  • 11/15/2011 12:08:24 PM

@webmetricsguru: I hear you. In fact, in data analytics against structured data (my current focus), the same misbehavior is frequent but hidden in other places. By ignoring the implied bias in projecting from sample to full population, or by choosing a fast-but-wrong algorithm that destroys the intent of a statistical model, or by constraining the question to the point where it becomes a self-fulfilling prophecy, there are many ways to run an analytics project and get what you wanted if you don't publish your methods for peer review.  Speed of business often hides sloppiness in scientific methods.
