Credibility Analysis: True or False?

Social networks are prime data sources for machine learning algorithms. A popular mechanism for gaining value from social media data is sentiment analysis -- using text mining to determine how people feel about a given topic.

You can experiment with it yourself by visiting Sentiment140, where you can analyze and calculate sentiment scores for tweets. Just enter one of your favorite brands to see what sentiment scores come up. I tried “Red Sox” and got a 78% positive sentiment!

But sentiment alone is not the end of the story. Research has shown that most of the messages posted on Twitter are truthful, but the service is also used to spread misinformation and false rumors, often intentionally. For example, immediately after the 2010 earthquake in Chile, when information from official sources was scarce, several false rumors posted and spread on Twitter exacerbated the sense of chaos and insecurity in the local population. Further, misinformation from social networks channeled through sentiment analysis as a component of machine learning algorithms can damage the reliability of these methods. Enter credibility analysis.

Striving for credibility
Credibility analysis is a relatively new but increasingly important area of research. Its purpose is to determine the credibility of information spread through social networks, using signals embedded in the content. This technology works for Twitter now, but it can be extended to include other social networks, as well as online news in traditional media and blogs.

Some work in this field centers on detecting deceitful campaigns. The Truthy Project at Indiana University is a good example of this approach. Other work looks for factors that can be used to approximate perception of credibility.

A seminal research paper for credibility analysis is Credibility Ranking of Tweets during High Impact Events by A. Gupta and P. Kumaraguru. Their research shows that extraction of credible information from Twitter can be automated with a high degree of confidence.

Tuning the detector
To train their credibility classifier, Gupta and Kumaraguru used supervised learning. They recruited a group of human evaluators and set them to work on a training set of tweets; each evaluator slotted each tweet into one of four categories: (1) almost certainly true, (2) likely to be false, (3) almost certainly false, and (4) I can't decide. The key step was feature engineering: defining characteristics of tweets in each category. They identified four types of features:

  • Message-based. The length of a message, punctuation, URLs, user mentions, number and frequency of positive/negative sentiment words, hashtags, and retweet status.
  • User-based. Registration age, longevity of the account, number of followers, number of followees, and the number of tweets the user has authored in the past.
  • Topic-based. The proportion of tweets with hashtags and the proportion of positive and negative sentiments.
  • Propagation-based. Depth of the retweet propagation tree, and the number of initial tweets on a topic.
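As an illustration (not the authors' actual pipeline), the four feature families above might be extracted from a single tweet like this. All field names (`tweet["text"]`, `user["followers"]`, and so on) are hypothetical stand-ins for whatever the real data source provides:

```python
import re

def extract_features(tweet, user, topic_stats):
    """Sketch of message-, user-, topic-, and propagation-based features.
    Field names are illustrative, not from the Gupta/Kumaraguru paper."""
    text = tweet["text"]
    return {
        # Message-based features
        "length": len(text),
        "num_urls": len(re.findall(r"https?://\S+", text)),
        "num_mentions": text.count("@"),
        "num_hashtags": text.count("#"),
        "is_retweet": text.startswith("RT"),
        # User-based features
        "account_age_days": user["account_age_days"],
        "followers": user["followers"],
        "followees": user["followees"],
        "statuses_count": user["statuses_count"],
        # Topic-based features (aggregates over all tweets on the topic)
        "frac_with_hashtags": topic_stats["frac_with_hashtags"],
        "frac_positive": topic_stats["frac_positive"],
        # Propagation-based features
        "retweet_tree_depth": tweet["retweet_tree_depth"],
    }

features = extract_features(
    {"text": "RT Stay safe! http://example.com #earthquake",
     "retweet_tree_depth": 3},
    {"account_age_days": 800, "followers": 1200, "followees": 300,
     "statuses_count": 5400},
    {"frac_with_hashtags": 0.4, "frac_positive": 0.2},
)
```

Each tweet becomes one feature vector, and the human labels become the targets for supervised learning.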

The features generated from the above categories can add up to a large vector. It is therefore necessary to perform a best-feature selection step, which yields a smaller set. The sweet spot is around 15 features.
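A minimal sketch of that selection step, assuming a simple univariate filter: rank each feature by the absolute value of its correlation with the binary label and keep the top k (a real pipeline would likely use chi-squared or information-gain scoring instead):

```python
import statistics

def select_k_best(X, y, k):
    """Rank features by |Pearson correlation with the label| and keep
    the top k. A toy stand-in for a proper best-feature selection step."""
    scores = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mx, my = statistics.mean(col), statistics.mean(y)
        cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
        var_x = sum((a - mx) ** 2 for a in col)
        var_y = sum((b - my) ** 2 for b in y)
        # Constant columns carry no signal; score them zero
        r = cov / ((var_x * var_y) ** 0.5) if var_x and var_y else 0.0
        scores.append((abs(r), j))
    return sorted(j for _, j in sorted(scores, reverse=True)[:k])

# Toy data: features 0 and 1 track the label; feature 2 is constant
X = [[1, 0, 5], [0, 1, 5], [1, 0, 5], [0, 1, 5]]
y = [1, 0, 1, 0]
selected = select_k_best(X, y, k=2)
```

Here `selected` keeps the two informative columns and drops the constant one.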

Here is a sample decision tree built for credibility classification of tweets. Class A is 'true' and class B is 'false.'

Profile of truth
For automated credibility assessment, researchers use supervised machine learning methods such as support vector machines (SVM), decision trees, Bayesian networks, and so on. As it turns out, the J48 decision tree algorithm (the open-source Java implementation of C4.5 in the Weka data-mining suite) yields the best results, with an accuracy rate of 86% correctly classified instances.
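To make the train-and-score loop concrete, here is a deliberately tiny stand-in for J48: a one-level decision tree (a "stump") that searches every feature/threshold split for the one that maximizes training accuracy. J48 grows a full tree using information gain; this sketch only illustrates the shape of the procedure:

```python
def train_stump(X, y):
    """Learn a one-level decision tree: the single feature/threshold
    split (possibly inverted) with the best training accuracy.
    A toy stand-in for J48/C4.5, which grows a full tree."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            preds = [1 if row[j] > t else 0 for row in X]
            acc = sum(p == label for p, label in zip(preds, y)) / len(y)
            # Inverting every prediction gives accuracy 1 - acc
            for flip in (False, True):
                a = 1 - acc if flip else acc
                if best is None or a > best[0]:
                    best = (a, j, t, flip)
    acc, j, t, flip = best
    def predict(row):
        hit = row[j] > t
        return int(hit != flip)
    return predict, acc

# Toy data: a single feature that separates the classes at 20
X = [[10], [20], [30], [40]]
y = [0, 0, 1, 1]
predict, train_acc = train_stump(X, y)
```

On real labeled tweets, accuracy would of course be measured on held-out data, which is where figures like the 86% above come from.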

The work yielded these classification rules:

  • Tweets that do not include URLs tend to be related to non-credible information.
  • Tweets that contain question marks or smiling emoticons tend to be related to non-credible information.
  • Tweets that include negative sentiment terms are usually related to credible information.
  • A low percentage of tweets with positive sentiment terms tends to indicate non-credible information.
  • Non-credibility also occurs when a significant fraction of tweets mention a user.
  • Low-credibility information is mostly propagated by users who have not written many messages in the past. Users with a low number of followers tend toward non-credibility, as well.
  • Posts that are retweeted numerous times are related to credible information.
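A few of these rules could be encoded as a naive heuristic check. The thresholds and field names below are invented for illustration; the paper derives its rules from the trained tree rather than hand-set cutoffs:

```python
def non_credibility_signals(tweet, author):
    """Collect non-credibility flags per the rules above.
    Thresholds (50 tweets, 100 followers) are illustrative guesses."""
    flags = []
    if "http" not in tweet["text"]:
        flags.append("no URL")
    if "?" in tweet["text"] or ":)" in tweet["text"]:
        flags.append("question mark or smiley")
    if "@" in tweet["text"]:
        flags.append("mentions a user")
    if author["statuses_count"] < 50:
        flags.append("author rarely tweets")
    if author["followers"] < 100:
        flags.append("few followers")
    return flags

flags = non_credibility_signals(
    {"text": "Is the bridge really down??"},
    {"statuses_count": 12, "followers": 30},
)
```

A tweet tripping several flags at once would warrant a closer look, though only a trained classifier can weigh the signals properly.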

Members -- what do you think about the accuracy and importance of credibility analysis?

Daniel D. Gutierrez, Data Scientist

Daniel D. Gutierrez is a Data Scientist with Los Angeles-based Amulet Analytics, a service division of Amulet Development Corp. He's been involved with data science and big data since long before they came into vogue, so imagine his delight when the Harvard Business Review recently deemed "data scientist" the sexiest job of the 21st century. Previously, he taught computer science and database classes at UCLA Extension for over 15 years, and authored three computer industry books on database technology. He also served as technical editor, columnist, and writer at a major monthly computer industry publication for seven years. Follow his data science musings at @AMULETAnalytics.


Re: When to use
  • 2/19/2014 7:34:48 PM

The additional survey would poll the human classifiers as to "why" they responded the way they did. So instead of just a black or white response, there would be some nuanced reasoning behind the process. I'm not sure whether this data could circle back and become part of the original algorithm since the survey responses could be very subjective and would only shed light on the sentiment, but not credibility.

Re: When to use
  • 2/14/2014 9:44:05 AM

It's interesting how all these processes are trying to emulate or replicate human intuition. How would the survey responses be integrated into the machine learning? 

Re: When to use
  • 2/13/2014 7:18:02 PM

@Michael, I think that in this sense credibility analysis is somewhat like unsupervised machine learning in that the process is rather subjective. For example, in hierarchical clustering you can cut the tree at a different place and get very different clusters. The same is true with credibility analysis: come up with different rules and the credibility classification will be different. Remember, the training of the algorithm is based on humans, a la Mechanical Turk. It could prove useful to add a short survey for the human classifiers to fill out. This may shed light on the "why."

Re: When to use
  • 2/11/2014 10:50:10 AM

Right, kiran. My concern is that we'll end up with a constant tug-of-war between spammers and algorithmic learning, as we're currently dealing with in the email world. 

Re: When to use
  • 2/11/2014 1:47:03 AM

Interesting thing to ponder. What you said is true in the sense that someone can easily fake a truthful-looking tweet. However, since this is a research project and the techniques keep being evaluated and evolving, this will likely be addressed as well. 


Re: When to use
  • 2/10/2014 12:40:47 PM

Well, an algorithm that scores 86% out of the box is still leaving 14% on the table, as it were. But with tweaking and tuning, it may be nudged above 90%. Either way, I agree with you that nothing software-based is going to come to 100%.

Re: When to use
  • 2/10/2014 12:38:57 PM

Not necessarily a contest, tomsg! I'm just genuinely curious about what makes a tweet credible or not, and how a machine can figure that out without human intervention. I'll admit that in the early days of Twitter, I was 'fooled' by bots and spammers once or twice. Now, tweeters are guilty until proven innocent. 

Re: When to use
  • 2/10/2014 12:36:15 PM

Thanks, Daniel. So essentially we're looking at characteristics that correlate, but we don't necessarily understand the causes. Definitely worth delving deeper into, though!

Re: When to use
  • 2/10/2014 10:36:17 AM

Determining "truth" isn't going to be easy. We can probably get fairly close to it, but not 100%. Kind of like our court system: with enough questions and data presented as evidence, we can figure out (or hope to) what the truth is by the preponderance of the presented facts. We're not always right, as it turns out, but pretty close.

Re: When to use
  • 2/10/2014 9:27:10 AM

Really good questions, Michael. Better than any I thought of.
