That's a question Adam Ramey, a political science professor at New York University - Abu Dhabi, is trying to help answer. His survey research will "help scholars to understand the degrees to which American legislators deviate from or adhere to the policy preferences of their constituents." This month he found a way to help pinpoint an aspect of this using linguistic analysis and social media.
Ramey started by gathering and analyzing the last 1,000 tweets from each member of the US Congress. Working with R (an open-source software environment and programming language for statistical computing) and Twitter's API, Ramey recorded every word and word stem each member used, and how many times they used it. Using data-scaling techniques to uncover latent dimensions in that word usage, Ramey created one graph for the Senate and another for the House of Representatives, as shown below.
Immediately apparent, particularly in the data for the House, was a primary dimension (represented along the x-axis) portraying ideology. Ramey told The Washington Post:
Liberals might use some words more often than conservatives or… moderates use some words more than extremists. What scaling techniques seek to do is see whether a whole slew of words separate legislators based on some unobserved, underlying factors. For example, words like Obamacare, debt, Benghazi, and defund may be used to convey different individual messages, but at the end of the day they are all “conservative” words -- they are words used almost exclusively by conservatives.
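To make the counting and scaling steps described above concrete, here is a minimal sketch in Python rather than the R that Ramey used. The article does not name his exact scaling technique, so this example swaps in a simple stand-in: it centers each legislator's word shares and extracts the leading dimension by power iteration, which is a principal-component-style decomposition. The tweets, the legislator labels, and the function names `word_shares` and `first_dimension` are all invented for illustration, and word stemming is skipped.

```python
import math
import re
from collections import Counter

# Hypothetical toy data, invented for illustration: legislator -> tweets.
TWEETS = {
    "rep_a": ["Defund Obamacare now", "The debt is a shame", "Benghazi answers now"],
    "rep_b": ["Repeal Obamacare and cut the debt", "Defund it"],
    "rep_c": ["Join our seminar at the elementary school", "Parade on Saturday"],
    "rep_d": ["Visited the elementary school",
              "Truck show and parade this weekend",
              "Seminar for veterans"],
}


def word_shares(tweets):
    """Return each word's share of all words a legislator tweeted.

    Using shares instead of raw counts is one simple way to account for
    some members tweeting more than others (an assumption of this sketch).
    """
    counts = Counter(w for t in tweets for w in re.findall(r"[a-z]+", t.lower()))
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def first_dimension(tweets_by_member, iterations=200):
    """Score each legislator on the leading latent dimension of word usage."""
    members = sorted(tweets_by_member)
    shares = {m: word_shares(tweets_by_member[m]) for m in members}
    vocab = sorted({w for s in shares.values() for w in s})

    # Rows are legislators, columns are words; center every column.
    rows = [[shares[m].get(w, 0.0) for w in vocab] for m in members]
    means = [sum(col) / len(rows) for col in zip(*rows)]
    centered = [[v - mu for v, mu in zip(row, means)] for row in rows]

    # Gram matrix of legislators: similarity of their centered word usage.
    gram = [[sum(a * b for a, b in zip(r1, r2)) for r2 in centered]
            for r1 in centered]

    # Power iteration converges to the Gram matrix's leading eigenvector.
    scores = [float(i + 1) for i in range(len(members))]  # deterministic start
    for _ in range(iterations):
        scores = [sum(g * s for g, s in zip(row, scores)) for row in gram]
        norm = math.sqrt(sum(s * s for s in scores))
        scores = [s / norm for s in scores]
    return dict(zip(members, scores))


SCORES = first_dimension(TWEETS)
for member, score in sorted(SCORES.items(), key=lambda item: item[1]):
    print(f"{member}: {score:+.3f}")
```

In this toy data, the two legislators who share words such as "defund" and "debt" land on one side of zero and the two who share "seminar" and "parade" land on the other. The sign of the axis is arbitrary; only the separation matters.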
A second dimension (shown along the y-axis) also emerged. Its significance? "Look at the words at the bottom," Ramey said.
"Indian," "seminar," "truck," "elementary," and names of people. At the top, you have "defund," "shame," "mistake," "pride." The folks at the top… are using Twitter as a partisan soapbox, whereas those on the bottom are using it as a means to speak to constituents. Moderates are talking more about constituency events and campaigns, whereas the extremists are using partisan taunting and rhetoric.While Ramey's model already accounts for the fact that some members of Congress tweet more than others, he is still working on further time discretization, "allow[ing] the ideology to shift from day-to-day using a rolling window of data," he told All Analytics in a Facebook conversation.
Consequently, Ramey's model, like other linguistic analyses of social media content, has tremendous potential for making political predictions. Joshua Tucker, a politics professor at NYU and co-director of the NYU Center for Social Media and Political Participation, discussed Ramey's work in the Washington Post blog post "Why compromise is more likely in the Senate than the House, in two simple pictures."
Tucker highlighted the sharp divergence in Ramey's House chart as partisanship increases. The senators, however, are clustered much more closely together. Tucker concluded: "As we look at exactly what the members of Congress want us to see, we find the Senate apparently a much more congenial place for compromise than the House."
Ramey agreed, telling All Analytics, "Constituency-oriented congressmen are always talking about the same stuff -- VFW, parades, bean suppers, etc. Partisan blowhards are using much more variety in their language -- 'hate,' 'shame,' 'repeal,' etc."
We should be worried, he suggested:
I think that my results overall suggest that polarization has now gone beyond simple ideological differences over policy and have led to our political elites literally talking about politics using vastly different vocabularies. We aren't even talking about the world in the same way anymore, and that bodes ill for the future and compromise in particular.
Ramey's work is the latest high-profile success of social media linguistic analysis. Expanding it to other areas may be worthwhile. Using linguistic analysis to find latent dimensions among your followers, your employees, or others in your industry can shed light on who their audiences are, how they use social media, and even how they may act in the future. Social data is virtually limitless; it only makes sense to use it to its full potential.