Right now, if you want to know how the country feels about Barack Obama or Mitt Romney, you have to rely on pundits’ intuitions or traditional opinion polls, conducted as they always have been — by phone, over the course of hours or days. There’s no direct way to check the pulse of millions of actual people, simultaneously and directly, second by second.
Twitter is launching a tool today that it says will fill that gap, and sort through the 400 million tweets a day from 140 million active users. Twitter and real-time search engine Topsy are launching the “Twitter Political Index,” a daily assessment of how Twitter feels about Obama and Romney, in an election cycle that’s being played out moment-to-moment on the social service.
Twitter has retained Democratic and Republican polling firms (the Mellman Group and North Star Opinion Research) to help perfect the index, which appears to be the most serious, large-scale effort yet at a kind of metric that has so far produced a lot of flops.
The rough version of how it works: Topsy pores through every single tweet in real time, determines which ones are about Obama or Romney, and then assigns a sentiment score to each tweet based on its content: whether it’s positive or negative toward Obama or Romney, and just how positive or negative it is. Add all the data up and you have something like a real-time approval score for Obama and Romney, determined by what tens of millions of people are saying, which Twitter is going to release daily at election.twitter.com.
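To make those three steps (filter, score, add up) concrete, here is a minimal sketch in Python. It is not Topsy's method, which the company has not published: the keyword lists, the word weights, and the mapping of an average sentiment onto a zero-to-100 scale are all assumptions invented for illustration.

```python
import re
from collections import defaultdict

# Hypothetical keyword lists and word weights, chosen only for illustration.
# Topsy's real models are not public and are certainly more sophisticated.
CANDIDATE_TERMS = {"obama": {"obama", "barack"}, "romney": {"romney", "mitt"}}
WORD_WEIGHTS = {"great": 1.0, "love": 1.0, "good": 0.5,
                "bad": -0.5, "awful": -1.0, "hate": -1.0}


def tokenize(text):
    """Lowercase a tweet and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def candidates_mentioned(tokens):
    """Step 1: decide which candidates a tweet appears to be about."""
    return [name for name, terms in CANDIDATE_TERMS.items()
            if terms & set(tokens)]


def sentiment_score(tokens):
    """Step 2: score a tweet from -1 (negative) to +1 (positive)."""
    raw = sum(WORD_WEIGHTS.get(token, 0.0) for token in tokens)
    return max(-1.0, min(1.0, raw))


def daily_index(tweets):
    """Step 3: average the scores per candidate and map them onto 0-100.

    The linear mapping 50 * (mean + 1) is an assumption, not Twitter's formula.
    """
    scores = defaultdict(list)
    for text in tweets:
        tokens = tokenize(text)
        for name in candidates_mentioned(tokens):
            scores[name].append(sentiment_score(tokens))
    return {name: 50.0 * (sum(vals) / len(vals) + 1.0)
            for name, vals in scores.items()}


sample = ["I love Obama", "Obama is awful",
          "Romney did a great job", "Nice weather today"]
print(daily_index(sample))  # {'obama': 50.0, 'romney': 100.0}
```

One simplification to note: a tweet that mentions both candidates gets the same score for each, which a real system would have to untangle.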
Again, it’s the scale of the data that makes this project so powerful. With Topsy’s access to the full Twitter firehose, it’s processing 400 million tweets a day in real time. To compare, on Election Day 2008, there were 1.8 million tweets. There are that many tweets roughly every six minutes today.
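The six-minute comparison is easy to check from the two figures quoted here. A quick back-of-the-envelope calculation, assuming tweets arrive at an even rate through the day (they don't, but it is close enough for a sanity check):

```python
TWEETS_PER_DAY_2012 = 400_000_000      # daily volume cited by Twitter
ELECTION_DAY_2008_TWEETS = 1_800_000   # total tweets on Election Day 2008
MINUTES_PER_DAY = 24 * 60

# Average tweets per minute in 2012, assuming an even rate all day.
tweets_per_minute = TWEETS_PER_DAY_2012 / MINUTES_PER_DAY

# Minutes needed today to match all of Election Day 2008.
minutes_to_match_2008 = ELECTION_DAY_2008_TWEETS / tweets_per_minute

print(round(tweets_per_minute))         # 277778
print(round(minutes_to_match_2008, 1))  # 6.5
```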
The Twitter Political Index leverages the growing science of sentiment analysis, in which computers (machines!) try to assess what the meaning or feeling of a piece of writing really is. Which isn’t easy, particularly for “short form content like tweets, which often lack context,” Topsy’s chief scientist Rishab Ghosh tells me. But Topsy’s algorithms agree with a randomly selected human 90 percent of the time on what a tweet means, a figure validated over 30,000 tests. (It’s hard to get to 100 percent because humans often disagree on what a tweet means. Also, machines are still terrible at picking out or assessing sarcasm or irony. So, “Good job, Obama!” with a link to an article about crappy job numbers might fly past the algorithm as a positive tweet.)
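For a sense of what "agrees with a human 90 percent of the time" means in practice, here is a minimal sketch of how such an agreement rate could be computed. The function name and the ten sample labels are made up for illustration; Topsy has not described its validation procedure beyond the numbers above.

```python
def agreement_rate(machine_labels, human_labels):
    """Return the fraction of tweets where machine and human labels match."""
    if len(machine_labels) != len(human_labels):
        raise ValueError("both label lists must cover the same tweets")
    if not machine_labels:
        raise ValueError("need at least one labeled tweet")
    matches = sum(m == h for m, h in zip(machine_labels, human_labels))
    return matches / len(machine_labels)


# Invented labels for ten tweets. The last one stands in for a sarcastic
# tweet that the machine reads as positive and the human reads as negative.
machine = ["pos", "neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos", "pos"]
human   = ["pos", "neg", "pos", "neg", "neg", "pos", "pos", "neg", "pos", "neg"]

print(agreement_rate(machine, human))  # 0.9
```

Because two humans reading the same tweet will not always agree with each other either, a score like this has a practical ceiling below 100 percent.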
The index’s scale runs from zero to a hundred (higher is better), and Obama and Romney are measured individually, not against each other. So both could be over 50, or both could go up simultaneously. Or down simultaneously. It’s not a head-to-head poll. “We are not actually measuring whether people prefer Romney to Obama or not,” says Ghosh. It’s just “a broad sentiment measure of what’s going on.”
Just as important as what the Twitter Political Index is, is what it isn’t: an opinion poll. There are a couple of key differences. For one, size. “It’s not just a sample. It’s the entire universe of people who are saying things on Twitter,” explains Ghosh. The other key difference is that “opinion polls ask a select group of people to answer specific questions,” but the Twitter Political Index is “not based on what people say they think when they’re asked something, so there is less of a selection bias. We’re just analyzing what they’re saying.” In other words, not only are people potentially more honest about their feelings on Twitter, but there are also no questions limiting what their responses are. They’re free-flowing. So it’s more like “a measure of the discussions going on in public spaces,” says Ghosh.
Ghosh thinks the Twitter Political Index solves another problem with polls: it reaches the millions of mobile-only users that many opinion polls don’t reach. Half of Twitter usage is on a mobile device, Twitter’s Adam Sharp tells me. So the index potentially includes demographics that are often underrepresented in opinion polls.
That said, opinion polls, which have tended to be quite accurate in presidential elections despite running complaints about the absence of mobile users, rely on a random sample. Twitter’s users tend to be members of one of the most liberal demographic groups in America: young people.
But Ghosh waves off the idea that Twitter underrepresents other parts of the population. “While it’s true that Twitter and internet usage is different from general population, it’s not true to say that this is somehow unrepresentative,” he says. “The demographics are very very broad based. It’s a large proportion of US population.” Twitter’s theory is that anyone who is actively tweeting about politics is a likely voter. (Obviously, there are flaws in that assessment: I tweet about politics and political things, and I’m not a registered voter, nor do I intend to become one.)
Still, the bigger question is very simple: Will Twitter predict who’s going to win the election? Twitter won’t go that far. Sharp says, “I don’t think we would claim at this point that it is predictive,” though “someone who makes predictions based on this data, along with other data, is going to be more informed than someone using one data source.”