A couple of days ago I saw a link to a new study suggesting that Russian Twitter bots had cost Hillary Clinton three percentage points in the 2016 election. This seemed pretty unlikely to me, so I was curious to read the paper and see what the authors really said.
Long story short, they amassed a huge database of tweets starting a month before two events: the Brexit vote and the American presidential vote. They identified human tweeters vs. bots using criteria that seem pretty reasonable. They confirmed that the share of human tweets in a geographical area closely predicted the vote in that same area. Then they applied a big ol’ econometric model to figure out how much influence bots had.
Before we get to that, though, here are some of the conclusions they drew about Twitter:
- With rare exceptions, retweets are all done within two hours of the original tweet. After that, your tweet is effectively dead.
- Bots don’t retweet much—about a tenth as much as humans.
- Humans retweet other humans much more than they retweet bots.
- Bots generate a lot of activity from humans who are on their side: each bot tweet, on average, produces two new human tweets.
- Bots are more effective than humans at generating tweets from humans who are on the other side.
And now for the net effect of bots. The authors calculate actual tweet traffic and then compare it to a model counterfactual in which bots don’t exist. Generally speaking, there are bots on both sides of any issue, and they mostly cancel each other out. But not totally. The tweet pattern they predict in the counterfactual is a little different from the actual tweet history.
At this point, I think I was right to be skeptical. The bot effect is small and random, and depends heavily on the precise specification of their model. The biggest effect appears to be that in the counterfactual, pro-Trump tweets dwindle away in the two weeks before Election Day, but in the real world, where bots were working tirelessly away, pro-Trump traffic stays pretty strong.
However, the authors ignore all that and look solely at tweet traffic on the day before the election. Oddly, pro-Clinton traffic spikes way upward in the five days before the election while pro-Trump traffic dies off on the day before. As a result, for this single day there’s more pro-Clinton traffic than pro-Trump traffic, and the gap between the two is bigger in real life than it is in the bot-free counterfactual. This suggests that bots helped Clinton on the last day before the election, and the authors estimate that the bots contributed to an increase in the Clinton vote of 3.23 percentage points.
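To see why the choice of a single day matters, here’s a minimal sketch of the comparison in Python. Every number in it is invented for illustration and none comes from the paper. The names `actual`, `counterfactual`, `gap_on_day`, and `bot_effect_on_gap` are my own placeholders, and this is not the authors’ econometric model, just the kind of subtraction that sits at the end of it.

```python
# Hypothetical daily tweet counts for the final five days before the vote.
# These numbers are invented for illustration; they are NOT from the paper.
actual = {
    "clinton": [100, 110, 130, 160, 200],
    "trump":   [150, 150, 145, 140, 120],
}
# Same days in a model world with bot-driven traffic removed (also invented).
counterfactual = {
    "clinton": [95, 100, 115, 135, 160],
    "trump":   [140, 135, 125, 115, 100],
}


def gap_on_day(series, day):
    """Pro-Clinton minus pro-Trump tweet volume on one day."""
    return series["clinton"][day] - series["trump"][day]


def bot_effect_on_gap(actual, counterfactual, day):
    """How much bigger the gap is in the real world than without bots."""
    return gap_on_day(actual, day) - gap_on_day(counterfactual, day)


last_day = 4  # index of the day before the election

print(gap_on_day(actual, last_day))                         # 80
print(gap_on_day(counterfactual, last_day))                 # 60
print(bot_effect_on_gap(actual, counterfactual, last_day))  # 20

# The same effect summed over the whole five-day window.
window_effect = sum(
    bot_effect_on_gap(actual, counterfactual, day) for day in range(5)
)
print(window_effect)                                        # 5
```

With these made-up numbers the bot effect on the gap is 20 tweets on the final day but only 5 when summed over all five days, so which window you pick goes a long way toward determining the answer.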
This is pretty thin stuff, but if Fox & Friends picks up on it then Trump will finally have his excuse for losing the popular vote: the bots did it! For the rest of us, I wouldn’t take this very seriously. In fact, even for the authors it’s more of a passing comment than a real conclusion of their paper. There’s not a ton of evidence for their model; there’s very little evidence for the causal effect of higher tweet volume on voting; and there’s no evidence at all to support the idea that only Twitter traffic on the last day before the election makes a difference. All in all, there’s some interesting stuff in this paper, but the effect of bots on voting behavior isn’t part of it.