Analyzing 200,000 Russian Troll Tweets
US intelligence agencies universally agree that the Russians had a hand in influencing the 2016 election using popular social media platforms. I am not interested in making this a political debate one way or the other.
I wanted to use some NLP tools to compute statistics and metadata about the tweets, to see whether they have any distinguishing characteristics and whether some simple analysis of the actual data from Twitter can provide insight into the Russians' efforts. Along the way, I will include some graphics, like the word cloud to the left, to show which topics the Russians were most interested in talking about.
The data I am using for this write-up comes from over 200,000 tweets that Twitter identified as coming from Russian trolls. Surely, there are far more that went under the radar. However, this is still a significant amount of data to glean some insight from.
20 Most Popular Hashtags
Many of the hashtags below were new to me, and I had to search quite a few of these terms to understand what they reference. I have tried to give a brief explanation wherever I didn't think a hashtag was completely self-evident. The Russian trolls' favorite hashtags were the following:
- #politics 3638 instances - This one seems pretty self-explanatory.
- #tcot 2839 instances - When combined with its uppercase variation (#TCOT, 926 instances), this is actually the most used hashtag by the Russian trolls, at 3765 instances. This one had never been on my radar before. #TCOT stands for "Top Conservatives on Twitter".
- #MAGA 2538 instances - Most people are familiar with this acronym, "Make America Great Again", which is Trump's campaign slogan.
- #PJNET 2147 instances - Stands for Patriot Journalist Network. There are alternative versions of this, which would bump the number closer to 2759 instances.
- #news 2046 instances
- #Trump 1851 instances
- #Merkelmussbleiben 1108 instances - This one might be surprising to some people, but the scope of the Russian propaganda campaign was not limited to just the United States. It translates to "MerkelMustStay" in English.
- #TrumpForPresident 1088 instances
- #WakeUpAmerica 1061 instances
- #NeverHillary 976 instances
- #IslamKills 930 instances - I personally found it interesting that their anti-Muslim efforts were prioritized over other political hashtags, and it is probable that Islamophobia is linked to their political efforts in the United States and Europe.
- #TCOT 926 instances - Capitalized variation of #tcot, number 2 on this list.
- #2A 919 instances - References the 2nd Amendment
- #Trump2016 913 instances
- #ccot 901 instances - Christian Conservatives on Twitter
- #TrumpPence16 791 instances
- #RejectedDebateTopics 718 instances
- #TrumpTrain 707 instances
- #BlackLivesMatter 697 instances
- #Hillary 697 instances
Interestingly, in the entire corpus of Russian tweets, there is not one instance of #Hillary2016, #HillaryKaine, or any other Hillary-themed hashtag that wasn't negative in sentiment. I'll let the reader draw their own conclusions on that one.
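For readers who want to reproduce counts like the ones above, here is a minimal sketch of hashtag counting. It is not necessarily how the numbers in this post were produced: the regular expression, the `count_hashtags` helper, and the sample tweets are all assumptions made for illustration. The `fold_case` option shows how case variants such as #tcot and #TCOT can be merged into a single count.

```python
import re
from collections import Counter

# Assumed pattern: a '#' followed by word characters (letters, digits, underscore).
HASHTAG_RE = re.compile(r"#\w+")

def count_hashtags(tweets, fold_case=False):
    """Count hashtag occurrences across an iterable of tweet texts.

    With fold_case=True, variants like #tcot and #TCOT share one count.
    """
    counts = Counter()
    for text in tweets:
        for tag in HASHTAG_RE.findall(text):
            counts[tag.lower() if fold_case else tag] += 1
    return counts

# Made-up sample tweets, not taken from the dataset.
sample_tweets = [
    "Wake up! #tcot #MAGA",
    "More #politics today #TCOT",
    "#tcot again",
]

exact = count_hashtags(sample_tweets)                   # case-sensitive counts
folded = count_hashtags(sample_tweets, fold_case=True)  # case variants merged

print(exact.most_common(20))
print(folded.most_common(20))
```

Counting case-sensitively keeps #tcot and #TCOT as separate entries, which matches how the list above is presented; folding case is what moves the combined tag to the top.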
100 Most Popular Links
It is interesting to see the sorts of sites they were actively linking to. Many of the links below are link-shortening services, and for those domains it is difficult to know where they were actually sending people. YouTube is also too generic to glean much information from, but some of the news sites are surprisingly middle-of-the-road. Again, that could be by design, as they would undoubtedly have cherry-picked articles that fit their narrative. On the other hand, you will notice some popular alt-right domains in the list as well.
Below I have included the top 100 linked domains from the corpus.
[('http://bit.ly', 5693), ('https://youtu.be', 1259), ('https://twibble.io', 945), ('http://ln.is', 761), ('http://dlvr.it', 588), ('http://www.breitbart.com', 576), ('https://www.youtube.com', 503), ('http://wapo.st', 497), ('http://ow.ly', 489), ('http://buff.ly', 425), ('http://dailycaller.com', 382), ('https://goo.gl', 351), ('http://www.frontpagemag.com', 309), ('http://sh.st', 297), ('http://fb.me', 266), ('http://on.rt.com', 246), ('http://www.huffingtonpost.com', 227), ('http://www.wcvb.com', 210), ('http://ift.tt', 206), ('http://www.infowars.com', 205), ('http://truthfeed.com', 200), ('https://shar.es', 193), ('http://hill.cm', 191), ('http://dailym.ai', 190), ('http://thehill.com', 187), ('http://politi.co', 175), ('http://www.nytimes.com', 172), ('https://www.donaldjtrump.com', 167), ('http://www.thegatewaypundit.com', 156), ('http://fxn.ws', 149), ('http://www.foxnews.com', 147), ('http://usat.ly', 128), ('http://conscores.org', 127), ('http://washex.am', 126), ('https://www.facebook.com', 125), ('https://www.washingtonpost.com', 124), ('https://wikileaks.org', 122), ('http://smq.tc', 122), ('http://goo.gl', 121), ('https://amp.twimg.com', 121), ('http://cnn.it', 121), ('http://nypost.com', 117), ('http://wp.me', 116), ('http://www.politico.com', 114), ('http://Statespoll.com', 110), ('http://www.cnn.com', 109), ('http://www.cbsnews.com', 108), ('http://trib.al', 106), ('http://politics.blog.ajc.com', 104), ('http://www.lifezette.com', 103), ('https://vine.co', 102), ('http://huff.to', 96), ('https://www.periscope.tv', 95), ('http://youtu.be', 94), ('http://twib.in', 93), ('http://www.zerohedge.com', 93), ('http://www.usatoday.com', 91), ('http://www.youtube.com', 89), ('http://www.dailymail.co.uk', 86), ('http://paper.li', 86), ('http://newstalk1130.iheart.com', 84), ('https://waa.ai', 82), ('http://cbsloc.al', 81), ('http://nyp.st', 81), ('http://nyti.ms', 80), ('https://www.yahoo.com', 78), ('http://insider.foxnews.com', 78), ('http://tinyurl.com', 78), ('http://freebeacon.com', 76), ('http://conservativetribune.com', 71), ('http://www.redstate.com', 69), ('http://observer.com', 66), ('http://wh.gov', 65), ('http://lawnews.tv', 65), ('http://www.wsj.com', 64), ('http://www.westernjournalism.com', 61), ('http://www.chron.com', 61), ('http://rsbn.tv', 61), ('http://www.washingtonexaminer.com', 60), ('http://www.reuters.com', 58), ('http://bsun.md', 57), ('http://www.bizpacreview.com', 55), ('https://www.theguardian.com', 55), ('https://www.buzzfeed.com', 54), ('http://www.cbs8.com', 53), ('http://USFREEDOMARMY.COM', 53), ('http://wpo.st', 52), ('http://www.chicagotribune.com', 52), ('http://www.vox.com', 51), ('http://townhall.com', 51), ('http://www.newsmax.com', 51), ('http://www.wbaltv.com', 50), ('http://www.nbcchicago.com', 49), ('http://www.bbc.co.uk', 48), ('http://nydn.us', 48), ('https://medium.com', 47), ('http://www.local10.com', 47), ('http://on.wsj.com', 46)]
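A list in this shape (scheme plus host, paired with a count) can be built with Python's standard library. The sketch below is one plausible way to do it, not necessarily the exact code behind the list above; the `link_base` and `count_link_bases` helpers and the sample URLs are hypothetical.

```python
from collections import Counter
from urllib.parse import urlsplit

def link_base(url):
    """Return 'scheme://host' for a URL, or None if either part is missing."""
    parts = urlsplit(url)
    if not parts.scheme or not parts.netloc:
        return None
    return f"{parts.scheme}://{parts.netloc}"

def count_link_bases(urls):
    """Count how often each 'scheme://host' appears in an iterable of URLs."""
    counts = Counter()
    for url in urls:
        base = link_base(url)
        if base is not None:
            counts[base] += 1
    return counts

# Made-up sample URLs, not taken from the dataset.
sample_urls = [
    "http://bit.ly/abc123",
    "http://bit.ly/xyz789",
    "https://youtu.be/some-video",
    "not a url",  # skipped: no scheme or host
]

top_links = count_link_bases(sample_urls).most_common(100)
print(top_links)
```

Because the scheme is kept as part of the key, http and https versions of the same site are counted separately, which is why domains such as goo.gl and youtu.be appear twice in the list above. Dropping the scheme and lowercasing the host would merge those entries.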
According to this dataset, the Russians started laying the groundwork for their propaganda campaign in the middle of 2014. Of course, it is entirely possible that many Russian accounts went under the radar of Twitter's probe. The first tweet identified by Twitter as coming from a definitive Russian troll was dated July 17th, 2014. The dataset this blog post is based on goes all the way to September 26th, 2017, but according to various news sources there's every indication that the Russians are still actively using social media to influence elections in the West. They are most likely tweeting to sway the U.S. midterms right now.
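Finding the first and last tweet dates comes down to parsing the timestamps and taking the minimum and maximum. Here is a small sketch; the timestamp format string and the sample values are assumptions, so adjust them to however the dates are stored in your copy of the data.

```python
from datetime import datetime

# Assumed timestamp layout; change this if your data stores dates differently.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"

def date_range(timestamps):
    """Return (earliest, latest) datetimes from an iterable of timestamp strings."""
    parsed = [datetime.strptime(ts, TIMESTAMP_FORMAT) for ts in timestamps]
    return min(parsed), max(parsed)

# Made-up sample timestamps, not taken from the dataset.
sample_timestamps = [
    "2016-03-22 18:31:42",
    "2015-01-05 09:00:00",
    "2016-11-08 23:59:59",
]

first, last = date_range(sample_timestamps)
print(first.date(), last.date())
```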
So what sort of impact are we talking about here? With this dataset, it is impossible to really estimate the number of people who were exposed to Russian propaganda on Twitter. One unassailable number we do have, though, is retweets. Most of their tweets went by without a single retweet, but tens of thousands of them had extraordinary reach.
Brace yourself. Russian troll tweets were retweeted 2,302,521 times.
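The retweet total is a straightforward sum over the per-tweet retweet counts. A minimal sketch follows, assuming the counts have already been loaded into a list of integers; the `retweet_summary` helper and the sample numbers are made up for illustration.

```python
def retweet_summary(retweet_counts):
    """Summarize a list of per-tweet retweet counts."""
    return {
        "tweets": len(retweet_counts),
        "total_retweets": sum(retweet_counts),
        # Tweets that were never retweeted at all.
        "tweets_without_retweets": sum(1 for c in retweet_counts if c == 0),
    }

# Made-up sample counts, not taken from the dataset.
sample_counts = [0, 0, 3, 0, 120]

summary = retweet_summary(sample_counts)
print(summary)
```

Reporting the number of tweets with zero retweets next to the total makes the skew visible: a small share of tweets can account for most of the reach.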