The dataset used for this project comes from the University of Maryland's IRAdS website, a program developed by Dr. Damien Smith Pfister dedicated to exploring "the rhetoric of computational propaganda that occurred on Facebook during the 2016 election." The IRAdS website contains 3,012 Facebook advertisements that the "Russian troll farm" known as the Internet Research Agency purchased in the run-up to the 2016 U.S. presidential election.
The metadata and images extracted from these social media advertisements were released by Facebook to the House Intelligence Committee, which then released them to the public.
The original dataset listed costs in both U.S. dollars and Russian rubles. I converted the ruble amounts to dollars using the conversion rate from the week of March 8, 2019. Of course, the rate was in constant flux during the three years when these ads were bought, but I decided that it would still be more instructive for visitors to this site to see costs in dollars rather than rubles. A reader who sees that an ad was bought for 5,000 rubles probably would not have a good sense of how much that is really worth (approx. $78), or of how such a small cost in U.S. dollars could yield thousands of impressions and clicks.
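As a rough illustration of the single-rate conversion described above, here is a minimal Python sketch. The rate of 64.1 rubles per dollar is an assumption: it does not appear in the dataset and is simply back-calculated from the 5,000 rubles to roughly $78 example. The function name is hypothetical.

```python
# Assumed rate, back-calculated from the "5,000 rubles is approx. $78" example.
# It is illustrative only and is not taken from the dataset itself.
ASSUMED_RUB_PER_USD = 64.1


def rubles_to_dollars(amount_rub, rub_per_usd=ASSUMED_RUB_PER_USD):
    """Convert a ruble amount to U.S. dollars at one fixed rate."""
    return round(amount_rub / rub_per_usd, 2)


# An ad bought for 5,000 rubles comes out to roughly 78 dollars.
print(rubles_to_dollars(5000))
```

Because one fixed rate is applied to every purchase, the dollar figures are approximations rather than what each ad cost on the day it was bought.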
Because I had problems getting the CSV file to work in OpenRefine, I had to save it as an XLS file first. After this change, I was able to clean the data in OpenRefine and organize it into a form that was more workable for my purposes. Primarily, this meant splitting multivalue cells into separate rows. For the tag frequency visualization, I split the Tag column cells into separate rows. To create my map, I first used OpenRefine to exclude all records that had null values in the Location column. I then exported the result as an XLS file, which I re-uploaded into a new OpenRefine session. There, I split the Location column cells into separate rows to make each location discrete, and I copied the cells down so that the newly created rows still aligned with the rest of the metadata for each record.
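For readers who prefer code to a point-and-click tool, the cleaning steps above can be approximated in plain Python. This is only a sketch of the same idea, not the OpenRefine workflow itself. The sample records, the `AdID` and `Clicks` field names, and the semicolon separator are all assumptions made for illustration.

```python
def split_multivalue_rows(rows, column, sep=";"):
    """Return one row per value in `column`, copying the other fields down.

    Rows whose `column` is missing or blank are dropped, which mirrors
    excluding records with null values before splitting.
    """
    result = []
    for row in rows:
        raw = row.get(column)
        if raw is None or not raw.strip():
            continue  # skip records with no value in this column
        for value in raw.split(sep):
            value = value.strip()
            if value:
                new_row = dict(row)      # copy the rest of the metadata down
                new_row[column] = value  # keep one discrete value per row
                result.append(new_row)
    return result


# Hypothetical records; the real dataset's columns and separator may differ.
ads = [
    {"AdID": "1", "Location": "Ferguson, MO; Baltimore, MD", "Clicks": "120"},
    {"AdID": "2", "Location": "", "Clicks": "45"},
    {"AdID": "3", "Location": "Cleveland, OH", "Clicks": "10"},
]

location_rows = split_multivalue_rows(ads, "Location")
for row in location_rows:
    print(row)
```

The same function could be pointed at the Tag column to prepare data for a tag frequency count.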
For the sake of transparency, I have uploaded all of the XLS files that I adapted from the original CSV to my GitHub page.
The visual and aesthetic elements of this website are indebted to two powerful tools. All visualizations were created using Tableau Public, and the website was created using Mobirise 4.9.6. The image for each visualization on this website links to the corresponding visualization on Tableau's website. By following those links, users can interact more closely with the data.
For my visualizations, I chose a map of targeted ads; a treemap illustrating which months featured the highest number of IRA ads as well as the most IRA money spent; a scatterplot comparing clicks vs. impressions vs. cost; an area graph that gives an alternate perspective in support of the correlation between clicks and impressions; and a simple bar graph illustrating the frequency of tags in IRA ads. I had initially contemplated using a network graph I created in Cytoscape to illustrate which tags appeared together most frequently, but I had a harder time making the network graph as legible as the bar graph.
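To give a sense of the kind of data that sits behind a tag network graph, the sketch below counts how often pairs of tags appear on the same ad. It is not the method used in Cytoscape, only one simple way to tally co-occurrences. The sample tags and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def count_tag_pairs(tag_lists):
    """Count how many ads each unordered pair of tags appears on together."""
    pair_counts = Counter()
    for tags in tag_lists:
        # Sort and de-duplicate so each pair is counted once per ad,
        # always in the same order.
        unique_tags = sorted(set(tags))
        pair_counts.update(combinations(unique_tags, 2))
    return pair_counts


# Hypothetical tags for three ads; these are not taken from the dataset.
sample_ads = [
    ["police", "community", "justice"],
    ["police", "justice"],
    ["community"],
]

pair_counts = count_tag_pairs(sample_ads)
for pair, count in pair_counts.most_common():
    print(pair, count)
```

Each counted pair would correspond to an edge in a network graph, with the count serving as the edge weight.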
I also originally contemplated a map that highlighted the states where ads were targeted, but I decided against it. The map of targeted cities is more effective at showing, for example, that Ferguson, MO was heavily targeted, rather than showing only that Missouri as a state was targeted.
Finally, I'd like to thank Dr. Miriam Posner for everything she has taught me in her Digital Humanities 201 course and for her guidance on this project.