Dataset for "Battle for Britain: Analyzing Events as Drivers of Political Tribalism in Twitter Discussions of Brexit"
In this study, we investigate how Brexit tribalism has unfolded over time on Twitter. The dataset contains a corpus of tweets posted to Twitter during a period of 32 months following the 2016 UK European Union membership referendum. The tweets were selected by keyword search: first for "Brexiteer" and "Remainer", then for "Brextremist" and "Remoaner". The CSV file in this dataset contains both sets of results. The file has two columns, timestamp and tweet text, which are sufficient to replicate our process. Tweet IDs were removed to preserve user anonymity. In the study, we first characterize the nature of the discussion by comparing language use patterns between tweets containing the Brexiteer/Remainer keywords and tweets containing the Brextremist/Remoaner keywords. We find that Brextremist/Remoaner are more commonly used in a derogatory way. We also find that all four group identity keywords are used more frequently over time, suggesting an increase in tribal interactions. Finally, we find evidence of a relationship between real‐life Brexit events and spikes in tribal responses online.
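Because the uploaded file keeps only the timestamp and the text, the keyword set for each tweet has to be recovered from the text itself. The sketch below shows one way this might be done, along with a per-month count of each keyword set. It is not the authors' script. The case-insensitive substring matching, the names `KEYWORD_SETS`, `keyword_sets` and `monthly_counts`, and the assumption that timestamps begin with an ISO-style `YYYY-MM` date are all illustrative.

```python
import re
from collections import Counter

# The two keyword pairs named in the description. Matching them as
# case-insensitive substrings is an assumption, not the documented query.
KEYWORD_SETS = {
    "brexiteer_remainer": re.compile(r"brexiteer|remainer", re.IGNORECASE),
    "brextremist_remoaner": re.compile(r"brextremist|remoaner", re.IGNORECASE),
}


def keyword_sets(text):
    """Return the names of the keyword sets that occur in a tweet's text."""
    return [name for name, pattern in KEYWORD_SETS.items() if pattern.search(text)]


def monthly_counts(rows):
    """Count tweets per (month, keyword set).

    `rows` is an iterable of (timestamp, text) pairs. The timestamp is
    ASSUMED to start with an ISO-style date such as "2016-06-24 ...", so
    its first seven characters give the month.
    """
    counts = Counter()
    for timestamp, text in rows:
        month = timestamp[:7]
        for name in keyword_sets(text):
            counts[(month, name)] += 1
    return counts
```

A tweet that mentions keywords from both pairs is counted once under each set. If the timestamps in the file use a different format, the month extraction would need to be adapted.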
Cite this dataset as:
North, S., Piwek, L., and Joinson, A., 2020. Dataset for "Battle for Britain: Analyzing Events as Drivers of Political Tribalism in Twitter Discussions of Brexit". Bath: University of Bath Research Data Archive. Available from: https://doi.org/10.15125/BATH-00812.
Data
tweets_raw_all_no_ID.csv.zip
application/zip (587MB)
Creative Commons: Attribution 4.0
This file contains tweets for both sets of keywords (Brexiteer/Remainer and Brextremist/Remoaner). Tweet IDs have been removed to preserve user anonymity. There are two columns in the file: timestamp and tweet text.
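The archive is large, so it may be convenient to stream rows straight from the zip file rather than load everything into memory. The following is a minimal sketch under stated assumptions: that the archive holds a single UTF-8 CSV whose first two columns are timestamp and tweet text, and that it may or may not have a header row (hence the `has_header` flag). The function name `iter_tweets` is hypothetical.

```python
import csv
import io
import zipfile


def iter_tweets(zip_path, has_header=True):
    """Yield (timestamp, text) pairs from the first CSV found in the archive.

    Assumptions: UTF-8 encoding, and timestamp and text in the first two
    columns. Set has_header=False if the file has no header row.
    """
    with zipfile.ZipFile(zip_path) as archive:
        # Pick the first real CSV member, skipping macOS resource-fork entries.
        csv_name = next(
            name for name in archive.namelist()
            if name.lower().endswith(".csv") and not name.startswith("__MACOSX")
        )
        with archive.open(csv_name) as raw:
            reader = csv.reader(io.TextIOWrapper(raw, encoding="utf-8", newline=""))
            if has_header:
                next(reader, None)  # discard the header row
            for row in reader:
                if len(row) >= 2:  # skip blank or malformed lines
                    yield row[0], row[1]
```

For example, `for timestamp, text in iter_tweets("tweets_raw_all_no_ID.csv.zip"): ...` would iterate over the rows one at a time.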
Creators
Samantha North
University of Bath
Lukasz Piwek
University of Bath
Adam Joinson
University of Bath
Contributors
University of Bath (Rights Holder)
Coverage
Collection date(s):
13 February 2019
Temporal coverage:
From 1 June 2016 to 13 February 2019
Documentation
Data collection method:
Data was originally extracted from Twitter's Historical PowerTrack API, via Crimson Hexagon's ForSight platform (Crimson Hexagon, 2019). We queried first for the keywords "Brexiteer" and "Remainer", then for "Brextremist" and "Remoaner", producing two separate datasets in raw JavaScript Object Notation (JSON) format, including all tweet object fields. The tweet object contains the basic variables recorded for each tweet, such as unique id, date, and text, along with information about retweets, favorites, and hashtags. The raw JSON files were very large (over 80 GB), so we used a Python script to extract only the four key variables required for analysis: unique tweet id, date and time of tweet, text of tweet, and keyword associated with tweet. The original number of users in the dataset was n = 844,881. The resulting tweets were stored in a CSV file, which included a total of n = 9,027,822 tweets posted between June 1, 2016 and February 13, 2019. For this data upload, we retained only the date and time and the text of each tweet, in order to keep users anonymous.
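The extraction step described above could look roughly like the sketch below. This is not the authors' script, and several details are assumptions: that the raw export is stored as one JSON object per line, that the fields are named `id_str`, `created_at` and `text` (as in Twitter's standard tweet object, though the Crimson Hexagon export may differ), and that the associated keyword can be inferred from the first keyword found in the text. In the original data the keyword came from the query that returned the tweet. The names `extract_rows` and `write_csv` are hypothetical.

```python
import csv
import json

KEYWORDS = ("brexiteer", "remainer", "brextremist", "remoaner")


def extract_rows(json_lines):
    """Yield (id, created_at, text, keyword) from an iterable of JSON strings.

    Field names are ASSUMED to follow Twitter's standard tweet object.
    """
    for line in json_lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        tweet = json.loads(line)
        text = tweet.get("text", "")
        lowered = text.lower()
        # Assumption: label the tweet with the first keyword found in its text.
        keyword = next((k for k in KEYWORDS if k in lowered), "")
        yield tweet.get("id_str", ""), tweet.get("created_at", ""), text, keyword


def write_csv(json_path, csv_path, anonymise=False):
    """Stream a JSON Lines file to CSV, one row per tweet.

    With anonymise=True only the date and text are written, mirroring the
    two-column file in this upload.
    """
    with open(json_path, encoding="utf-8") as src, \
            open(csv_path, "w", encoding="utf-8", newline="") as dst:
        writer = csv.writer(dst)
        for tweet_id, created_at, text, keyword in extract_rows(src):
            if anonymise:
                writer.writerow([created_at, text])
            else:
                writer.writerow([tweet_id, created_at, text, keyword])
```

Reading line by line keeps memory use small even for inputs in the tens of gigabytes, which is the main reason to stream rather than parse a whole file at once.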
Funders
Engineering and Physical Sciences Research Council
https://doi.org/10.13039/501100000266
Cyber Security Across the LifeSpan (cSalsa)
EP/P011454/1
Publication details
Publication date: 12 August 2020
Published by: University of Bath
Version: 1
DOI: https://doi.org/10.15125/BATH-00812
URL for this record: https://researchdata.bath.ac.uk/id/eprint/812
Related papers and books
North, S., Piwek, L., and Joinson, A., 2020. Battle for Britain: Analyzing Events as Drivers of Political Tribalism in Twitter Discussions of Brexit. Policy & Internet, 13(2), 185-208. Available from: https://doi.org/10.1002/poi3.247.
Contact information
Please contact the Research Data Service in the first instance for all matters concerning this item.
Contact person: Samantha North
School of Management
Information, Decisions & Operations