<?xml version='1.0' encoding='utf-8'?>
<eprints xmlns='http://eprints.org/ep2/data/2.0'>
  <eprint id='https://researchdata.bath.ac.uk/id/eprint/812'>
    <eprintid>812</eprintid>
    <rev_number>38</rev_number>
    <documents>
      <document id='https://researchdata.bath.ac.uk/id/document/13468'>
        <docid>13468</docid>
        <rev_number>2</rev_number>
        <files>
          <file id='https://researchdata.bath.ac.uk/id/file/41531'>
            <fileid>41531</fileid>
            <datasetid>document</datasetid>
            <objectid>13468</objectid>
            <filename>tweets_raw_all_no_ID.csv.zip</filename>
            <mime_type>application/zip</mime_type>
            <hash>01e427e71cdd9dbcb792dfffa9a8b8c9</hash>
            <hash_type>MD5</hash_type>
            <filesize>587939399</filesize>
            <mtime>2020-08-28 13:56:38</mtime>
            <url>https://researchdata.bath.ac.uk/812/2/tweets_raw_all_no_ID.csv.zip</url>
          </file>
        </files>
        <eprintid>812</eprintid>
        <pos>2</pos>
        <placement>2</placement>
        <mime_type>application/zip</mime_type>
        <format>other</format>
        <formatdesc>This file contains tweets for both sets of keywords (Brexiteer/Remainer and Brextremist/Remoaner). Tweet IDs have been removed to preserve user anonymity. There are two columns in the file: timestamp and tweet text.</formatdesc>
        <language>en</language>
        <security>public</security>
        <license>cc_by</license>
        <main>tweets_raw_all_no_ID.csv.zip</main>
        <content>data</content>
      </document>
    </documents>
    <eprint_status>archive</eprint_status>
    <userid>9061</userid>
    <dir>disk0/00/00/08/12</dir>
    <datestamp>2020-09-02 09:24:42</datestamp>
    <lastmod>2024-07-15 10:59:27</lastmod>
    <status_changed>2020-09-02 09:24:42</status_changed>
    <type>data_collection</type>
    <metadata_visibility>show</metadata_visibility>
    <creators>
      <item>
        <name>
          <family>North</family>
          <given>Samantha</given>
        </name>
        <id>S.North@bath.ac.uk</id>
        <orcid>0000-0002-0530-720X</orcid>
        <affiliation>University of Bath</affiliation>
        <contact>TRUE</contact>
      </item>
      <item>
        <name>
          <family>Piwek</family>
          <given>Lukasz</given>
        </name>
        <id>L.Z.Piwek@bath.ac.uk</id>
        <orcid>0000-0003-3291-4766</orcid>
        <affiliation>University of Bath</affiliation>
        <contact>FALSE</contact>
      </item>
      <item>
        <name>
          <family>Joinson</family>
          <given>Adam</given>
        </name>
        <id>A.Joinson@bath.ac.uk</id>
        <orcid>0000-0001-7019-7038</orcid>
        <affiliation>University of Bath</affiliation>
        <contact>FALSE</contact>
      </item>
    </creators>
    <title>Dataset for &quot;Battle for Britain: Analyzing Events as Drivers of Political Tribalism in Twitter Discussions of Brexit&quot;</title>
    <subjects>
      <item>FB0080</item>
      <item>JN0010</item>
    </subjects>
    <divisions>
      <item>dept_ido</item>
    </divisions>
    <keywords>Twitter data, Social media, political tribalism, partisan politics, Brexit, Intergroup conflict, Polarization</keywords>
    <abstract>In this study, we investigate how Brexit tribalism has unfolded over time on Twitter. The dataset contains a corpus of tweets posted to Twitter during a period of 32 months following the 2016 UK European Union membership referendum. The tweets were selected as a result of searching for keywords: firstly for &quot;Brexiteer&quot; and &quot;Remainer&quot; and secondly for &quot;Brextremist&quot; and &quot;Remoaner&quot;. The CSV file in this dataset contains both sets of results. There are two columns in the file: timestamp and tweet text, which will be sufficient to replicate our process. Tweet IDs were removed to preserve user anonymity. First, we characterize the nature of the discussion by comparing language use patterns between tweets containing Brexiteer/Remainer and Brextremist/Remoaner keywords. We find that Brextremist/Remoaner are more commonly used in a derogatory way. We also find that all four group identity keywords are used more frequently over time, suggesting an increase in tribal interactions. Finally, we find evidence of a relationship between real‐life Brexit events and spikes in tribal responses online.</abstract>
    <date>2020-08-12</date>
    <publisher>University of Bath</publisher>
    <full_text_status>public</full_text_status>
    <corp_contributors>
      <item>
        <type>RightsHolder</type>
        <corpname>University of Bath</corpname>
      </item>
    </corp_contributors>
    <funding>
      <item>
        <funder_name>Engineering and Physical Sciences Research Council</funder_name>
        <funder_id>https://doi.org/10.13039/501100000266</funder_id>
        <grant_id>EP/P011454/1</grant_id>
        <project_name>Cyber Security Across the LifeSpan (cSalsa)</project_name>
      </item>
    </funding>
    <collection_method>Data was originally extracted from Twitter&apos;s Historical PowerTrack API, via Crimson Hexagon&apos;s ForSight platform (Crimson Hexagon, 2019).

We queried first for keywords “Brexiteer” and “Remainer,” then for “Brextremist” and “Remoaner,” producing two separate datasets in raw JavaScript Object Notation (JSON) format, including all tweet object fields.

The tweet object encompasses fundamental variables obtained for each tweet such as unique id, date, and text, along with information about retweets, favorites, and hashtags. 

The raw JSON files were very large (over 80 GB), so we used a Python script to extract only the four key variables required for analysis: unique tweet id, date and time of tweet, text of tweet, and keyword associated with tweet. 

The original number of users in the dataset was n = 844,881. The resulting tweets were stored in a CSV file and included a total of n = 9,027,822 tweets posted between June 1, 2016 and February 13, 2019.

For the purposes of this data upload, we retained only the date and time and the text of each tweet; the unique tweet IDs were removed to keep users anonymous.</collection_method>
    <collection_date>
      <date_from>2019-02-13</date_from>
      <date_to>2019-02-13</date_to>
    </collection_date>
    <temporal_cover>
      <date_from>2016-06-01</date_from>
      <date_to>2019-02-13</date_to>
    </temporal_cover>
    <language>en</language>
    <version>1</version>
    <doi>10.15125/BATH-00812</doi>
    <related_resources>
      <item>
        <link>https://doi.org/10.1002/poi3.247</link>
        <type>pub</type>
      </item>
    </related_resources>
    <access_types>
      <item>open</item>
    </access_types>
  </eprint>
</eprints>
