The UK Department of Health and Social Care made its first tweet about the new coronavirus testing system on 25 January 2020. Less than a week later, the department tweeted its first announcement of two positive tests for COVID-19 in the UK, foreshadowing a series of events that would profoundly affect lives. As the coronavirus spread, millions more tweets followed as people reacted to rumors of lockdown, panic buying, and heartbreaking stories from around the world.
As time passed, tweets about masks, R numbers, and herd immunity were joined by a flood of tweets carrying misinformation and conspiracy theories. Gradually, people around the world began tweeting as they waited for the arrival of a vaccine and a return to normal. Taken together, these tweets are a vast historical document, a modern-day Samuel Pepys diary, that chronicles how life has changed during the pandemic. But with millions of tweets to sift through, making sense of them all requires careful archiving.
My colleagues and I have archived them, creating a database of pandemic-related tweets that anyone can access. We hope this collection will help researchers and the public understand what has changed since the early weeks of 2020.
Twitter has already been used regularly as a research tool. A particularly interesting study showed that an increase in the use of words like "pneumonia" on Twitter in January 2020 indicated early warning signs of the spread of COVID-19 in Europe.
In other work, researchers have examined how world leaders turned to Twitter during the pandemic, and others have created datasets to uncover how the public followed their COVID-19 policies. Another dataset from the University of Southern California contains 1.23 million tweets in languages including English, French, Thai, and Indonesian.
Then there is the study of misinformation on Twitter, which has been a major concern since the start of the pandemic. One study found that outright false claims spread faster than tweets containing partially false claims.
Another study found that unverified personal Twitter accounts had the highest rates of COVID-19 misinformation and that tweets with misinformation were more likely to use hashtags like #nCoV2019 than #COVID19.
Misinformation has also given rise to conspiracy theories. Research shows that these include claims that the virus was developed as a biological weapon, that the vaccination program is part of a mass surveillance program, and even that the entire pandemic is a hoax.
These findings have helped social media companies ban persistent offenders, remove tweets containing misinformation, employ more fact-checkers, and add warnings to disputed information.
All of these studies are useful for gauging public opinion and for encouraging platforms to weed out misinformation, but most of their datasets are not publicly accessible, and specialized skills are needed to access and analyze them.
To overcome this obstacle, our team at Birmingham City University has developed the Trust and Communication: Coronavirus Online Visual Dashboard (TRAC: COVID). It is a collection of more than 840 million tweets in English that contain words and hashtags related to the pandemic. It currently covers UK tweets from January 2020 to April 2021 and will be expanded as we get more data.
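For readers curious about what collecting tweets by words and hashtags can involve, here is a minimal sketch in Python. It filters a few made-up tweets against a list of pandemic-related terms and counts the matches per month. The term list, the sample tweets, and the function names are all assumptions made for illustration; this is not the TRAC: COVID pipeline, whose search terms and methods are not described here.

```python
import re
from collections import Counter
from datetime import date

# Hypothetical term list; the dashboard's actual search terms are not given in this article.
PANDEMIC_TERMS = {"covid", "covid19", "coronavirus", "lockdown", "ncov2019"}


def tokens(text):
    """Return the lowercase words and hashtags in a tweet, with any leading '#' removed."""
    return {t.lstrip("#") for t in re.findall(r"#?\w+", text.lower())}


def is_pandemic_related(text):
    """True if the tweet contains at least one term from PANDEMIC_TERMS."""
    return bool(tokens(text) & PANDEMIC_TERMS)


def monthly_counts(tweets):
    """Count matching tweets per (year, month); `tweets` is an iterable of (date, text) pairs."""
    counts = Counter()
    for posted, text in tweets:
        if is_pandemic_related(text):
            counts[(posted.year, posted.month)] += 1
    return counts


# Made-up sample data, for illustration only.
SAMPLE = [
    (date(2020, 1, 31), "Two positive tests confirmed #coronavirus"),
    (date(2020, 3, 23), "Lockdown starts tonight"),
    (date(2020, 3, 24), "Lovely weather for a walk"),
    (date(2020, 3, 25), "Stay home. #COVID19"),
]

if __name__ == "__main__":
    for month, n in sorted(monthly_counts(SAMPLE).items()):
        print(month, n)
```

Matching whole words and hashtags, rather than substrings, avoids counting unrelated words that happen to contain a search term, though a real archive would likely need a much longer and more carefully curated list.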