OMG U got flu? Analysis of shared health messages for bio-surveillance

Type: Article

Publication Date: 2011-01-01

Citations: 116

DOI: https://doi.org/10.1186/2041-1480-2-s5-s9

Abstract

Micro-blogging services such as Twitter offer the potential to crowdsource epidemics in real time. However, Twitter posts ('tweets') are often ambiguous and reactive to media trends. To ground user messages in epidemic response, we focused on tracking reports of self-protective behaviour, such as avoiding public gatherings or increased sanitation, as the basis for further risk analysis. We created guidelines for tagging self-protective behaviour based on the behaviour response survey of Jones and Salathé (2009). Applying the guidelines to a corpus of 5283 Twitter messages related to influenza-like illness showed a high level of inter-annotator agreement (kappa 0.86). We employed supervised learning using unigrams, bigrams and regular expressions as features with two supervised classifiers (SVM and Naive Bayes) to classify tweets into 4 self-reported protective behaviour categories plus a self-reported diagnosis. In addition to classification performance, we report moderately strong Spearman's Rho correlation by comparing classifier output against WHO/NREVSS laboratory data for A(H1N1) in the USA during the 2009-2010 influenza season. The study adds to evidence supporting a high degree of correlation between pre-diagnostic social media signals and diagnostic influenza case data, pointing the way towards low-cost sensor networks. We believe that the signals we have modelled may be applicable to a wide range of diseases.
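The abstract reports two statistics: inter-annotator agreement (kappa) and Spearman's Rho between classifier output and laboratory data. The sketch below shows how such figures could be computed in plain Python. It assumes Cohen's kappa for two annotators (the abstract does not say which kappa variant was used), and every label and count in it is invented toy data, not taken from the study. The function names `cohen_kappa`, `average_ranks` and `spearman_rho` are hypothetical helpers, not the authors' code.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's own label distribution.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def average_ranks(values):
    """Rank values from 1 to n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length, at least 2 points")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rho is undefined for a constant series")
    return cov / (var_x * var_y) ** 0.5


# Toy data (invented): two annotators tagging six tweets.
annotator_1 = ["avoid", "sanitise", "none", "avoid", "none", "none"]
annotator_2 = ["avoid", "sanitise", "none", "none", "none", "none"]
print(round(cohen_kappa(annotator_1, annotator_2), 3))  # 0.714

# Toy data (invented): weekly classifier-positive tweets vs. lab-confirmed cases.
weekly_tweets = [3, 8, 15, 12, 6]
weekly_cases = [10, 40, 90, 95, 30]
print(round(spearman_rho(weekly_tweets, weekly_cases), 3))  # 0.9
```

Because Spearman's Rho works on ranks rather than raw values, it rewards the two weekly series rising and falling together even when tweet volume and case counts sit on very different scales.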

Locations

  • Journal of Biomedical Semantics
  • PubMed Central
  • arXiv (Cornell University)
  • CiteSeer X (The Pennsylvania State University)
  • Europe PMC (PubMed Central)
  • PubMed
  • DataCite API

Similar Works

  • Mining of health and disease events on Twitter: validating search protocols within the setting of Indonesia (2016). Aditya Lia Ramadona, Rendra Agusta, Sulistyawati Sulistyawati, Lutfan Lazuardi, Anwar Dwi Cahyono, Åsa Holmner, Fatwa Sari Tetra Dewi, Hari Kusnanto, Joacim Rocklöv
  • On the ground validation of online diagnosis with Twitter and medical records (2014). Todd Bodnar, Victoria C. Barclay, Nilàm Ram, Conrad S. Tucker, Marcel Salathé
  • Enhancing Twitter Data Analysis with Simple Semantic Filtering: Example in Tracking Influenza-Like Illnesses (2012). Son Doan, Lucila Ohno‐Machado, Nigel Collier
  • Detecting Influenza Epidemics on Twitter (2021). Katerina Katsani-Geronymaki, Polyvios Pratikakis
  • Syndromic classification of Twitter messages (2011). Nigel Collier, Son Doan
  • Mining and Validating Social Media Data for COVID-19–Related Human Behaviors Between January and July 2020: Infodemiology Study (2021). Ashlynn R. Daughton, Courtney D. Shelley, Martha Barnard, Dax Gerts, Chrysm Watson Ross, Isabel Crooker, Gopal Nadiga, Nilesh Mukundan, Nidia Yadria Vaquera Chavez, Nidhi Parikh
  • Epidemic Intelligence for the Crowd, by the Crowd (Full Version) (2012). Ernesto Diaz-Aviles, Avaré Stewart, Edward Velasco, Kerstin Denecke, Wolfgang Nejdl
  • Forecasting and Prevention Mechanisms Using Social Media in Health Care (2020). Paraskevas Koukaras, Dimitrios Rousidis, Christos Tjortjis
  • Analyzing COVID19 Tweets Using Health Behaviours Theories and Classification Models (2021). Boma Graham Kalio
  • Early Detection and Control of the Next Epidemic Wave Using Health Communications: Development of an Artificial Intelligence-Based Tool and Its Validation on COVID-19 Data from the US (2022). Teddy Lazebnik, Svetlana Bunimovich‐Mendrazitsky, Shai Ashkenazi, Eugene Levner, Arriel Benis
  • Detecting Early Warning Indicators of Covid-19 Pandemic in the Context of United States: An Exploratory Data Analysis (2022). Md Morshed Jaman Adnan, Knut Hinkelmann, Emanuele Laurenzi
  • The Healthy States of America: Creating a Health Taxonomy with Social Media (2021). Sanja Šćepanović, Luca Maria Aiello, Ke Zhou, Sagar Joglekar, Daniele Quercia
  • Identifying Protective Health Behaviors on Twitter: Observational Study of Travel Advisories and Zika Virus (Preprint) (2018). Ashlynn R. Daughton, Michael J. Paul
  • The early bird catches the term: combining twitter and news data for event detection and situational awareness (2016). Nicholas Thapen, Donal Simmie, Chris Hankin
  • Catching Zika Fever: Application of Crowdsourcing and Machine Learning for Tracking Health Misinformation on Twitter (2017). Amira Ghenai, Yelena Mejova
  • Case Study on Detecting COVID-19 Health-Related Misinformation in Social Media (2021). Mir Mehedi Ahsan Pritom, Rosana Montañez Rodriguez, Asad Ali Khan, Sebastian A. Nugroho, Esra'a Alrashydah, Beatrice N. Ruiz, Anthony Rios

Works That Cite This (17)

  • Modeling the impact of lifestyle on health at scale (2013). Adam Sadilek, Henry Kautz
  • An Analytics Framework to Support Surge Capacity Planning for Emerging Epidemics (2016). Martina Curran, Enda Howley, Jim Duggan
  • SimNest: Social Media Nested Epidemic Simulation via Online Semi-Supervised Deep Learning (2015). Liang Zhao, Jiangzhuo Chen, Feng Chen, Wei Wang, Chang‐Tien Lu, Naren Ramakrishnan
  • Online flu epidemiological deep modeling on disease contact network (2019). Liang Zhao, Jiangzhuo Chen, Feng Chen, Fang Jin, Wei Wang, Chang‐Tien Lu, Naren Ramakrishnan
  • Using Natural Language Processing to Extract Health-Related Causality from Twitter Messages (2018). Son Doan, Elly W. Yang, Sameer Tilak, Manabu Torii
  • Disease surveillance based on Internet-based linear models: an Australian case study of previously unmodeled infection diseases (2016). Florian Rohart, Gabriel Milinovich, Simon M R Avril, Kim‐Anh Lê Cao, Shilu Tong, Wenbiao Hu
  • Syndromic classification of Twitter messages (2011). Nigel Collier, Son Doan
  • Role of Participatory Health Informatics in Detecting and Managing Pandemics: Literature Review (2021). Elia Gabarrón, Octavio Rivera-Romero, Talya Miron‐Shatz, Rebecca Grainger, Kerstin Denecke
  • Feature Studies to Inform the Classification of Depressive Symptoms from Twitter Data for Population Health (2017). Danielle L. Mowery, Craig J. Bryan, Mike Conway
  • Identifying Purpose Behind Electoral Tweets (2013). Saif M. Mohammad, Svetlana Kiritchenko, Joel Martin