Academia.eduAcademia.edu

Multi-Tweet Summarization for Flu Outbreak Detection

2012

Abstract

Twitter provides the freshest source of data about what is happening in the lives people across the world. The publicly available streams of status updates available on Twitter have been used to track earthquakes, forest fires and most especially flu outbreaks. Current techniques for tracking flu outbreaks rely on count data for a number of keywords. However, count data alone on the noisy Twitter streams is not reliable enough for health officials to make critical decisions. We propose a semi-automatic outbreak detection system. Rather than providing only alarms backed by count data, we propose a summarization system that will allow health officials to quickly verify outbreak alarms. This will lead to higher levels of trust in the system and allow the system to be used by health organizations around the world. We experimentally verify our summarization system and have found system users to have an accuracy of 0.86 when identifying multitweet summaries.