Catching the Long-Tail: Extracting Local News Events from Twitter

Saurabh Sharma

Proceedings of the International AAAI Conference on Web and Social Media

Abstract

Twitter, used in 200 countries with over 250 million tweets a day, is a rich source of local news from around the world. Many events of local importance are first reported on Twitter, including many that never reach news channels. Further, there are often only a few tweets reporting each such event, in contrast with the larger volumes that follow events of wider significance. Even though such events may be primarily of local importance, they can also be of critical interest to some specific but possibly far-flung entities: for example, a fire in a supplier’s factory half-way around the world may be of interest even from afar. In this paper we describe how this ‘long tail’ of events can be detected in spite of their sparsity. We then extract and correlate information from multiple tweets describing the same event. Our generic architecture for converting a tweet-stream into event-objects uses locality sensitive hashing, classification, boosting, information extraction and clustering. Our results, based ...
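As an illustration of the locality sensitive hashing step named above, the following is a minimal sketch of how near-duplicate tweets could be grouped with MinHash signatures and banding. It is not taken from the paper: the function names (`shingles`, `minhash_signature`, `lsh_buckets`, `candidate_pairs`), the use of word 3-grams, and the parameters `num_hashes` and `bands` are all assumptions chosen for illustration.

```python
import hashlib
from collections import defaultdict


def shingles(text, k=3):
    """Return the set of word k-grams of a tweet (k=3 is an assumed choice)."""
    words = text.lower().split()
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def _hash(seed, item):
    """Seeded 64-bit hash built from BLAKE2b (stands in for a hash family)."""
    digest = hashlib.blake2b(f"{seed}:{item}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items, num_hashes=32):
    """For each seeded hash function, keep the minimum value over the set."""
    return [min(_hash(seed, s) for s in items) for seed in range(num_hashes)]


def lsh_buckets(texts, num_hashes=32, bands=16):
    """Split each signature into bands; texts sharing any band share a bucket."""
    rows = num_hashes // bands
    buckets = defaultdict(set)
    for idx, text in enumerate(texts):
        sig = minhash_signature(shingles(text), num_hashes)
        for b in range(bands):
            key = (b, tuple(sig[b * rows:(b + 1) * rows]))
            buckets[key].add(idx)
    return buckets


def candidate_pairs(texts, num_hashes=32, bands=16):
    """Return index pairs of texts that collide in at least one band."""
    pairs = set()
    for members in lsh_buckets(texts, num_hashes, bands).values():
        ordered = sorted(members)
        for i in range(len(ordered)):
            for j in range(i + 1, len(ordered)):
                pairs.add((ordered[i], ordered[j]))
    return pairs
```

Only the candidate pairs need a full similarity comparison, which is what makes this kind of hashing attractive for a high-volume stream in which each local event is reported by just a few tweets. More bands with fewer rows each make collisions more likely and so favor recall over precision.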

Driven by the recent proliferation of social media channels and the immense amount of user-generated content, interest in social media mining is growing. Messages continuously posted via these channels cover a broad range of topics, from daily life to global and local events. This has opened new opportunities for mining event information that is crucial in many application domains, especially for increasing situational awareness in critical scenarios. Interestingly, many of these messages are enriched with location information, owing to the widespread use of mobile devices and recent advances in location acquisition techniques. This enables location-aware event mining, i.e., the detection and tracking of localized events. In this thesis, we propose novel frameworks and models that digest social media content for localized event detection, tracking, and recommendation. We first develop KeyPicker, a framework to extract and score event-related keywords in an online fashion, accounting for high levels of noise, temporal heterogeneity and outliers in the data. Then, LocEvent is proposed to incrementally detect and track events using a 4-stage procedure. That is, LocEvent receives the keywords extracted by KeyPicker, identifies local keywords, spatially clusters them, and finally scores the generated clusters. For each detected event, a set of descriptive keywords, a location, and a time interval are estimated at a fine-grained resolution. In addition to the sparsity of geo-tagged messages, people sometimes post about events far away from an event's location. Such spatial problems are handled by novel spatial regularization techniques, namely, graph- and gazetteer-based regularization. To ensure scalability, we utilize a hierarchical spatial index in addition to a multi-stage filtering procedure that gradually suppresses noisy words and considers only event-related ones for complex spatial computations.
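The 4-stage procedure described above (receive keywords, identify local keywords, cluster them spatially, score the clusters) can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the LocEvent implementation: the fixed-size grid in `cell_of` stands in for the hierarchical spatial index, spatial entropy is an assumed criterion for "local" keywords, and the names `detect_local_events`, `max_entropy`, and `min_count` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def cell_of(lat, lon, size=0.1):
    """Map a coordinate to a grid cell of `size` degrees (assumed stand-in
    for a hierarchical spatial index)."""
    return (math.floor(lat / size), math.floor(lon / size))


def spatial_entropy(cells):
    """Shannon entropy of a keyword's distribution over grid cells.
    Low entropy means the keyword is concentrated in few cells."""
    counts = Counter(cells)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def detect_local_events(observations, max_entropy=1.0, min_count=2):
    """observations: iterable of (keyword, lat, lon) tuples.
    Returns (score, cell, keywords) tuples, highest score first."""
    # Stage 1: receive keywords and record where each one was observed.
    by_keyword = defaultdict(list)
    for keyword, lat, lon in observations:
        by_keyword[keyword].append(cell_of(lat, lon))

    # Stage 2: keep only keywords whose occurrences are spatially concentrated.
    local = {k: cells for k, cells in by_keyword.items()
             if spatial_entropy(cells) <= max_entropy}

    # Stage 3: cluster local keywords by the grid cell they fall into.
    clusters = defaultdict(Counter)
    for keyword, cells in local.items():
        for cell in cells:
            clusters[cell][keyword] += 1

    # Stage 4: score each cluster by its number of keyword occurrences.
    events = []
    for cell, keywords in clusters.items():
        score = sum(keywords.values())
        if score >= min_count:
            events.append((score, cell, sorted(keywords)))
    return sorted(events, reverse=True)
```

A keyword that appears evenly across many distant cells has high entropy and is dropped in stage 2, so only spatially concentrated keywords reach the clustering step. This mirrors, in a very simplified form, the idea of filtering noisy words before any costly spatial computation.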
Undertaking this PhD has been a truly life-changing experience for me. This work would not have been possible without the support that I received from many people. First of all, I would like to express my deepest gratitude to my advisor, Prof. Dr. Michael Gertz, for giving me the opportunity to be one of his PhD students; it is truly an honor. I am really thankful for his guidance, endless support, immense knowledge, and deep insights, which helped me at various stages of my research. I remain amazed that, despite his busy schedule, he was able to go through the final draft of my thesis and meet me regularly with comments and suggestions on almost every page. I would also like to thank my dissertation committee for their time, effort, and precious feedback. During my PhD study, and in spite of my busy days, I had a memorable time at the Database Systems Research Group, Heidelberg University. I was lucky to have an impressive research environment with great colleagues. Thank you, Dr. Ayser Armiti, for supporting my first step to join this wonderful group. Many thanks to
