Georgiana Marsic

Temporal Processing of News: Annotation of Temporal Expressions, Verbal Events and Temporal Relations

The ability to capture the temporal dimension of a natural language text is essential to many natural language processing applications, such as Question Answering, Automatic Summarisation, and Information Retrieval. Temporal processing is a field of Computational Linguistics which aims to access this dimension and derive a precise temporal representation of a natural language text by extracting time expressions, events and temporal relations, and then representing them according to a chosen knowledge framework. This thesis focuses on the investigation and understanding of the different ways time is expressed in natural language, on the implementation of a temporal processing system in accordance with the results of this investigation, on the evaluation of the system, and on the extensive analysis of the errors and challenges that appear during system development. The ultimate goal of this research is to develop the ability to automatically annotate temporal expressions, verbal events and temporal relations in a natural language text. Temporal expression annotation involves two stages: temporal expression identification concerned with determining the textual extent of a temporal expression, and temporal expression normalisation which finds the value that the temporal expression designates and represents it using an annotation standard. The research presented in this thesis approaches these tasks with a knowledge-based methodology that tackles temporal expressions according to their semantic classification. Several knowledge sources and normalisation models are experimented with to allow an analysis of their impact on system performance.
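The two stages named above, identification and normalisation, can be illustrated with a minimal Python sketch. It is not the system described in the thesis: the two pattern classes (absolute dates and deictic day words), the function name `annotate`, and the choice of ISO 8601 strings as the normalised value are all assumptions made for illustration.

```python
import re
from datetime import date, timedelta

# Hypothetical knowledge source: month names mapped to month numbers.
MONTHS = {name: number for number, name in enumerate(
    ["january", "february", "march", "april", "may", "june", "july",
     "august", "september", "october", "november", "december"], start=1)}

# One illustrative pattern per semantic class of temporal expression.
ABSOLUTE = re.compile(
    r"\b(\d{1,2}) (" + "|".join(MONTHS) + r") (\d{4})\b", re.IGNORECASE)
DEICTIC = re.compile(r"\b(yesterday|today|tomorrow)\b", re.IGNORECASE)
DAY_OFFSETS = {"yesterday": -1, "today": 0, "tomorrow": 1}


def annotate(text, reference_date):
    """Return (start, end, surface, iso_value) tuples in text order.

    Identification: the regex match gives the textual extent.
    Normalisation: the extent is mapped to an ISO 8601 date string,
    using reference_date to resolve deictic expressions.
    """
    found = []
    for match in ABSOLUTE.finditer(text):
        day = int(match.group(1))
        month = MONTHS[match.group(2).lower()]
        year = int(match.group(3))
        try:
            value = date(year, month, day)
        except ValueError:
            continue  # skip impossible dates such as "31 February 2009"
        found.append((match.start(), match.end(), match.group(0),
                      value.isoformat()))
    for match in DEICTIC.finditer(text):
        offset = DAY_OFFSETS[match.group(1).lower()]
        value = reference_date + timedelta(days=offset)
        found.append((match.start(), match.end(), match.group(0),
                      value.isoformat()))
    return sorted(found)


if __name__ == "__main__":
    sentence = "The talks ended on 3 March 2009 and resume tomorrow."
    for item in annotate(sentence, date(2009, 3, 10)):
        print(item)
```

Keeping one pattern per semantic class mirrors the idea of handling temporal expressions according to their classification: each class can carry its own normalisation rule, and only deictic expressions need the reference date.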
The annotation of events expressed using either finite or non-finite verbs is addressed with a method that overcomes the drawback of existing methods which associate an event with the class that is most frequently assigned to it in a corpus and are limited in coverage by the small number of events present in the corpus. This limitation is overcome in this research by annotating each WordNet verb with an event class that best characterises that verb. This thesis also describes an original methodology for the identification of temporal relations that hold among events and temporal expressions. The method relies on sentence-level syntactic trees and a propagation of temporal relations between syntactic constituents, by analysing syntactic and lexical properties of the constituents and of the relations between them. The detailed evaluation and error analysis of the methods proposed for solving different temporal processing tasks form an important part of this research. Various corpora widely used by researchers studying different temporal phenomena are employed in the evaluation, thus enabling comparison with the state of the art in the field. The detailed error analysis targeting each temporal processing task helps identify not only problems of the implemented methods, but also reliability problems of the annotated resources, and encourages potential reexaminations of some temporal processing tasks.

The completion of my doctoral studies has been the most significant academic challenge I was ever confronted with. It has been a long journey whose course was sometimes deterred by life getting in the way, but to whose successful completion many people have contributed directly or indirectly, and I would like to take this opportunity to thank them. First of all, I would like to thank my supervisory team, Ruslan Mitkov, John Prager and Constantin Orăsan, for their trust, encouragement, patience and guidance.
I am thankful to Ruslan Mitkov, my director of studies, for making this thesis possible by providing the necessary infrastructure and resources to accomplish my research work, and for acting as my supervisor despite his many other academic and professional commitments. I would like to thank John Prager for finding the time to read my thesis and to provide insightful and creative comments. I am extremely indebted to Constantin Orăsan, my supervisor, colleague and friend, for always showing a sincere interest in my work, for his constructive criticism, for the extensive discussions concerning my work, and for all the help he has given me throughout my years in Wolverhampton. I would like to express my special gratitude and appreciation to my former research advisor Dan Cristea, who introduced me to the world of Natural Language Processing. I still think fondly of the time I spent working with him as a postgraduate student. I am privileged for having had Verginica Barbu Mititelu, Iustin Dornescu,

