2013
A new challenge has been added for the Natural Language Processing community: how to analyze the new document forms resulting from Web 2.0? We are interested in a particular kind of information, namely events. Thus, we propose a generic approach to extract and analyze events from text. We propose an event extraction algorithm with a polynomial complexity O(n). This algorithm is based on a semantic map of events that we developed. We validate the first component of our approach by the development of the "EventEC" system.
Extraction and representation of events play an important role in many natural language processing applications, such as question answering systems, named entity recognition, and text summarization. Events are defined as happenings or situations that occur in the real world. Several methods have been defined to annotate events manually. This paper aims to provide a framework that automatically extracts and represents the events that occur in natural language text. Experiments were conducted on the TimeBank corpus, which consists of news articles. Most of the events were extracted by our method, and when compared with other event extraction methods, the results of our method were found to be encouraging.
The Journal of Computer Science and Its Application, 2019
Many text mining techniques have been proposed for mining useful patterns in text documents. However, how to effectively extract and use attributes from unstructured data is still an open research issue. Event attribute extraction is a challenging research area with broad application in data mining and related fields, because decisions often depend on the hidden knowledge and patterns discovered in textual data. One example is crime detection, where events are extracted from an eyewitness report to concisely identify what happened during a crime. In this work, we present our approach to extracting these events based on the dependency parse tree relations of the text and its part of speech (POS). The proposed method uses a machine learning algorithm to predict events from a text. Preliminary results of the experiment, run with the WEKA tool, show that more than 90% of events can be predicted based on the POS and the dependency relations (DepR) of a sentence.
IEEE International Conference on Systems, Man and Cybernetics, 2002
Carthage-Présidence, Tunisie. Authors' names order is unimportant. This paper is the result of genuine collaborative work between both authors.
SSRN Electronic Journal
Data is published on the web over time in great volumes, but the majority of the data is unstructured, making it hard to understand and difficult to interpret. Information Extraction (IE) methods obtain structured information from unstructured data. One of the challenging IE tasks is Event Extraction (EE), which seeks to derive information about specific incidents and their actors from the text. EE is useful in many domains such as building a knowledge base, information retrieval, summarization and online monitoring systems. In the past decades, some event ontologies like ACE, CAMEO and ICEWS were developed to define event forms, actors and dimensions of events observed in the text. These event ontologies still have some shortcomings, such as covering only a few topics like political events, having an inflexible structure in defining argument roles, lacking analytical dimensions, and having insufficient gold-standard data, especially in low-resource languages such as Persian. To address these concerns, we propose an event ontology, namely COfEE, that incorporates expert domain knowledge, previous ontologies and a data-driven approach for identifying events from text. COfEE consists of two hierarchy levels (event types and event sub-types) that include new categories relating to environmental issues, cyberspace, criminal activity and natural disasters, which need to be monitored instantly. Also, dynamic roles according to each event sub-type are defined to capture various dimensions of events. In a follow-up experiment, the proposed ontology is evaluated on Wikipedia events, and it is shown to be general and comprehensive. Moreover, in order to facilitate the preparation of gold-standard data for event extraction, a language-independent online tool is presented based on COfEE. A gold-standard dataset annotated by 10 human experts is also prepared, consisting of 24K news articles in the Persian language annotated according to the COfEE ontology.
In order to diversify the data, news articles from the Wikipedia event portal and the 100 most popular Persian news agencies between the years 2008 and 2021 were collected. Finally, we present a supervised method based on deep learning techniques to automatically extract relevant events and corresponding actors.
Email is a powerful tool for planning and announcing events. Automatic detection of the occurrence (title) and its contextual information (location, temporal information, participants) associated with an email is highly desirable to help users manage and plan important events. A lot of work has been done in the area of event detection, but it has limitations from several perspectives. Firstly, the existing work mainly targets text streams such as news stories, scientific documents and articles, which are somewhat structured documents with sufficient event description, compared with emails, which have structured, semi-structured and unstructured short descriptions in a wide variety of styles. Secondly, the objective in most of the research is to detect new or hot events. Thirdly, much of the existing work aims at reporting events, whereas our objective is to support event planning and management. A further shortcoming is the use of publication time as the temporal information instead of the actual temporal information contained within the text, which is what an event planning and management task requires. We have used Finite State Automata (FSA) to extract phrases revealing the places, temporal information and the actual occurrence. The results are evaluated using different measures. Experiments show that the proposed approach performed well on the email data corpus.
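To make the finite-state idea concrete, here is a minimal sketch of an automaton that scans tokens for short temporal phrases. The abstract does not describe the actual automata, so the token classes, the states, the transition table and the function names below are all illustrative assumptions, and tokenization is plain whitespace splitting.

```python
import re
from typing import List

# Hypothetical token classes; the paper's real lexicon is not given.
DAYS = {"monday", "tuesday", "wednesday", "thursday",
        "friday", "saturday", "sunday"}

def token_class(tok: str) -> str:
    """Map a token to a coarse class that the automaton reads as input."""
    t = tok.lower()
    if t in DAYS:
        return "DAY"
    if re.fullmatch(r"\d{1,2}(:\d{2})?", t):
        return "NUM"
    if t in {"am", "pm"}:
        return "MERIDIEM"
    if t in {"on", "at"}:
        return "PREP"
    return "OTHER"

# Transition table: state -> {token class -> next state} (assumed design).
TRANSITIONS = {
    "START": {"PREP": "PREP"},
    "PREP": {"DAY": "DAY", "NUM": "NUM"},
    "NUM": {"MERIDIEM": "TIME"},
    "DAY": {},
    "TIME": {},
}
ACCEPTING = {"DAY", "TIME"}

def extract_temporal_phrases(text: str) -> List[str]:
    """Run the automaton from each position and keep the longest accepted span."""
    tokens = text.split()
    phrases = []
    i = 0
    while i < len(tokens):
        state, j, last_accept = "START", i, None
        while j < len(tokens):
            nxt = TRANSITIONS[state].get(token_class(tokens[j]))
            if nxt is None:
                break
            state, j = nxt, j + 1
            if state in ACCEPTING:
                last_accept = j
        if last_accept is not None:
            phrases.append(" ".join(tokens[i:last_accept]))
            i = last_accept  # continue scanning after the matched phrase
        else:
            i += 1
    return phrases

print(extract_temporal_phrases("The meeting is on Friday at 3 pm in Room 12"))
```

A sketch like this would need punctuation handling and a much richer lexicon before it could cope with real email text; separate automata of the same shape could in principle be written for places and occurrence titles.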
2017
In this paper we describe the IIT KGP team's participation in the Event Extraction task at FIRE 2017. We have developed an event extraction system which can extract event phrases from tweets written in Indian language scripts as well as Roman script. We designed our system for the Hindi language and then used the same system for the Malayalam and Tamil languages. We submitted two systems: one uses a pipelined architecture and the other uses a non-pipelined architecture. In the pipelined architecture, we first identify the tweets that contain an event and then extract the event phrase from those tweets. In the non-pipelined system, all the tweets are passed directly to the event extraction system. Though conceptually simple, the non-pipelined approach gives better results than the pipelined approach and achieves F1-scores of 50.01, 48.29 and 51.80 on the Hindi, Malayalam and Tamil datasets respectively.
Twenty Third International Flairs Conference, 2010
Event extraction is a significant task in information extraction. This importance increases more and more with the explosion of textual data available on the Web, the appearance of Web 2.0 and the tendency towards the Semantic Web. Thus, we propose a generic approach to extract events from text and to analyze them. We propose an event extraction algorithm with a polynomial complexity O(n^5), and a new similarity measurement between events. We use this measurement to gather similar events. We also present a semantic map of events, and we validate the first component of our approach by the development of the "EventEC" system.
FIRE (Working Notes), 2018
Today communication has become very fast and happens in real time. An event that happens in any part of the world is communicated to the rest of the world within a few seconds or minutes. For example, the recent twin bomb blasts in Damascus, Syria were known to the world within a few minutes. This event was broadcast on various media channels. The penetration of smartphones, tablets, etc. has significantly changed the way people communicate. Information about events or happenings in real time is very valuable to the administration for disaster management, crowd control, and public alerting. This information, when used in the development of recommender systems, adds value for the growth of business enterprises. Thus there is a great need to develop systems that can automatically identify various events such as bomb blasts, floods, cyclones, fires, and political events reported in newswire and social media text. This is the 2nd edition of the track. The first edition was conducted last year at FIRE 2017. In that edition the task was to identify only the event and the event span given in the data. Going further in this track, along with the identification of the event and its span, it is necessary to identify the causes and effects of a given event. Real-time applications will benefit only if the full information related to the event is identified. For example, for a bomb blast, it is necessary to know where it occurred, when it occurred, who and what were affected, what the casualties are, etc. In this edition of the track we propose to provide data annotated with the cause and effect details of an event, and participants are required to identify these details along with event identification. As in the last year, the focus is on Indian language text. This paper presents the overview of the task "Event extraction in Indian languages", a track in FIRE 2018.
The task of this track is to extract events and all other associated arguments or information, such as locations, causes, and effects, from the text. Though event extraction from Indian language texts is gaining attention among the Indian research community, there is no benchmark data available for testing the systems. Hence we have organized this track in the Forum for Information Retrieval Evaluation (FIRE). The paper describes the corpus created for two Indian languages, viz., Hindi and Tamil, and presents an overview of the approaches used by the participants.
2006
Automatic event extraction from full-text resources is a combination of human language technology (HLT) and semantic web technologies. It can also be done on the basis of purely statistical means with minimal linguistic knowledge. This thesis introduces a semi-automated method based on the HLT approach. The method uses an existing information extraction system called ANNIE, A Nearly-New Information Extraction System (developed by Hamish Cunningham, Valentin Tablan, Diana Maynard, Kalina Bontcheva, Marin Dimitrov and others). Further text analysis is supported by WordNet and parsers that help in the automatic extraction of historical events and their relations to objects of human society. Although the method is developed for full-text resources in the field of history, it is anticipated that it can also be applied to e-resources in other fields for automatic extraction of historical events. The subject of history is known for its chronological record of true events, leading from the past to the present and even into the future. When used as the name of a field of study, history refers to the study and interpretation of the record of human societies. Historical event extraction therefore involves the identification of past events and their semantic relations to human society.
ArXiv, 2021
Data is published on the web over time in great volumes, but the majority of the data is unstructured, making it hard to understand and difficult to interpret. Information Extraction (IE) methods extract structured information from unstructured data. One of the challenging IE tasks is Event Extraction (EE), which seeks to derive information about specific incidents and their actors from the text. EE is useful in many domains such as building a knowledge base, information retrieval, summarization and online monitoring systems. In the past decades, some event ontologies like ACE, CAMEO and ICEWS were developed to define event forms, actors and dimensions of events observed in the text. These event ontologies still have some shortcomings, such as covering only a few topics like political events, having an inflexible structure in defining argument roles, lacking analytical dimensions, and complexity in choosing event sub-types. To address these concerns, we propose an event ontology, namely COfE...
2014
A vast amount of electronic information is available in the form of documents such as papers, emails, reports, HTML pages, etc. Sifting through such documents can yield essential information. An automated tool for identifying and extracting this kind of information would be of great use. This paper presents an automated approach for identifying a set of event patterns, called intelligent information, from natural language text. Keywords: automation, data mining, events, information extraction, natural language.
Event extraction is a popular and interesting research field in the area of Natural Language Processing (NLP). In this paper, we propose a hybrid approach for event extraction within the TimeML framework. Initially, we develop a machine learning based system based on Conditional Random Field (CRF). But most of the deverbal event nouns are not correctly identified by this machine learning approach. From this observation, we came up with a hybrid approach where we introduce several strategies in conjunction with machine learning. These strategies are based on semantic role-labeling, WordNet and handcrafted rules. Evaluation results on the TempEval-2010 datasets yield the precision, recall and F-measure values of approximately 93.00%, 96.00% and 94.47%, respectively. This is approximately 12% higher F-measure in comparison with the best performing system of SemEval-2010.
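As a quick consistency check on the figures quoted above, the F-measure can be recomputed from precision and recall. This is a minimal sketch assuming the balanced F1 definition (the harmonic mean); the abstract does not state which weighting was used, and the function name is illustrative.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall (F1 when beta == 1)."""
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Rounded values quoted in the abstract, in percent.
f1 = f_measure(93.00, 96.00)
print(round(f1, 2))
```

With these rounded inputs the result is about 94.48, close to the reported 94.47%. A gap of that size is what one would expect if the quoted precision and recall are themselves approximations.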
We present EEQuest, an application that extracts events from text using natural language processing (NLP) and supervised machine-learning techniques, and provides a system to query events extracted from a text corpus. We provide a use case for the application wherein we extract business-related events from news articles. The extracted events are then categorized based on the business organization/company that they are related to. Finally, the events are added to a knowledge base on which a query system is built. The system can be used to display events related to a particular organization or a group of organizations. Although we are using the system to extract business-related events, the event extraction mechanism can be used in a more general sense with any available textual data, to extract any kind of events that have a structure that can answer the question: Who did what, when and where?
This paper presents a novel contribution of this research: an automated NLP pipeline for semantic event extraction and annotation (EveSem). The output of the pipeline is a set of XML-annotated semantic events. Temporal interpretation of events is incorporated by using the linguistic elements made available by the tools in the pipeline. A preliminary evaluation showed that EveSem performed as well as TIPSem in extracting verbal events, with a precision of 85.42 and a recall of 89.13. This work can contribute towards automated annotation of semantic event corpora and event timeline construction as future research.
2013
In this paper we describe a semi-automatic approach to generating event extraction patterns for free texts. The algorithm is composed of four steps: we automatically extract possible events from a corpus of free documents, cluster them using dependency-based parse tree paths, validate random samples from each cluster and generate linear patterns using positive event clusters. We compare our algorithm with the system that uses manually created patterns.
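The clustering step described above can be illustrated with a minimal sketch that groups candidate events by an identical dependency path. This is a simplification under stated assumptions: the abstract does not say whether clusters are formed by exact match or by a similarity measure, and the function name, the path labels and the example sentences are all made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

def cluster_by_path(
    candidates: Sequence[Tuple[str, Sequence[str]]]
) -> Dict[Tuple[str, ...], List[str]]:
    """Group candidate event texts whose dependency paths are identical.

    Each candidate is (text, path), where path is a sequence of
    dependency labels produced by some parser (not shown here).
    """
    clusters: Dict[Tuple[str, ...], List[str]] = defaultdict(list)
    for text, path in candidates:
        clusters[tuple(path)].append(text)
    return dict(clusters)

# Hypothetical candidates with hand-written dependency paths.
candidates = [
    ("Acme acquired Beta", ["nsubj", "ROOT", "dobj"]),
    ("Gamma hired Delta", ["nsubj", "ROOT", "dobj"]),
    ("Shares fell", ["nsubj", "ROOT"]),
]
for path, texts in cluster_by_path(candidates).items():
    print(path, texts)
```

In the workflow the abstract outlines, random samples from each such cluster would then be validated by a person, and linear patterns would be generated only from the clusters judged positive.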
2019
Recently, with advances in technology, automated techniques for the extraction of event information have gained significantly more importance, and event extraction stands as one of the most desirable tasks in social text stream processing. Among social text streams, email is one of the most broadly used channels for official announcements. Moreover, emails have a very complex, diverse and unbounded layout across all kinds of text structures. In this paper, a novel technique is proposed for event extraction from email text, where the term "event" denotes an occurrence or happening with specific attributes, such as a particular location, date and time, and one or more actors and participants. Existing work on event detection represents these attributes only partially. Most of it uses publication time as the temporal information instead of the actual temporal information contained within the text. In this work, NLP techniques along with handwritten rules, word semantic tools like WordNet, and gazetteer lists are employed to counter various issues in running text; this includes requirements such as the grammatical structure of the sentence being correct in order to reveal the boundary of the accurate phrase. The proposed methodology is evaluated in detail with metrics such as precision, recall, and F1-measure. We are hopeful that researchers and professionals all around the world will employ the proposed method for event extraction.
2014
In this paper we present a rule-based method of event extraction from natural language. We use the Stanford dependency parser in order to build a relation graph of elements from input text. This structure, along with serialized extraction frames, is converted into a set of facts. We describe the process of creating and applying rules, which aims to match elements from the text with corresponding slots in the extraction frames. A possible match is derived by the comparison of verbal phrases from the text with lexicalizations of anchors (constituting the most important part of each frame) stored in an ontology. The rest of the extraction frame is filled with other elements of the dependency graph, with regard to their semantic type (determined by lexicalizations of allowed types defined in frames and ontology) and their grammatical properties. We describe conversions required to create a consistent knowledge base of text phrases, classification of semantic types and instantiated ...
International Journal of Recent Contributions from Engineering, Science & IT (iJES), 2019
Given the wide range of information needs, retrieving events from natural language text is essential. From a natural language processing (NLP) perspective, "events" are situations, occurrences, real-world entities or facts. Extracting events and arranging them on a timeline is helpful in various NLP applications, such as building summaries of news articles, processing health records, and Question Answering (QA) systems. This paper presents a framework for identifying the events and times in a given document and representing them using a graph data structure. As a result, a graph is derived to show event-time relationships in the given text. Events form the nodes in the graph, and edges represent the temporal relations among the nodes. The time of an event occurrence exists in two forms, namely qualitative (like before, after, during, etc.) and quantitative (exact time points/periods). To build the event-time-event structure quantitative time is normalized to qualitative...
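The graph representation described above can be sketched as follows: events are nodes, and each pair of events is connected by an edge labeled with a qualitative relation derived from their quantitative times. The abstract is truncated before it explains the normalization, so the three-way before/after/simultaneous mapping, the function names and the example events and dates are assumptions made purely for illustration.

```python
from datetime import date
from itertools import combinations
from typing import Dict, Tuple

def qualitative_relation(t1: date, t2: date) -> str:
    """Turn two exact dates into a qualitative temporal relation (assumed mapping)."""
    if t1 < t2:
        return "before"
    if t1 > t2:
        return "after"
    return "simultaneous"

def build_event_graph(events: Dict[str, date]) -> Dict[Tuple[str, str], str]:
    """Return labeled edges (event_a, event_b) -> relation of a to b."""
    edges = {}
    for (a, ta), (b, tb) in combinations(events.items(), 2):
        edges[(a, b)] = qualitative_relation(ta, tb)
    return edges

# Made-up events and dates, for illustration only.
events = {
    "earthquake": date(2019, 3, 1),
    "evacuation": date(2019, 3, 2),
    "press_briefing": date(2019, 3, 2),
}
graph = build_event_graph(events)
for (a, b), relation in graph.items():
    print(a, relation, b)
```

Time periods, as opposed to single dates, would need interval relations such as "during" or "overlaps", which this sketch does not attempt.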
2011
In this paper, we report on how historical events are extracted from text within the Semantics of History research project. The project aims at the creation of resources for a historical information retrieval system that can handle the time-based dynamics and varying perspectives of Dutch historical archives. The historical event extraction module will be used for museum collections, allowing users to search for exhibits related to particular historical events or actors within time periods and geographic areas, extracted from accompanying text. We present here the methodology and tools used for historical event extraction, along with the first evaluation results.