2005, HAL (Le Centre pour la Communication Scientifique Directe)
…
6 pages
In this paper we present a method for accurate and precise recognition of personal names, implemented for Serbian. It is based on the development of comprehensive e-dictionaries of Serbian personal names, as well as of foreign personal names transcribed into Serbian. To obtain high precision, a set of finite-state automata (FSA) was developed to model various constraints. The same automata are also used to extract from a text personal names not yet covered by the e-dictionaries.
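The combination of dictionary lookup and contextual constraints described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the tiny `FIRST_NAMES` and `SURNAMES` sets, the function `find_person_names`, and the single rule (a known first name followed by a capitalized word) are all assumptions made for illustration, and the hand-written loop only stands in for real finite-state automata. Serbian inflection, which real e-dictionaries must handle, is ignored here.

```python
import re

# Hypothetical toy dictionaries; real e-dictionaries are far larger
# and also cover inflected forms.
FIRST_NAMES = {"Marko", "Ana", "Jovan"}
SURNAMES = {"Petrović", "Jovanović"}

TOKEN = re.compile(r"\w+")  # \w matches Unicode letters for str patterns


def find_person_names(text):
    """Return (name, status) pairs found in text.

    status is "known" when the surname is in the dictionary and
    "candidate" when only the context (known first name followed by
    a capitalized word) suggests a personal name.
    """
    tokens = TOKEN.findall(text)
    results = []
    i = 0
    while i < len(tokens) - 1:
        first, second = tokens[i], tokens[i + 1]
        if first in FIRST_NAMES and second[0].isupper():
            status = "known" if second in SURNAMES else "candidate"
            results.append((f"{first} {second}", status))
            i += 2  # skip both tokens of the matched name
        else:
            i += 1
    return results


if __name__ == "__main__":
    sentence = "Marko Petrović i Ana Nikolić stigli su u Beograd."
    for name, status in find_person_names(sentence):
        print(name, status)
```

In this sketch the "candidate" status plays the role of names extracted by context alone, which is how a rule set could surface names missing from the dictionaries.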
2011
In this paper we present a system for named entity recognition and tagging in Serbian that relies on large-scale lexical resources and finite-state transducers. Our system recognizes several types of names, as well as temporal and numerical expressions. Finite-state automata are used to describe the context of named entities, thus improving the precision of recognition. The widest context was used for personal names, and it included the recognition of nominal phrases describing a person's position. For the evaluation of the named entity recognition system we used a corpus of 2,300 short agency news items. Through manual evaluation we precisely identified all omissions and incorrect recognitions, which enabled the computation of recall and precision. The overall recall of R = 0.84 for types and R = 0.93 for tokens, and the overall precision of P = 0.95 for types and P = 0.98 for tokens, show that our system gives priority to precision.
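The distinction between type-level and token-level scores reported in this abstract can be made concrete with a small sketch. It assumes, as an interpretation not confirmed by the abstract, that "tokens" are individual occurrences of an entity and "types" are distinct entity strings. The function name `precision_recall` and the toy data are hypothetical, and the numbers it produces are unrelated to the paper's results.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Compute (precision, recall) at token level and at type level.

    gold and predicted are lists of entity strings, one entry per
    occurrence. Token level counts every occurrence; type level
    counts each distinct string once (an assumed interpretation).
    """
    gold_counts, pred_counts = Counter(gold), Counter(predicted)

    # Token level: multiset intersection gives matched occurrences.
    tp_tokens = sum((gold_counts & pred_counts).values())
    p_tok = tp_tokens / len(predicted) if predicted else 0.0
    r_tok = tp_tokens / len(gold) if gold else 0.0

    # Type level: set intersection gives matched distinct strings.
    gold_types, pred_types = set(gold), set(predicted)
    tp_types = len(gold_types & pred_types)
    p_typ = tp_types / len(pred_types) if pred_types else 0.0
    r_typ = tp_types / len(gold_types) if gold_types else 0.0

    return {"tokens": (p_tok, r_tok), "types": (p_typ, r_typ)}


if __name__ == "__main__":
    # Invented toy data: a frequent entity is found every time,
    # a rare one is missed, and one prediction is spurious.
    gold = ["Ana", "Ana", "Ana", "Marko"]
    predicted = ["Ana", "Ana", "Ana", "Beograd"]
    print(precision_recall(gold, predicted))
```

Because frequent entities dominate the token counts, token-level scores tend to be higher than type-level scores whenever the frequent entities are the ones recognized correctly, which matches the direction of the figures quoted above.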
Proceedings - Natural Language Processing in a Deep Learning World, 2019
In this paper we present a rule- and lexicon-based system for the recognition of Named Entities (NE) in Serbian newspaper texts that was used to prepare a gold standard annotated with personal names. It was further used to prepare training sets for four different levels of annotation, which were in turn used to train two Named Entity Recognition (NER) systems: Stanford and spaCy. All obtained models, together with the rule- and lexicon-based system, were evaluated on two sample texts: a part of the gold standard and an independent newspaper text of approximately the same size. The results show that the rule- and lexicon-based system outperforms the trained models in all four scenarios (measured by F1), while the Stanford models have the highest recall. The produced models are incorporated into a Web platform, NER&Beyond, that provides various NE-related functions.
Theory and Applications of Natural Language Processing, 2012
Lecture Notes in Computer Science, 2002
This article describes a finite-state cascade for the extraction of person names from French texts. We extract these proper names in order to categorize and cluster texts with them. After finite-state pre-processing (division of the text into sentences, tagging with dictionaries, etc.), a series of finite-state transducers is applied to the text one after the other, locating left and right contexts that indicate the presence of a person name. An evaluation of the results of this extraction is presented.
Springer eBooks, 2013
In the paper we present a customizable and open-source framework for proper name recognition called Liner2. The framework consists of several universal methods for sequence chunking, which include dictionary look-up, pattern matching and statistical processing. The statistical processing is performed using Conditional Random Fields and a rich set of features including morphological, lexical and semantic information. We present an application of the framework to the task of recognizing proper names in Polish texts (5 common categories of proper names, i.e. first names, surnames, city names, road names and country names). The Liner2 framework was also used to train an extended model to recognize 56 categories of proper names, which was used to bootstrap the manual annotation of the KPWr corpus. We also present the CRF-based model integrated with a heterogeneous named entity similarity function. We show that the similarity function added to the best configuration improved the final result for cross-domain evaluation. The last section presents NER-WS, a web service for proper name recognition in Polish texts utilizing the Liner2 framework and the model for 56 categories of proper names. The web service can be tested using a web-based demo available at http://nlp.pwr.wroc.pl/inforex/.
2007
The paper is dedicated to the problem of automatic cross-language transcription of proper names. The correct written transcription of foreign proper names is a serious communication problem. It is especially important for legal translation of documents, data retrieval, postal processing and, in general, all fields where the accurate identification of places, persons and organizations is required. In order to formalize the process of transcription and reduce the number of errors, an automatic rule-based transcription system has been developed. The system transcribes proper names between more than 20 languages, including non-European ones. The phonetic approach allows easy integration of new languages into the system. The results of a long-term collaboration of linguists and programmers have been generalized in the monograph.
students.info.uaic.ro
This paper presents a Named Entity Recognition system for Romanian, created using linguistic grammar-based techniques and a set of resources. Our system's architecture is based on two modules, the named entity identification and the named entity classification module. After the named entity candidates are marked for each input text, each candidate is classified into one of the considered categories, such as Person, Organization, Place, Country, etc. The system's Upper Bound and its performance in real context are evaluated for each of the two modules (identification and classification) and for each named entity type.
European Journal of Electrical Engineering and Computer Science
Named Entity Recognition (NER) is a computational linguistic concept that is used to find and classify proper nouns in a text, such as person names, geographical locations, and organizations. This concept is fundamental in the field of natural language processing. In Libya, many private and public institutions struggle with the proper translation of entity names from Arabic into English. Therefore, in this paper, we are concerned with analyzing Arabic articles to extract and recognize entity names. A recognition system is developed for recognizing names of persons, academic institutions, and cities in Libya. First, a training corpus and dictionaries are built for the entity names targeted in this research. Then, the aspects of the entity names are studied, and their patterns and rules are designed. Finally, the implementation is performed using the NooJ linguistic language. The recognition of person names and Libyan cities and academic institutions was carried out. St...
This paper discusses different approaches to developing an NER system and reviews related work on Named Entity Recognition systems for the English language. A small database of 200 sentences contains 3080 words. Feature selection and generation are proposed to capture named entities. The proposed work is expected to predict the named entity of the focus words in a sentence with high accuracy, with the help of suitable knowledge acquisition techniques.
IEEE Transactions on Systems, Man, and Cybernetics, 1991
Arxiv preprint cs/ …, 2006
Proceedings of the Workshop on Balto-Slavonic Natural Language Processing Information Extraction and Enabling Technologies - ACL '07, 2007
Journal of Data and Information Quality, 2012
Developing Rule-Based and Gazetteer Lists for Named Entity Recognition in Uzbek Language: Geographical Names, 2023
ACL 2007 Workshop on Computational Approaches to Semitic Languages: Common Issues and Resources, 2007