Recognition of Personal Names in Serbian Texts

Duško Vitas

2005, HAL (Le Centre pour la Communication Scientifique Directe)

6 pages

Abstract

In this paper we present a method for accurate and precise recognition of personal names, implemented for Serbian. It is based on the development of comprehensive e-dictionaries of Serbian personal names, as well as of foreign personal names transcribed into Serbian. To obtain high precision, a set of finite-state automata (FSA) was developed to model various constraints. The same automata are also used to extract from a text personal names not yet covered by the e-dictionaries.
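The combination of dictionary lookup and finite-state constraints described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the dictionary entries, the symbol classes (`FIRST`, `LAST`, `CAP`), the transition table, and the function names are all assumptions made for illustration. The sketch accepts a known first name followed by a known surname, and also flags a known first name followed by an unknown capitalized word as a candidate name not yet covered by the dictionaries.

```python
# Toy e-dictionaries (hypothetical entries; real dictionaries would be far larger).
FIRST_NAMES = {"Duško", "Ivo", "Marija"}
SURNAMES = {"Vitas", "Andrić", "Petrović"}


def classify(token):
    """Map a token to the input symbol consumed by the automaton."""
    if token in FIRST_NAMES:
        return "FIRST"
    if token in SURNAMES:
        return "LAST"
    if token[:1].isupper():
        return "CAP"  # capitalized word found in neither dictionary
    return "OTHER"


# Transition table of a tiny automaton: state -> {symbol: next state}.
TRANSITIONS = {
    "START": {"FIRST": "AFTER_FIRST"},
    "AFTER_FIRST": {"LAST": "ACCEPT", "CAP": "ACCEPT_NEW"},
}
ACCEPTING = {"ACCEPT", "ACCEPT_NEW"}


def find_names(tokens):
    """Return (name, is_new) pairs; is_new marks a surname missing from the dictionary."""
    results = []
    i = 0
    while i < len(tokens):
        state, j = "START", i
        # Follow transitions for as long as the automaton can consume tokens.
        while j < len(tokens):
            nxt = TRANSITIONS.get(state, {}).get(classify(tokens[j]))
            if nxt is None:
                break
            state, j = nxt, j + 1
        if state in ACCEPTING:
            results.append((" ".join(tokens[i:j]), state == "ACCEPT_NEW"))
            i = j  # continue after the recognized name
        else:
            i += 1  # no name starts here; try the next token
    return results


tokens = "Pisac Ivo Andrić i Marija Kovač".split()
names = find_names(tokens)
```

In this toy setup a sentence-initial capitalized word such as "Pisac" is not reported, because the automaton only starts on a dictionary first name. That is one simple example of how a contextual constraint can favor precision.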

Duško Vitas

2011

In this paper we present a system for named entity recognition and tagging in Serbian that relies on large-scale lexical resources and finite-state transducers. Our system recognizes several types of names, as well as temporal and numerical expressions. Finite-state automata are used to describe the context of named entities, thus improving the precision of recognition. The widest context was used for personal names, and it included the recognition of nominal phrases describing a person's position. To evaluate the named entity recognition system we used a corpus of 2,300 short agency news items. Through manual evaluation we identified all omissions and incorrect recognitions, which enabled the computation of recall and precision. The overall recall (R = 0.84 for types and R = 0.93 for tokens) and the overall precision (P = 0.95 for types and P = 0.98 for tokens) show that our system gives priority to precision.
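As a rough illustration of how recall and precision over types and tokens might be computed, the sketch below assumes that "tokens" means every occurrence of an entity in the corpus and "types" means distinct entity strings. That reading, the function names, and the sample data are assumptions, not details taken from the abstract.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Precision and recall over two collections of entity strings (multiset match)."""
    gold_c, pred_c = Counter(gold), Counter(predicted)
    correct = sum((gold_c & pred_c).values())  # multiset intersection
    precision = correct / sum(pred_c.values()) if pred_c else 0.0
    recall = correct / sum(gold_c.values()) if gold_c else 0.0
    return precision, recall


def evaluate(gold_mentions, predicted_mentions):
    """Score by tokens (all occurrences) and by types (distinct strings)."""
    return {
        "tokens": precision_recall(gold_mentions, predicted_mentions),
        "types": precision_recall(set(gold_mentions), set(predicted_mentions)),
    }


# Illustrative data only: one entity ("Marija Kovač") is omitted by the system.
gold = ["Ivo Andrić", "Ivo Andrić", "Marija Kovač", "Beograd"]
pred = ["Ivo Andrić", "Ivo Andrić", "Beograd"]
scores = evaluate(gold, pred)
```

With this sample, a frequent entity that is recognized correctly raises the token-level recall above the type-level recall, which is the same direction of difference that the reported figures show.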
