Term Recognition by Using Different Field Corpora

1999, NTCIR

Abstract

We participated in the term recognition task, one of the subtasks covered by the NTCIR tmrec group. In this paper, we present the system we used in this task and evaluate its term recognition results. We believe that terms could be words that characterize a field's data and have the following three features: (1) They appear frequently in the target field's corpus. (2) They are not common terms in the target field. (3) They appear less frequently in the other fields' corpora. Our system uses different field corpora and recognizes words with these features as terms. We extracted a term list by using two kinds of field corpora, the NACSIS Academic Conference Database and the MAINICHI newspaper database. We then analyzed the difference between our term list and the Manual-Candidates made by the NTCIR tmrec group. In this paper, we clarify what should be considered when recognizing terms. Furthermore, through comparative experiments based on the Manual-Candidates, we verify the importance of the indices used to extract a term list.
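As a rough illustration of features (1) and (3), the sketch below ranks words by how much more often they occur in a target-field corpus than in a corpus from another field. It is not the system described in the paper: the function name `score_terms`, the ratio-based score, and the `min_count` and `smoothing` parameters are all assumptions made for this example, and feature (2) is not modeled.

```python
from collections import Counter


def score_terms(target_tokens, other_tokens, min_count=2, smoothing=1e-6):
    """Rank words by relative frequency in the target corpus versus another corpus.

    Hypothetical scoring, assumed for illustration only:
    score = p_target(word) / (p_other(word) + smoothing)
    """
    target_counts = Counter(target_tokens)
    other_counts = Counter(other_tokens)
    target_total = len(target_tokens)
    other_total = len(other_tokens)

    scores = {}
    for word, count in target_counts.items():
        # Feature (1): keep only words that are frequent enough in the target corpus.
        if count < min_count:
            continue
        p_target = count / target_total
        # Feature (3): words that are frequent in the other corpus get a low score.
        p_other = other_counts[word] / other_total if other_total else 0.0
        scores[word] = p_target / (p_other + smoothing)

    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy token lists standing in for a target-field corpus and a newspaper corpus.
target = ["corpus", "corpus", "term", "term", "the", "the", "the"]
other = ["the", "the", "the", "news", "news", "the"]
ranked = score_terms(target, other)
```

In this toy input, "the" is frequent in both token lists and so ranks below "corpus" and "term", which occur only in the target list. A real system would also need tokenization suited to the corpora and some treatment of feature (2).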