Feature-Rich Twitter Named Entity Recognition and Classification

Björn Gambäck

Feature-Rich Twitter Named Entity Recognition and Classification

Björn Gambäck

2016, International Conference on Computational Linguistics

visibility

…

description

7 pages

link

1 file

Twitter named entity recognition is the process of identifying proper names and classifying them into some predefined labels/categories. The paper introduces a Twitter named entity system using a supervised machine learning approach, namely Conditional Random Fields. A large set of different features was developed and the system was trained using these. The Twitter named entity task can be divided into two parts: i) Named entity extraction from tweets and ii) Twitter name classification into ten different types. For Twitter named entity recognition on unseen test data, our system obtained the second highest F 1 score in the shared task: 63.22%. The system performance on the classification task was worse, with an F 1 measure of 40.06% on unseen test data, which was the fourth best of the ten systems participating in the shared task.

Sign up for access to the world's latest research

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact

minal sonmale

2015

Twitter has allowed millions of users to share and spread most up-to-date information which results into large volume of data generated every day. Due to extremely useful business information obtained from these tweets, it is necessary to understand tweets language for downstream applications, such as Named Entity Recognition (NER). Real time applications like Traffic detection system, Early crisis detection and response with target twitter stream required good NER system, which automatically find emerging named entities that are potentially linked to the crisis and traffic, but tweets are infamous for their error-prone and short nature. This leads to failure of much conventional NER techniques, which heavily depend on local linguistic features, such as capitalization, POS tags of previous words etc. Recently segment-based tweet representation has showed effectiveness in NER.The goal of this survey is to provide a comprehensive review of NER system over twitter data and different NE...

Log In

Feature-Rich Twitter Named Entity Recognition and Classification

Sign up for access to the world's latest research

Related papers

Related papers

Related topics