Dependency Parsing for Telugu

2013

Abstract

In this paper we present our experiments in dependency parsing for the Telugu language. We explore two data-driven parsers, Malt and MST, and compare their results. We describe the data and parser settings used in detail. Some of these settings are specific to one particular Indian language, while others apply to all Indian languages. The averages of the best unlabeled attachment, labeled attachment, and labeled accuracies are 88.43%, 69.71%, and 70.01%, respectively. We also report which parser gives the best results for different sentence types in Telugu.
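The three scores reported above are standard token-level measures: unlabeled attachment counts tokens whose predicted head is correct, labeled accuracy counts tokens whose predicted dependency label is correct, and labeled attachment counts tokens where both are correct. A minimal sketch of how they can be computed follows; the function name `attachment_scores`, the toy sentence, and the label names are illustrative assumptions, not the evaluation script or data used in the paper.

```python
from typing import Dict, List, Tuple

# Each token is (head index, dependency label); head 0 is the artificial root.
Token = Tuple[int, str]


def attachment_scores(gold: List[Token], pred: List[Token]) -> Dict[str, float]:
    """Return UAS, LAS and LA as percentages over all tokens."""
    if not gold or len(gold) != len(pred):
        raise ValueError("gold and pred must be non-empty and of equal length")
    n = len(gold)
    head_ok = sum(g[0] == p[0] for g, p in zip(gold, pred))   # correct head
    label_ok = sum(g[1] == p[1] for g, p in zip(gold, pred))  # correct label
    both_ok = sum(g == p for g, p in zip(gold, pred))         # head and label
    return {
        "UAS": 100.0 * head_ok / n,
        "LAS": 100.0 * both_ok / n,
        "LA": 100.0 * label_ok / n,
    }


# Hypothetical four-token sentence (labels are placeholders).
gold = [(2, "subj"), (0, "root"), (2, "obj"), (2, "loc")]
pred = [(2, "subj"), (0, "root"), (2, "subj"), (3, "loc")]
scores = attachment_scores(gold, pred)
print(scores)
```

In this toy case the third token has the right head but the wrong label, and the fourth has the right label but the wrong head, which is why labeled attachment is lower than either of the other two scores.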

Key takeaways

  • Section 3 describes the data and parser settings for the Telugu language.
  • Malt Parser: MaltParser (Nivre et al., 2006) implements the transition-based approach to dependency parsing, which has two essential components: a transition system for mapping sentences into dependency trees, and a classifier for predicting the next transition for every possible system configuration. Transition systems: MaltParser comes with a number of built-in transition systems, but we limit our attention to the two systems that have been used in the parsing experiments: the arc-eager projective system first described in Nivre (2003) and the non-projective transition system based on the method described by Covington (2001).
  • The MST parser is the parser described in the following papers: "Multilingual Dependency Parsing with a Two-Stage Discriminative Parser"; "Online Learning of Approximate Dependency Parsing Algorithms"; "Non-projective Dependency Parsing using Spanning Tree Algorithms"; "Online Large-Margin Training of Dependency Parsers".
  • Telugu is a Dravidian language that is agglutinative in nature.
  • For simple sentences, both parsers gave good results, but for other sentence types they showed lower accuracy.
  • For the Telugu language, Malt performed better than MST.
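To make the transition-system idea from the takeaways concrete, here is a minimal sketch of the arc-eager projective system with its four transitions (SHIFT, LEFT-ARC, RIGHT-ARC, REDUCE). It is not MaltParser's code: in place of a trained classifier it uses a simple oracle that reads the next transition off a gold projective tree, and the function name `arc_eager_oracle` and the example trees are assumptions made for illustration.

```python
from typing import Dict, List, Tuple


def arc_eager_oracle(gold_heads: List[int]) -> Tuple[List[str], Dict[int, int]]:
    """Replay a projective gold tree as arc-eager transitions.

    gold_heads[i - 1] is the head of token i; 0 denotes the artificial root.
    Returns the transition sequence and the heads built by applying it.
    """
    gold = {i + 1: h for i, h in enumerate(gold_heads)}
    stack: List[int] = [0]                       # starts with the root
    buffer: List[int] = list(range(1, len(gold_heads) + 1))
    heads: Dict[int, int] = {}                   # arcs built so far
    transitions: List[str] = []

    while buffer:
        s, b = stack[-1], buffer[0]
        if s != 0 and gold[s] == b:
            # LEFT-ARC: buffer front becomes head of stack top; pop the stack.
            heads[s] = b
            stack.pop()
            transitions.append("LEFT-ARC")
        elif gold[b] == s:
            # RIGHT-ARC: stack top becomes head of buffer front; push it.
            heads[b] = s
            stack.append(buffer.pop(0))
            transitions.append("RIGHT-ARC")
        elif s in heads and all(d in heads for d, h in gold.items() if h == s):
            # REDUCE: stack top has its head and all its dependents; pop it.
            stack.pop()
            transitions.append("REDUCE")
        else:
            # SHIFT: move the buffer front onto the stack.
            stack.append(buffer.pop(0))
            transitions.append("SHIFT")
    return transitions, heads


# Hypothetical verb-final sentence: tokens 1 and 2 depend on token 3 (the root).
verb_final_transitions, verb_final_heads = arc_eager_oracle([3, 3, 0])
print(verb_final_transitions)

# Hypothetical head-initial tree that needs a REDUCE step.
head_initial_transitions, head_initial_heads = arc_eager_oracle([0, 1, 1])
print(head_initial_transitions)
```

In a data-driven parser the oracle's choices serve as training examples, and at parsing time the classifier predicts the transition from features of the current configuration instead of consulting a gold tree. The sketch only handles projective trees; the Covington-style non-projective system mentioned above uses a different set of transitions.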