2006
Abstract: Evaluation of MT evaluation measures is limited by inconsistent human judgment data. Nonetheless, machine translation can be evaluated using the well-known measures precision, recall, and their average, the F-measure. The unigram-based F-measure has significantly higher correlation with human judgments than recently proposed alternatives. More importantly, this standard measure has an intuitive graphical interpretation, which can facilitate insight into how MT systems might be improved.
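As a concrete illustration of the measures named above, the sketch below computes unigram precision, recall, and F-measure for a single hypothesis/reference pair. Whitespace tokenization and the example sentences are assumptions for illustration, not details taken from the paper.

```python
# A minimal sketch of unigram precision, recall, and F-measure for one
# hypothesis/reference pair, using clipped (multiset) unigram matches.
from collections import Counter

def unigram_prf(hypothesis: str, reference: str):
    hyp = Counter(hypothesis.split())
    ref = Counter(reference.split())
    # Count each hypothesis word at most as often as it occurs in the reference.
    matches = sum(min(c, ref[w]) for w, c in hyp.items())
    precision = matches / max(sum(hyp.values()), 1)
    recall = matches / max(sum(ref.values()), 1)
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return precision, recall, f_measure

if __name__ == "__main__":
    p, r, f = unigram_prf("the cat sat on the mat", "the cat is on the mat")
    print(f"P={p:.3f} R={r:.3f} F={f:.3f}")
```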
2012
The Framework for the Evaluation of Machine Translation (FEMTI) contains guidelines for building a quality model that is used to evaluate MT systems in relation to the purpose and intended context of use of the systems. Contextual quality models can thus be constructed, but entering into FEMTI the knowledge required for this operation is a complex task. An experiment was set up in order to transfer knowledge from MT evaluation experts into the FEMTI guidelines, by polling experts about the evaluation methods they would use in a particular context, then inferring from the results generic relations between characteristics of the context of use and quality characteristics. The results of this hands-on exercise, carried out as part of a conference tutorial, have served to refine FEMTI’s ‘generic contextual quality model’ and to obtain feedback on the FEMTI guidelines in general.
Any scientific endeavour must be evaluated in order to assess its correctness. In many applied sciences it is necessary to check that the theory adequately matches actual observations. In Machine Translation (MT), evaluation serves two purposes: relative evaluation allows us to check whether one MT technique is better than another, while absolute evaluation gives an absolute measure of performance, e.g. a score of 1 may mean a perfect translation.
Proceedings of the workshop on Human Language Technology - HLT '93, 1993
This paper reports results of the 1992 Evaluation of machine translation (MT) systems in the DARPA MT initiative and results of a Pre-test to the 1993 Evaluation. The DARPA initiative is unique in that the evaluated systems differ radically in languages translated, theoretical approach to system design, and intended end-user application. In the 1992 suite, a Comprehension Test compared the accuracy and interpretability of system and control outputs; a Quality Panel for each language pair judged the fidelity of translations from each source version. The 1993 suite evaluated adequacy and fluency and investigated three scoring methods.
2003
Abstract: Machine translation can be evaluated using precision, recall, and the F-measure. These standard measures have significantly higher correlation with human judgments than recently proposed alternatives. More importantly, the standard measures have an intuitive interpretation, which can facilitate insights into how MT systems might be improved. The relevant software is publicly available.
This paper evaluates the translation quality of machine translation systems for 8 language pairs: translating French, German, Spanish, and Czech to English and back. We carried out an extensive human evaluation which allowed us not only to rank the different MT systems, but also to perform higher-level analysis of the evaluation process. We measured timing and intra- and inter-annotator agreement for three types of subjective evaluation. We measured the correlation of automatic evaluation metrics with human judgments. This meta-evaluation reveals surprising facts about the most commonly used methodologies.
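As a concrete illustration of the inter-annotator agreement side of such a meta-evaluation, here is a minimal sketch of Cohen's kappa for two annotators labelling the same items. The label sequences are invented toy data, not the paper's judgments, and the paper's own agreement statistics may be computed differently.

```python
# A minimal sketch of Cohen's kappa: kappa = (p_o - p_e) / (1 - p_e),
# where p_o is observed agreement and p_e is chance agreement.
from collections import Counter

def cohens_kappa(labels_a, labels_b):
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n  # observed agreement
    count_a, count_b = Counter(labels_a), Counter(labels_b)
    # Chance agreement: probability both annotators pick the same label at random.
    p_e = sum((count_a[lab] / n) * (count_b[lab] / n)
              for lab in set(labels_a) | set(labels_b))
    return (p_o - p_e) / (1 - p_e)

a = ["better", "worse", "better", "equal", "better", "worse"]
b = ["better", "worse", "equal",  "equal", "better", "better"]
print(f"kappa = {cohens_kappa(a, b):.2f}")
```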
Proceedings of the 4th ISLE Workshop on MT …, 2001
Work on comparing a set of linguistic test scores for MT output to a set of the same tests' scores for naturally-occurring target language text (Jones and Rusk 2000) broke new ground in automating MT Evaluation. However, the tests used were selected on an ad hoc basis. In this paper, we report on work to extend our understanding, through refinement and validation, of suitable linguistic tests in the context of our novel approach to MTE. This approach was introduced in Miller and Vanni (2001a) and employs standard, rather than randomly-chosen, tests of MT output quality selected from the ISLE framework as well as a scoring system for predicting the type of information processing task performable with the output. Since the intent is to automate the scoring system, this work can also be viewed as the preliminary steps of algorithm design.
2008
This tutorial offers an introduction to the field of Machine Translation evaluation, and in particular to FEMTI, the Framework for the Evaluation of Machine Translation in ISLE, which groups together a wide range of evaluation metrics, following a contextual evaluation approach. In a practical application of the framework, participants will be shown how to apply FEMTI to an operational example of MT use, in order to construct a well-motivated quality model. The results from the practical exercise will be compared, and a synthesis will be proposed at the end, explaining how feedback from the community can be fed into FEMTI.
2014
Machine translation (MT) has developed into one of the hottest research topics in the natural language processing (NLP) literature. One important issue in MT is how to evaluate an MT system reasonably and determine whether the translation system has improved. The traditional manual judgment methods are expensive, time-consuming, unrepeatable, and sometimes suffer from low agreement. On the other hand, the popular automatic MT evaluation methods have some weaknesses. Firstly, they tend to perform well on language pairs with English as the target language, but poorly when English is the source. Secondly, some methods rely on many additional linguistic features to achieve good performance, which makes the metrics hard to replicate and apply to other language pairs. Thirdly, some popular metrics rely on an incomplete set of factors, which results in low performance on some practical tasks. In this thesis, to address the existing problems, we design novel MT evaluation methods and investigate their performance on different languages. Firstly, we design augmented factors to yield highly accurate evaluation. Secondly, we design a tunable evaluation model in which the weighting of factors can be optimized according to the characteristics of languages. Thirdly, in the enhanced version of our methods, we design concise linguistic features using part-of-speech (POS) tags to show that our methods can yield even higher performance when using external linguistic resources. Finally, we report the practical performance of our metrics in the ACL-WMT workshop shared tasks, which shows that the proposed methods are robust across different languages. In addition, we also present some novel work on quality estimation of MT without using reference translations, including the use of Naïve Bayes (NB) probability models, support vector machine (SVM) classification algorithms, and a discriminative undirected graphical model, the conditional random field (CRF), in addition to feature engineering.
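As an illustration of reference-free quality estimation framed as sentence-level classification, in the spirit of the NB/SVM approaches mentioned above, here is a minimal sketch. The features, toy data, and use of scikit-learn are assumptions for illustration, not the thesis's actual feature set or setup.

```python
# A minimal sketch of reference-free QE as binary classification with an SVM.
import numpy as np
from sklearn.svm import SVC

def qe_features(source: str, translation: str) -> list:
    # Illustrative surface features only: length ratio, digit and comma mismatch.
    src, tgt = source.split(), translation.split()
    return [
        len(tgt) / max(len(src), 1),
        sum(w.isdigit() for w in tgt) - sum(w.isdigit() for w in src),
        translation.count(",") - source.count(","),
    ]

# Toy training data: (source, translation, label) with 1 = acceptable, 0 = poor.
train = [
    ("das ist gut", "this is good", 1),
    ("das ist gut", "this good good good is", 0),
]
X = np.array([qe_features(s, t) for s, t, _ in train])
y = np.array([lab for _, _, lab in train])

clf = SVC(kernel="linear").fit(X, y)
print(clf.predict([qe_features("das ist gut", "this is good")]))
```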
2011
State-of-the-art statistical machine translation (MT) systems have made significant progress towards producing user-acceptable translation output. However, there is still no efficient way for MT systems to inform users which words are likely translated correctly and how confident the system is about the whole sentence. We propose a novel framework to predict word-level and sentence-level MT errors with a large number of novel features. Experimental results show that the MT error prediction accuracy is increased from 69.1 to 72.2 in F-score. The Pearson correlation between the proposed confidence measure and the human-targeted translation edit rate (HTER) is 0.6. Improvements between 0.4 and 0.9 TER reduction are obtained with the n-best list reranking task using the proposed confidence measure. Also, we present a visualization prototype of MT errors at the word and sentence levels with the objective of improving post-editor productivity.
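As an illustration of the kind of meta-evaluation reported above, the sketch below correlates sentence-level confidence scores with HTER using Pearson's r. The numbers are invented toy values, not the paper's data, and the availability of scipy is an assumption.

```python
# A minimal sketch: Pearson correlation between confidence scores and HTER.
from scipy.stats import pearsonr

confidence = [0.91, 0.75, 0.62, 0.40, 0.23]   # higher = more confident
hter       = [0.05, 0.18, 0.30, 0.45, 0.70]   # higher = more post-editing needed

r, p_value = pearsonr(confidence, hter)
print(f"Pearson r = {r:.2f} (p = {p_value:.3f})")
# A strongly negative r means confident sentences need less post-editing;
# papers often report the correlation against (1 - HTER) or its magnitude.
```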
Translation Studies: Theory and Practice
Along with the development and widespread dissemination of translation by artificial intelligence, it is becoming increasingly important to continuously evaluate and improve its quality and to use it as a tool for the modern translator. In our research, we compared five sentences translated from Armenian into Russian and English by Google Translator, Yandex Translator and two models of the translation system of the Armenian company Avromic, to find out how effective these translation systems are when working with Armenian. We also sought to determine how effective it would be to use them as a translation tool and in the learning process, with further editing of the translation. As there is currently no comprehensive and widely accepted method of human evaluation for machine translation, we have developed our own evaluation method and criteria by studying the world's best-known methods of evaluation for automatic translation. We have used the post-editorial distance evaluation criterion as ...
Today Machine Translation (MT) systems are commercially available for a variety of language pairs and in a price range that makes them accessible to non-professionals. Yet there is no standard evaluation for any type of translation system, whether automatic or manual, especially for the commercial systems that involve the Arabic language in the Arabic region. This paper presents a brief survey of MT evaluation methods and their importance. It also presents some approaches for developing a comprehensive evaluation system without any developer cooperation. Although we propose some dimensions for MT systems, we concentrate on evaluating the translation quality of the MT system rather than the system itself. As globalization reaches the Arabic world, with more than 180 million speakers around the world, problems caused by a lack of communication can seriously affect its situation, especially as a receiver of knowledge more than a producer; its need for MT is therefore essential.
Machine Translation: From Real Users …, 2004
Recent research has shown that a balanced harmonic mean (F1 measure) of unigram precision and recall outperforms the widely used BLEU and NIST metrics for Machine Translation evaluation in terms of correlation with human judgments of translation quality. We show that significantly better correlations can be achieved by placing more weight on recall than on precision. While this may seem unexpected, since BLEU and NIST focus on n-gram precision and disregard recall, our experiments show that correlation with human judgments is highest when almost all of the weight is assigned to recall. We also show that stemming is significantly beneficial not just to simpler unigram precision and recall based metrics, but also to BLEU and NIST.
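For concreteness, the recall-weighted harmonic mean discussed above can be written as F_alpha = P·R / (alpha·P + (1−alpha)·R), where alpha is the weight placed on recall: alpha = 0.5 gives the balanced F1, and values near 1.0 put almost all weight on recall. The sketch below is illustrative, and the exact parameterization and values used in the paper may differ.

```python
# A minimal sketch of a recall-weighted harmonic mean of precision and recall.
def weighted_f(precision: float, recall: float, alpha: float = 0.9) -> float:
    if precision == 0.0 or recall == 0.0:
        return 0.0
    return (precision * recall) / (alpha * precision + (1 - alpha) * recall)

print(weighted_f(0.70, 0.60, alpha=0.5))  # balanced F1
print(weighted_f(0.70, 0.60, alpha=0.9))  # recall-heavy weighting
```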
This paper presents a study on human and automatic evaluations of translations in a French-German translation learner corpus. The aim of the paper is to shed light on the differences between MT evaluation scores and approaches to translation evaluation rooted in a closely related discipline, namely translation studies. We illustrate the factors contributing to the human evaluation of translations, contrasting these factors with the results of automatic evaluation metrics such as BLEU and Meteor. By means of a qualitative analysis of human translations we highlight the concept of legitimate variation and attempt to reveal weaknesses of automatic evaluation metrics. We also aim to show that translation studies provide sophisticated concepts for translation quality estimation and error annotation which the automatic evaluation scores do not yet cover.
2014
MT-EQuAl (Machine Translation Errors, Quality, Alignment) is a toolkit for human assessment of Machine Translation (MT) output. MT-EQuAl implements three different tasks in an integrated environment: annotation of translation errors, translation quality rating (e.g. adequacy and fluency, relative ranking of alternative translations), and word alignment. The toolkit is web-based and multi-user, allowing large-scale and remotely managed manual annotation projects. It incorporates a number of project management functions and sophisticated progress monitoring capabilities. The implemented evaluation tasks are configurable and can be adapted to several specific annotation needs. The toolkit is open source and released under the Apache 2.0 license.
arXiv: Computation and Language, 2016
We introduce a Machine Translation (MT) evaluation survey that covers both manual and automatic evaluation methods. The traditional human evaluation criteria mainly include intelligibility, fidelity, fluency, adequacy, comprehension, and informativeness. The advanced human assessments include task-oriented measures, post-editing, segment ranking, and extended criteria, etc. We classify the automatic evaluation methods into two categories: lexical similarity and linguistic features. The lexical similarity methods include edit distance, precision, recall, F-measure, and word order. The linguistic features can be divided into syntactic and semantic features: the syntactic features include part-of-speech tags, phrase types and sentence structures, and the semantic features include named entities, synonyms, textual entailment, paraphrase, semantic roles, and language models. The deep learning models for evaluation are very newly ...
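As a concrete example of the lexical-similarity methods listed in the survey, the sketch below implements word-level edit distance (the core of WER/TER-style metrics) with standard dynamic programming. Whitespace tokenization and the example sentences are assumptions for illustration.

```python
# A minimal sketch of word-level edit distance between hypothesis and reference.
def edit_distance(hyp: list, ref: list) -> int:
    m, n = len(hyp), len(ref)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if hyp[i - 1] == ref[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # substitution or match
    return d[m][n]

hyp = "the cat sat on mat".split()
ref = "the cat sat on the mat".split()
print(edit_distance(hyp, ref) / len(ref))  # WER-style normalization
```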
Proceedings of the Second Workshop on …, 2007
Meteor is an automatic metric for Machine Translation evaluation which has been demonstrated to have high levels of correlation with human judgments of translation quality, significantly outperforming the more commonly used Bleu metric. It is one of several automatic metrics used in this year's shared task within the ACL WMT-07 workshop. This paper recaps the technical details underlying the metric and describes recent improvements in the metric. The latest release includes improved metric parameters and extends the metric to support evaluation of MT output in Spanish, French and German, in addition to English.
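As a rough sketch of the score computation recapped in the paper, the published Meteor formulation combines a recall-weighted harmonic mean of unigram precision and recall with a fragmentation penalty based on how many contiguous chunks the matched words form. The fixed constants below are the defaults from the original Meteor description and are used here as assumptions; the WMT-07 release described above makes these parameters tunable and adds stemming and synonymy matching, which this sketch omits.

```python
# A rough sketch of the shape of a Meteor-style score (exact matching only):
#   Fmean   = 10*P*R / (R + 9*P)
#   Penalty = 0.5 * (chunks / matches)**3
#   Score   = Fmean * (1 - Penalty)
def meteor_like_score(precision, recall, chunks, matches):
    if matches == 0 or precision == 0.0 or recall == 0.0:
        return 0.0
    fmean = 10 * precision * recall / (recall + 9 * precision)
    penalty = 0.5 * (chunks / matches) ** 3
    return fmean * (1 - penalty)

# e.g. 8 matched unigrams grouped into 3 contiguous chunks:
print(meteor_like_score(precision=0.8, recall=0.73, chunks=3, matches=8))
```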
Tradumàtica tecnologies de la traducció, 2014
This paper gives a general overview of the main classes of methods for automatic evaluation of Machine Translation (MT) quality, their limitations and their value for professional translators and MT developers. Automated evaluation of MT characterizes the performance of MT systems on a specific text or corpus. Automated scores are expected to correlate with certain parameters of MT quality scored by human evaluators, such as adequacy or fluency of translation. Automated evaluation is now part of the MT development cycle, but it also contributes to fundamental research on MT and to improving MT technology.
Cihan University/ Erbil, 2018
In this study, we attempt to compare two common methods of evaluating machine translation (MT) output: human and automatic MT evaluation. The materials of the study were selected from economic texts. Twenty English sentences and their Persian translations were selected from the book "Translating of Economic Texts" published by Payam-e-Nour University. For the human assessment, 20 MA students of translation studies participated in this study as evaluators. To evaluate the sentences automatically, the BLEU method of MT output evaluation was applied. According to the findings of the study, both evaluation methods lead to the same results; however, human evaluation is more precise than automatic evaluation, while automatic evaluation is faster and less time-consuming than human evaluation.
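For readers unfamiliar with the automatic side of the comparison, here is a minimal, simplified sketch of sentence-level BLEU (modified n-gram precision with a brevity penalty). Real studies normally use standard corpus-level BLEU tooling; the add-epsilon smoothing and example sentences below are assumptions for illustration, not the study's setup.

```python
# A minimal sketch of sentence-level BLEU with crude smoothing.
import math
from collections import Counter

def ngrams(tokens, n):
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def sentence_bleu(hypothesis: str, reference: str, max_n: int = 4) -> float:
    hyp, ref = hypothesis.split(), reference.split()
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_ngrams, ref_ngrams = ngrams(hyp, n), ngrams(ref, n)
        # Clipped n-gram matches against the reference.
        overlap = sum(min(c, ref_ngrams[g]) for g, c in hyp_ngrams.items())
        total = max(sum(hyp_ngrams.values()), 1)
        log_precisions.append(math.log((overlap + 1e-9) / total))
    # Brevity penalty: penalize hypotheses shorter than the reference.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(log_precisions) / max_n)

print(sentence_bleu("the cat sat on the mat", "the cat is sitting on the mat"))
```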
This paper examines the motivation, design, and practical results of several types of human evaluation tasks for machine translation. In addition to considering annotator performance and task informativeness over multiple evaluations, we explore the practicality of tuning automatic evaluation metrics to each judgment type in a comprehensive experiment using the METEOR-NEXT metric. We present results showing clear advantages of tuning to certain types of judgments and discuss causes of inconsistency when tuning to various judgment data, as well as sources of difficulty in the human evaluation tasks themselves.