Unified Semantic Role Labels for UD Datasets #344

@alanakbik

Hello all,

at IBM Research, we have been working on a layer of unified semantic annotations for a range of languages. We use a data-driven approach in which we re-use existing English Proposition Bank frame and role labels for new target languages, followed by a process of manual curation (ACL 2015, ACL 2016, EMNLP 2016).

For instance, consider the German sentence "Seine Arbeit wird von ehrenamtlichen Helfern und Regionalgruppen des Vereins unterstützt" (His work is supported by volunteers and regional groupings of the association). In CoNLL format, it looks like this, with English PropBank labels in the last two columns:

Id  Form             POS    HeadId  Deprel     Frame       Role
1   Seine            DET    2       det:poss   _           _
2   Arbeit           NOUN   11      nsubjpass  _           A1
3   wird             AUX    11      auxpass    _           _
4   von              ADP    6       case       _           _
5   ehrenamtlichen   ADJ    6       amod       _           _
6   Helfern          NOUN   11      nmod       _           A0
7   und              CONJ   6       cc         _           _
8   Regionalgruppen  NOUN   6       conj       _           _
9   des              DET    10      det        _           _
10  Vereins          NOUN   8       nmod       _           _
11  unterstützt      VERB   0       root       support.01  _
12  .                PUNCT  11      punct      _           _

The German verb 'unterstützt' is labeled as evoking the 'support.01' frame with two roles: "Seine Arbeit" (his work) is labeled A1 (project being supported) and "ehrenamtlichen Helfern und Regionalgruppen des Vereins" (volunteers and regional groupings of the association) is labeled A0 (the helper).
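To make the layout concrete, here is a minimal Python sketch of how the frame and role spans could be read back out of the rows above. It is only an illustration, not our actual tooling: the function names (`parse`, `subtree_ids`, `role_spans`) are hypothetical, it assumes whitespace-separated columns with a single predicate per sentence, and it assumes a role label sits on the head token and covers that token's whole dependency subtree.

```python
from collections import defaultdict

# The example sentence, one token per line:
# Id Form POS HeadId Deprel Frame Role
ROWS = """\
1 Seine DET 2 det:poss _ _
2 Arbeit NOUN 11 nsubjpass _ A1
3 wird AUX 11 auxpass _ _
4 von ADP 6 case _ _
5 ehrenamtlichen ADJ 6 amod _ _
6 Helfern NOUN 11 nmod _ A0
7 und CONJ 6 cc _ _
8 Regionalgruppen NOUN 6 conj _ _
9 des DET 10 det _ _
10 Vereins NOUN 8 nmod _ _
11 unterstützt VERB 0 root support.01 _
12 . PUNCT 11 punct _ _
"""


def parse(text):
    """Turn whitespace-separated rows into a list of token dicts."""
    tokens = []
    for line in text.strip().splitlines():
        tid, form, pos, head, deprel, frame, role = line.split()
        tokens.append({
            "id": int(tid), "form": form, "pos": pos,
            "head": int(head), "deprel": deprel,
            "frame": frame, "role": role,
        })
    return tokens


def subtree_ids(tokens, root_id):
    """Return the sorted ids of root_id and all of its dependents."""
    children = defaultdict(list)
    for tok in tokens:
        children[tok["head"]].append(tok["id"])
    ids, stack = [], [root_id]
    while stack:
        current = stack.pop()
        ids.append(current)
        stack.extend(children[current])
    return sorted(ids)


def role_spans(tokens):
    """Return (frame, {role: span text}) for a single-predicate sentence."""
    by_id = {tok["id"]: tok for tok in tokens}
    # Assumption: exactly one token carries a frame label.
    frame = next(tok["frame"] for tok in tokens if tok["frame"] != "_")
    spans = {}
    for tok in tokens:
        if tok["role"] != "_":
            ids = subtree_ids(tokens, tok["id"])
            spans[tok["role"]] = " ".join(by_id[i]["form"] for i in ids)
    return frame, spans


frame, spans = role_spans(parse(ROWS))
print(frame)
for role, span in spans.items():
    print(role, "=", span)
```

Under these assumptions, the subtree of "Helfern" also contains the case marker "von", so the A0 span comes out with the preposition attached. Whether function words like this belong inside a role span is a design choice that would need to be settled for the data.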

With such data, we can create SRL systems that predict English PropBank labels for many different languages. See a recent demo screencast of this SRL for English, French and German here.

Contribute to UD?

We are now looking into releasing parts of this data to the research community. In particular, we are thinking of contributing this annotation layer to the Universal Dependencies (UD) datasets (the sentence above is from the German UD dataset).

For this, we would like to know 1) whether you would be interested in including such labels in the datasets and 2) if so, how such a contribution could be organized. Please let us know your thoughts on this!

Cheers,
Alan

__
Alan Akbik
IBM Research Almaden
http://alanakbik.github.io/
