Adding Named Entity annotations #562

@fredrijo

The Norwegian Bokmål part of UD is based on the Norwegian Dependency Treebank (NDT). NDT has now been extended with named entity annotations and will be redistributed with these annotations in addition to the linguistic (syntactic and morphological) annotations.

We are interested in distributing the named entity annotations as part of UD as well, if this is desirable.

Questions:

  1. Do you want to add named entity annotations to UD?
  2. If you are ok with adding NE annotations (1.), should they be included in the MISC field?
  3. If they should be added to the MISC field (2.), is there a preferred attribute name (cf. the paragraph about "Other Miscellaneous Attributes")? If any other treebanks in UD already have NE annotations, I assume it is preferred to use the same attribute name. If not, then I'll let you decide on a name.
  4. Do you prefer any particular convention for annotating entity boundaries/scope? Currently, the annotations are in the IOB2 format, but this can be changed to e.g. token indexes or something else if desired.
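
To make questions 3 and 4 concrete, here is a minimal Python sketch of how per-token IOB2 tags in the MISC field relate to a token-index representation. The attribute name `NE`, the labels `PER` and `LOC`, and the example rows are placeholders invented for illustration; they are not an agreed UD convention and not taken from NDT.

```python
def parse_misc(misc):
    """Parse a CoNLL-U MISC field ("A=1|B=2", or "_" when empty) into a dict."""
    if misc == "_":
        return {}
    return dict(item.split("=", 1) for item in misc.split("|") if "=" in item)


def iob2_to_spans(rows, attr="NE"):
    """Convert per-token IOB2 tags into (first_id, last_id, label) spans.

    rows: list of (token_id, misc_string) pairs for one sentence.
    attr: hypothetical MISC attribute name (assumption, not a UD standard).
    """
    spans = []
    current = None  # [first_id, last_id, label] of the entity being built
    for token_id, misc in rows:
        tag = parse_misc(misc).get(attr, "O")
        if tag.startswith("I-") and current and current[2] == tag[2:]:
            current[1] = token_id  # extend the open entity
            continue
        if current:
            spans.append(tuple(current))  # close the open entity
            current = None
        if tag.startswith("B-"):
            current = [token_id, token_id, tag[2:]]
        # An I- tag with no matching open entity is invalid IOB2 and is skipped.
    if current:
        spans.append(tuple(current))
    return spans


# Invented six-token sentence: (ID column, MISC column) per token.
rows = [
    (1, "NE=B-PER"),
    (2, "NE=I-PER"),
    (3, "NE=O"),
    (4, "NE=O"),
    (5, "SpaceAfter=No|NE=B-LOC"),
    (6, "_"),
]
spans = iob2_to_spans(rows)
print(spans)  # [(1, 2, 'PER'), (5, 5, 'LOC')]
```

With IOB2, each token carries its own tag and the scope is implicit; with token indexes, the scope is explicit but has to be attached to one token or stored elsewhere. Either form can be derived from the other, so the choice is mostly about which is easier to keep consistent with the other MISC attributes.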
