The Norwegian Bokmål part of UD is based on the Norwegian Dependency Treebank (NDT). NDT has now been extended with named entity annotations and is going to be redistributed with these annotations in addition to the linguistic (syntactic and morphological) annotations.
We are interested in distributing these named entity annotations as part of UD as well, if this is desirable.
Questions:
1. Do you want to add named entity annotations to UD?
2. If you are ok with adding NE annotations (1.), should they be included in the MISC field?
3. If they should be added to the MISC field (2.), is there a preferred attribute name? (See the paragraph about "Other Miscellaneous Attributes".) If any other treebanks in UD already have NE annotations, I assume it is preferred to use the same attribute name. If not, then I'll let you decide on a name.
4. Do you prefer any particular convention for annotating entity boundaries/scope? Currently, the annotations are in the IOB2 format, but this can be changed to e.g. token indexes or something else if desired.
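To make the two boundary conventions concrete, here is a minimal sketch of how IOB2 tags stored in the MISC field could be read and converted to token-index spans. Everything specific in it is an assumption for illustration only: the attribute name `NE`, the labels `PER` and `LOC`, and the sample sentence are hypothetical and not taken from NDT.

```python
# Hypothetical MISC attribute name; the actual name is still an open question.
NE_ATTR = "NE"

# Made-up CoNLL-U sentence (10 tab-separated columns, MISC is the last one).
SAMPLE = """\
# text = Kari bor i New York.
1\tKari\tKari\tPROPN\t_\t_\t2\tnsubj\t_\tNE=B-PER
2\tbor\tbo\tVERB\t_\t_\t0\troot\t_\tNE=O
3\ti\ti\tADP\t_\t_\t4\tcase\t_\tNE=O
4\tNew\tNew\tPROPN\t_\t_\t2\tobl\t_\tNE=B-LOC
5\tYork\tYork\tPROPN\t_\t_\t4\tflat:name\t_\tNE=I-LOC|SpaceAfter=No
6\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\tNE=O
"""


def read_ne_tags(conllu, attr=NE_ATTR):
    """Return (token_id, form, tag) triples for one CoNLL-U sentence."""
    rows = []
    for line in conllu.splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        misc = {}
        if cols[9] != "_":
            for item in cols[9].split("|"):
                key, _, value = item.partition("=")
                misc[key] = value
        # Tokens without the attribute are treated as outside any entity.
        rows.append((int(cols[0]), cols[1], misc.get(attr, "O")))
    return rows


def iob2_to_spans(rows):
    """Convert IOB2 tags to (label, first_token_id, last_token_id) spans."""
    spans = []
    current = None  # [label, first_id, last_id] of the entity being built
    for tok_id, _form, tag in rows:
        prefix, _, label = tag.partition("-")
        if prefix == "B":
            if current:
                spans.append(tuple(current))
            current = [label, tok_id, tok_id]
        elif prefix == "I" and current and current[0] == label:
            current[2] = tok_id  # extend the open entity
        else:
            # "O", or an "I-" tag with no matching open entity (invalid IOB2,
            # dropped here rather than repaired).
            if current:
                spans.append(tuple(current))
            current = None
    if current:
        spans.append(tuple(current))
    return spans


if __name__ == "__main__":
    print(iob2_to_spans(read_ne_tags(SAMPLE)))
```

Both representations carry the same information for flat, non-overlapping entities. IOB2 keeps the annotation local to each token line, while index spans would fit more naturally in a sentence-level comment.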