Misapplied exceptional forms? #43

@goodmami

This is for the conversion of Princeton WordNet files for the English omw-en* wordnets.

As noted in goodmami/wn#199 (comment), the exceptional forms (in noun.exc, verb.exc, etc.) appear to be over-applied. The lemma of each exception entry was matched against case-normalized lemmas, so an exception is attached to every entry whose lemma differs only in case, leading to situations like this:

noun.exc

buffaloes buffalo

omw-en31.xml

    <!-- This entry is for the city of Buffalo -->
    <LexicalEntry id="omw-wn31-Buffalo-n">
      <Lemma writtenForm="Buffalo" partOfSpeech="n" />
      <Form writtenForm="buffaloes" />
      <Sense id="omw-wn31-Buffalo-09141172-n" synset="omw-wn31-09141172-n" dc:identifier="buffalo%1:15:00::" />
    </LexicalEntry>

    <!-- This entry is for the meat and the animal -->
    <LexicalEntry id="omw-wn31-buffalo-n">
      <Lemma writtenForm="buffalo" partOfSpeech="n" />
      <Form writtenForm="buffaloes" />
      <Sense id="omw-wn31-buffalo-02413348-n" synset="omw-wn31-02413348-n" dc:identifier="buffalo%1:05:02::">
        <Count>3</Count>
      </Sense>
      <Sense id="omw-wn31-buffalo-07679237-n" synset="omw-wn31-07679237-n" dc:identifier="buffalo%1:13:00::" />
      <Sense id="omw-wn31-buffalo-02410605-n" synset="omw-wn31-02410605-n" dc:identifier="buffalo%1:05:01::" />
    </LexicalEntry>

My guess is we don't actually want the exceptional plural for the proper name entry. So I propose doing a case-sensitive match for lemmas of the exceptional forms.
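
To make the proposal concrete, here is a minimal Python sketch of the two matching strategies. It is not the actual conversion code: the function names (`load_exceptions`, `forms_case_insensitive`, `forms_case_sensitive`) are hypothetical, and it only assumes the usual `.exc` layout where each line gives an inflected form followed by one or more base forms.

```python
from collections import defaultdict


def load_exceptions(lines):
    """Map each base form to its exceptional inflected forms.

    Assumes each line looks like "<inflected> <base> [<base> ...]".
    """
    exc = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        form, *lemmas = fields
        for lemma in lemmas:
            exc[lemma].append(form)
    return exc


def forms_case_insensitive(lemma, exc):
    """Behavior described in this issue: the lemma is case-normalized first."""
    return list(exc.get(lemma.lower(), []))


def forms_case_sensitive(lemma, exc):
    """Proposed behavior: the lemma must match the .exc base form exactly."""
    return list(exc.get(lemma, []))


exc = load_exceptions(["buffaloes buffalo"])

for lemma in ("Buffalo", "buffalo"):
    print(lemma,
          forms_case_insensitive(lemma, exc),
          forms_case_sensitive(lemma, exc))
```

With the case-sensitive lookup, the proper-name entry `Buffalo` gets no exceptional form, while the common noun `buffalo` still gets `buffaloes`.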

Labels: bug