Inconsistent lemma row types for Arabic wordnet #46
Closed
Labels: data (Something is wrong in the data)
Description
In the Arabic wordnet (wn-data-arb.tab), instead of arb:lemma, the row type is either arb:lemma:brokenplural or arb:lemma:root. However, a small number of rows use arb:lemma:brokenPlural (note the uppercase P):
grep ':lemma:' wns/arb/wn-data-arb.tab | cut -f2 | sort | uniq -c
2770 arb:lemma:brokenplural
180 arb:lemma:brokenPlural
14683 arb:lemma:root
I think we should normalize brokenPlural to brokenplural.
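A minimal sketch of that normalization, assuming the file is tab-separated with the row type in the second column (as the `cut -f2` above implies). The helper name `normalize_row_type` and the sample rows are hypothetical, not taken from the repository:

```python
# Hypothetical helper: rewrite the stray "brokenPlural" row type to the
# lowercase spelling used by the majority of rows.
# Assumption: rows are tab-separated and the row type is the second field.

OLD_TYPE = "arb:lemma:brokenPlural"
NEW_TYPE = "arb:lemma:brokenplural"


def normalize_row_type(line: str) -> str:
    """Return the row (without trailing newline) with the row type normalized."""
    fields = line.rstrip("\n").split("\t")
    # Only touch an exact match in the second column; leave lemmas alone.
    if len(fields) >= 2 and fields[1] == OLD_TYPE:
        fields[1] = NEW_TYPE
    return "\t".join(fields)


# Made-up sample rows for illustration only.
sample_rows = [
    "SYNSET-1\tarb:lemma:brokenPlural\tlemma-a",
    "SYNSET-2\tarb:lemma:root\tlemma-b",
]
normalized_rows = [normalize_row_type(row) for row in sample_rows]
```

Matching the whole second field, rather than doing a global string replace, avoids touching a lemma that happens to contain the same text.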
There is a separate issue where the Arabic wordnet file without diacritics (wn-nodia-arb.tab) only has lemma for that column, not even arb:lemma. I'm not sure what to do with this file.
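Before deciding what to do with that file, it may help to tally the second-column values and list those lacking the language prefix. This is only a sketch: the function names are hypothetical, the sample rows are made up, and it assumes lines beginning with `#` are header or comment lines that should be skipped:

```python
from collections import Counter

# Hypothetical audit helpers; roughly a Python equivalent of
# `cut -f2 | sort | uniq -c`, plus a check for a missing "arb:" prefix.


def tally_row_types(lines):
    """Count the values found in the second tab-separated column."""
    counts = Counter()
    for line in lines:
        # Assumption: lines starting with "#" are headers/comments.
        if line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) >= 2:
            counts[fields[1]] += 1
    return counts


def unprefixed_types(counts, prefix="arb:"):
    """Return the row types that do not start with the language prefix."""
    return sorted(t for t in counts if not t.startswith(prefix))


# Made-up sample rows for illustration only.
sample = [
    "# header line",
    "SYNSET-1\tlemma\tlemma-a",
    "SYNSET-2\tlemma\tlemma-b",
    "SYNSET-3\tarb:lemma:root\tlemma-c",
]
counts = tally_row_types(sample)
missing_prefix = unprefixed_types(counts)
```

To run it on a real file, one could pass an open file handle (for example `open("wns/arb/wn-nodia-arb.tab", encoding="utf-8")`, a path assumed by analogy with the command above) in place of `sample`.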