Releases: omwn/omw-data

v2.0

01 Feb 01:33
f75be19

Overview

This release of OMW uses the WN-LMF 1.4 schema, which adds an index attribute on LexicalEntry elements and an n attribute on Sense elements. These attributes are only set for WNDB-derived lexicons, namely the English wordnets. They make it possible to order words and senses exactly as the original Princeton WordNet does, which improves reproducibility. In addition, exceptional forms are now only paired with forms that match in case, so, for instance, Buffalo (the city) does not get the alternative form buffaloes.
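As a rough illustration of how the new attributes might be used, the sketch below reads index from each LexicalEntry and sorts its Sense elements by n. The XML fragment is hypothetical (the ids and values are made up, not taken from the release files), and the code assumes that n is an integer rank, with lower values coming first.

```python
import xml.etree.ElementTree as ET

# Hypothetical WN-LMF 1.4 fragment; ids and values are invented for illustration.
SAMPLE = """\
<LexicalResource>
  <Lexicon id="example-en">
    <LexicalEntry id="example-en-bank-n" index="bank">
      <Lemma writtenForm="bank" partOfSpeech="n"/>
      <Sense id="example-en-bank-n-02" synset="example-en-02-n" n="2"/>
      <Sense id="example-en-bank-n-01" synset="example-en-01-n" n="1"/>
    </LexicalEntry>
  </Lexicon>
</LexicalResource>
"""


def ordered_senses(xml_text):
    """Return (index, [sense ids sorted by n]) for each LexicalEntry."""
    root = ET.fromstring(xml_text)
    result = []
    for entry in root.iter("LexicalEntry"):
        senses = entry.findall("Sense")
        # Assumption: senses without n sort after the numbered ones,
        # keeping their document order (list.sort is stable).
        senses.sort(key=lambda s: int(s.get("n")) if s.get("n") else float("inf"))
        result.append((entry.get("index"), [s.get("id") for s in senses]))
    return result


print(ordered_senses(SAMPLE))
```

For lexicons that do not set these attributes, entry.get("index") returns None and the senses stay in document order.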

This release also updates, and fixes some issues with, the MCR and Arabic wordnets, and it removes duplicate entries in many wordnets. Finally, it includes English wordnets derived from pre-3.0 versions of the Princeton WordNet:

  • WordNet 1.5
  • WordNet 1.6
  • WordNet 1.7
  • WordNet 1.7.1
  • WordNet 2.0
  • WordNet 2.1

See below for a more granular list of changes.

What's Changed

  • Update the MCR wordnets to the 2016 version by @ekaf in #27
  • Fix issues with char escapes by @goodmami in #40
  • Update wndb2lmf to build Pre-3.0 WordNets by @goodmami in #42
  • TSV cleanup scripts by @goodmami in #48
  • Update sum-rel.py to summarize-release.py by @goodmami in #44
  • Remove redundant lemmas found with clean.sh by @goodmami in #50
  • 52 allow the input file to have counts and pronunciation by @fcbond in #53
  • Allow alternative forms for Arabic by @goodmami in #56
  • Handle lexical gaps marked by GAP! or PSEUDOGAP! by @goodmami in #57
  • use more environment variables so the scripts are more portable by @fcbond in #54
  • Gh 24 unexpected identifiers by @goodmami in #59
  • Release 2.0 by @goodmami in #61
  • Release 2.0 Part 2 by @goodmami in #62
  • Bump Wn to 1.0.0rc0, fix validation call, lmf ver by @goodmami in #64

Full Changelog: v1.4...v2.0

OMW DATA 1.4

09 Nov 07:19

OMW data from version 1.0, with minor fixes in the metadata and labeling, converted to WN-LMF 1.1.

This release contains new versions of the English wordnets, based on the Princeton WordNet of English, with minor fixes (see the README for each wordnet). Notably, we have changed the names to reflect that they are not identical to the Princeton WordNet: OMW English Wordnet based on WordNet 3.0/3.1.

Note: The .zip and .tar.gz files are not part of this release, and only capture the state of this repository when the release was created. The .tar.xz files (containing WN-LMF XML files) are the contents of the OMW 1.4 release.
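To check what a downloaded .tar.xz archive contains, something like the following sketch could be used. It relies only on Python's standard tarfile module; the function name and the archive path in the usage comment are placeholders, not actual release file names.

```python
import tarfile


def list_xml_members(archive_path):
    """Return the names of regular .xml files inside a .tar.xz archive."""
    with tarfile.open(archive_path, mode="r:xz") as archive:
        return [
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".xml")
        ]


# Usage (hypothetical path):
# for name in list_xml_members("example.tar.xz"):
#     print(name)
```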

OMW 1.3

24 Nov 07:14

This release is the same as the 1.2 data but converted to WN-LMF 1.0, possibly with some cleanup of project metadata.

NOTE: The data in this release called Princeton WordNet 3.0/3.1 is inappropriately named. As it is an export, and thus a derivation, of the original WordNet data, it should not be called the Princeton WordNet or WordNet and it is not a product of Princeton University. From the next version of the OMW, it is called the OMW English Wordnet based on WordNet 3.0/3.1.

OMW 1.2 Data

06 Nov 09:19

This is a collection of the data from OMW 1.2, put here for easy access, and to have a permanent home.