2006
…
This paper reports the current Portuguese WordNet (WordNet.PT) research and development directions, which mainly concern the enrichment of the WordNet model with event and argument structures (section 1), the codification of cross-part-of-speech relations (section 2) and the exploitation of WordNet.PT in concrete applications (section 3).
2016
Semantic relations between words are key to building systems that aim to understand and manipulate language. For English, the “de facto” standard for representing this kind of knowledge is Princeton’s WordNet. Here, we describe the wordnet-like resources currently available for Portuguese: their origins, methods of creation, sizes, and usage restrictions. We start tackling the problem of comparing them, but only in quantitative terms. Finally, we sketch ideas for potential collaboration between some of the projects that produce Portuguese wordnets.
2017
This paper presents two experiments with real-world applications of word sense disambiguation, wordnets and dependency parsing. The first is an effort towards a Portuguese wordnet-annotated corpus. We manually annotated 30 sentences using OpenWordNet-PT as a lexicon and then compared the results with an automatic annotation. In addition to the system’s evaluation, the results provided valuable insights about how to deal with such an ambitious task. The second experiment deals with using Princeton WordNet as part of an NLP pipeline for information extraction from technical texts in the mining domain and the issues found while integrating word sense disambiguation with a syntactic analysis of the sentences.
In this paper we describe the process used in the production of a lexical database in Brazilian Portuguese with a structure compatible with the original Princeton WordNet 3.0. This result was reached with a tool built to automatically translate the Princeton WordNet using Google's online translation resources. We also present a survey of the main work already done in the construction of Portuguese resources using other techniques, as well as of tools in other languages that use machine translation. In conclusion, we compare our resource with the other resources in Portuguese and analyze the results.
2020
The objective of the present paper is twofold: to present the MWN.PT WordNet and to report on its construction and on the lessons learned with it. The MWN.PT WordNet for Portuguese includes 41,000 concepts, expressed by 38,000 lexical units. Its synsets were manually validated and are linked to semantically equivalent synsets of the Princeton WordNet of English, and thus transitively to the many wordnets for other languages that are also linked to this English wordnet. To the best of our knowledge, it is the largest high-quality, manually validated and cross-lingually integrated wordnet of Portuguese distributed for reuse. Its construction was initiated more than one decade ago and its description is published for the first time in the present paper. It follows a three-step methodology consisting of the manual validation and expansion of the outcome of an automatic projection procedure of synsets and their hypernym relations, followed by another automatic procedure that transferred...
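The transitive linking mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each wordnet is available as a plain mapping from its own synset identifiers to Princeton WordNet identifiers; the function name `link_via_pivot` and all identifiers (`pt-1`, `pwn-a`, `es-9`, ...) are invented for illustration and are not taken from MWN.PT or any real resource.

```python
from collections import defaultdict


def link_via_pivot(pt_to_pwn, other_to_pwn):
    """Link Portuguese synsets to synsets of other wordnets through a shared pivot.

    Both arguments map a synset id to the id of its equivalent pivot
    (Princeton WordNet) synset. All ids here are hypothetical.
    """
    # Invert the second mapping: pivot id -> set of ids in the other wordnets.
    pwn_to_other = defaultdict(set)
    for other_id, pwn_id in other_to_pwn.items():
        pwn_to_other[pwn_id].add(other_id)

    # Compose the two mappings; skip Portuguese synsets whose pivot
    # has no counterpart in the other wordnets.
    return {
        pt_id: set(pwn_to_other[pwn_id])
        for pt_id, pwn_id in pt_to_pwn.items()
        if pwn_id in pwn_to_other
    }


# Invented example data: two Portuguese synsets, two synsets from other wordnets.
PT_TO_PWN = {"pt-1": "pwn-a", "pt-2": "pwn-b"}
OTHER_TO_PWN = {"es-9": "pwn-a", "it-4": "pwn-a"}
LINKS = link_via_pivot(PT_TO_PWN, OTHER_TO_PWN)
```

Here `pt-1` ends up linked to both `es-9` and `it-4` because all three point at the same pivot synset, while `pt-2` has no counterpart and is left out.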
Language Resources and Evaluation, 2013
A wordnet is an important tool for developing natural language processing applications for a language. However, most wordnets are handcrafted by experts, which limits their growth. In this article, we propose an automatic approach to create wordnets by exploiting textual resources, dubbed ECO. After extracting semantic relation instances, identified by discriminating textual patterns, ECO discovers synonymy clusters, used as synsets, and attaches the remaining relations to suitable synsets. Besides introducing each step of ECO, we report on how it was implemented to create Onto.PT, a public lexical ontology for Portuguese. Onto.PT is the result of the automatic exploitation of Portuguese dictionaries and thesauri, and it aims to minimise the main limitations of existing Portuguese lexical knowledge bases.
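The three steps named in this abstract (relation instances, synonymy clusters used as synsets, attachment of the remaining relations) can be sketched in Python as follows. This is a simplified illustration, not the ECO implementation: it assumes relation instances are already extracted as `(word, relation, word)` triples, it swaps in plain connected components for the clustering step, and it assumes every word belongs to exactly one synset, so no disambiguation is needed when attaching relations. The function names, relation labels, and example triples are all hypothetical.

```python
from collections import defaultdict


def cluster_synonyms(triples):
    """Group words into synsets: connected components over synonymy pairs."""
    parent = {}

    def find(word):
        parent.setdefault(word, word)
        while parent[word] != word:
            parent[word] = parent[parent[word]]  # path halving
            word = parent[word]
        return word

    for a, rel, b in triples:
        root_a, root_b = find(a), find(b)  # also registers unseen words
        if rel == "synonym" and root_a != root_b:
            parent[root_a] = root_b

    groups = defaultdict(set)
    for word in list(parent):
        groups[find(word)].add(word)
    return [frozenset(group) for group in groups.values()]


def attach_relations(triples, synsets):
    """Lift non-synonymy relations from word pairs to synset pairs."""
    word_to_synset = {word: synset for synset in synsets for word in synset}
    return {
        (word_to_synset[a], rel, word_to_synset[b])
        for a, rel, b in triples
        if rel != "synonym"
    }


# Hypothetical relation instances, as if extracted from dictionary definitions.
TRIPLES = [
    ("carro", "synonym", "automóvel"),
    ("automóvel", "synonym", "viatura"),
    ("carro", "hyponym_of", "veículo"),
    ("roda", "part_of", "carro"),
]
SYNSETS = cluster_synonyms(TRIPLES)
RELATIONS = attach_relations(TRIPLES, SYNSETS)
```

With these triples, the three synonyms collapse into one synset, `veículo` and `roda` become singleton synsets, and the two remaining relations connect synsets rather than individual words. Polysemous words would need a real attachment criterion in place of the one-word-one-synset lookup assumed here.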
… of the Seventeenth International Congress of …, 2003
Abstract: This paper discusses particular linguistic challenges in the task of compiling the Brazilian Portuguese Wordnet, the Wordnet.Br. After setting the scene by overviewing methodological issues, it focuses on the basic steps taken to compile the Wordnet.Br core ...
2011
This paper reports the results of the WordNet.PT global project, an extension of WordNet.PT to all Portuguese varieties. Profiting from a theoretical model of high level explanatory adequacy and from a convenient and flexible development tool, WordNet.PT global achieves a rich and multipurpose lexical resource, suitable for contrastive studies and for a vast range of language-based applications covering all Portuguese varieties.
This document describes the current state of Onto.PT, a new, large, freely available wordnet for Portuguese, created automatically by exploiting existing lexical resources and integrating them in a wordnet structure. Besides an overview of Onto.PT, its creation and evaluation, we enumerate the developments of version 0.6. Moreover, we provide a quantitative view of this version and a comparison with other Portuguese wordnets in terms of contents and size, as well as some details about its global coverage and availability.
… Linguistics and Intelligent …, 2006
Wordnets are electronic databases developed along the same general lines as the so-called Princeton WordNet, an electronic database of English [1, 2] containing nouns, verbs, adjectives, and adverbs. This database is structured as a network of relations between ...
Int. J. Comput. Linguistics Appl., 2013
This paper reports research developed in the scope of building a wordnet for Portuguese (WordNet.PT), particularly focusing on the impact the results obtained have on the density of the network of relations and, thus, on its usability for NLP tasks. Following from basic research on different linguistic phenomena and on strategies for modeling them in relational models of the lexicon, the implementation of these results amounts to a richer resource, with new cross-PoS relations and information on event and argument structures, thus crucially contributing to accurately modeling all the main PoS in the database. We also define a way to integrate prepositions in wordnets and discuss the motivations and modeling strategies used to do so. Based on this work, we show how our contributions augment the coverage and the accuracy of WordNet.PT, by increasing the density of the network of relations, thus making it more usable for NLP applications.
Lecture Notes in Computer Science, 2010
Arxiv preprint cmp-lg/ …, 1998
Lecture Notes in Computer Science, 2006