1998, System
Reference: LEFFA, Vilson Jose. Textual Constraints In L2 Lexical Disambiguation. System, England, v. 26, n. 2, p. 183-194, 1998.
Revista Electrónica de Lingüística Aplicada, 2022
Babelfy is an online tool developed in the context of Natural Language Processing. When an item with more than one meaning is introduced into Babelfy, it selects the appropriate meaning based on the context. The objective of this research study is to test the Word Sense Disambiguation skills of Babelfy in Spanish from a linguistic approach. To do so, a descriptive and comparative study between Babelfy and native Spanish speakers was carried out. Twenty-two pairs of sentences with an ambiguous word were designed: the first sentence of each pair had a neutral context and the second a facilitating context. These sentence pairs were introduced into Babelfy to check which meaning of the ambiguous word was selected and to explore whether there were differences depending on the type of context. The results were then compared to the answers of sixty-two native Spanish speakers. The data show that the behaviour of speakers when encountering an ambiguous word is not equivalent to the way Babelfy performs Word Sense Disambiguation, especially when the context is neutral and the word has related meanings.
Proceedings of the Conference on Recent Advances in …, 2003
This paper describes an unsupervised approach for natural language disambiguation, applicable to ambiguity problems where classes of equivalence can be defined over the set of words in a lexicon. Lexical knowledge is induced from non-ambiguous words via classes of equivalence, and enables the automatic generation of annotated corpora. The only requirements are a lexicon and a raw textual corpus. The method was tested on two natural language ambiguity tasks in several languages: part of speech tagging (English, Swedish, Chinese), and word sense disambiguation (English, Romanian). Classifiers trained on automatically constructed corpora were found to have a performance comparable with classifiers that learn from expensive manually annotated data.
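The bootstrapping idea described here, learning only from contexts of words that are unambiguous in the lexicon and then applying the result to ambiguous ones, can be illustrated with a minimal sketch (the toy lexicon, corpus, and single-feature classifier below are invented for illustration; the paper's actual classifiers are more elaborate):

```python
from collections import Counter

# Hypothetical lexicon mapping words to their possible POS tags,
# and a raw (unannotated) corpus.
lexicon = {"dog": {"NOUN"}, "runs": {"VERB"}, "walk": {"NOUN", "VERB"},
           "the": {"DET"}, "she": {"PRON"}}
corpus = ["the dog runs", "she runs", "the walk", "she walk"]

# Step 1: auto-annotate only the tokens that are unambiguous in the
# lexicon, producing (preceding word, tag) training pairs for free.
training = []
for sent in corpus:
    tokens = sent.split()
    for i, tok in enumerate(tokens):
        tags = lexicon.get(tok, set())
        if len(tags) == 1:
            prev = tokens[i - 1] if i else "<s>"
            training.append((prev, next(iter(tags))))

# Step 2: a naive classifier for ambiguous tokens: choose the tag most
# often seen after the same preceding word in the auto-annotated data.
def tag_ambiguous(prev_word, candidates):
    counts = Counter(t for p, t in training
                     if p == prev_word and t in candidates)
    return counts.most_common(1)[0][0] if counts else sorted(candidates)[0]

print(tag_ambiguous("the", lexicon["walk"]))  # context like "the walk"
```

The point of the sketch is the data flow: the only inputs are a lexicon and raw text, yet an annotated training set falls out of the unambiguous tokens.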
2017
Word sense disambiguation is defined as the process of identifying the sense that a polysemous word, that is, a word with several possible meanings, adopts in the specific context of a sentence. Because an automatic system needs the meaning of every word in a text to be defined without possible ambiguity in order to understand and work with it, semantic disambiguation is a crucial aspect that cuts across every task within Natural Language Processing. The research carried out in this doctoral thesis focuses on semantic disambiguation in scenarios where texts written in several languages are available. Within these scenarios, the thesis is divided into two broad fields, according to the specific disambiguation tasks addressed: bilingual word sense disambiguation, and multilingual disambiguation in the biomedical domain. In the first ...
2004
Lexical ambiguity is common to all human languages. Indeed it is a fundamental defining characteristic of a human language: a relatively small and finite set of words is used to denote a potentially infinite space of meaning. And so we find that many words are open to different semantic interpretations depending on the context. These ...
2010
We describe a project to assign the collocations for a word, as automatically identified from a large corpus, to its distinct senses. The goal is, in the short term, a new English collocations dictionary (Macmillan Collocation Dictionary, MCD) and in the long term, a ‘disambiguating dictionary’ supporting a range of language tasks. The project is fully corpus-based, and the core infrastructure is the Sketch Engine corpus query system. We have explored both automatic methods using Word Sense Disambiguation algorithms and a computer-supported approach which integrates two new technologies: GDEX (Good Dictionary Example finding) and TBL (TickBox Lexicography). As at summer 2009, the lexicography for MCD is proceeding apace using both of these innovative methods. 1. THE DREAM OF THE DISAMBIGUATING DICTIONARY It is now commonplace to link a dictionary to electronic texts (in word processors, web browsers or other tools) so that, by clicking or hovering over a word, the user can see the e...
This paper outlines an experiment aimed at assessing the effects that tagging of multi-words in a text has on text disambiguation. The experiment was performed on the Serbian translation of Verne's novel 'Around the World in 80 Days', and consisted of two steps: in the first step we applied only resources for single words, while in the second step we included available resources for multi-word tagging. We have assessed the effects of using the multi-word resources through several measures pertaining to overall ambiguity, ambiguity of lemmas and ambiguity of grammatical categories. The results confirmed that tagging with multi-word units reduces the ambiguity of a text, which was to be expected, but also showed that despite considerable benefit obtained in specific cases the overall reduction of ambiguity was not substantial, at least for the given example and the available resources. We further analyze the possible reasons for such results and ways to improve them.
2001
We apply word sense disambiguation to the definitions in a Spanish explanatory dictionary. To calculate the scores of word senses based on the context (which in our case is the dictionary definition), we use a modification of Lesk's algorithm. The algorithm relies on a comparison between two words.
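The overlap counting at the heart of Lesk's algorithm can be sketched in a few lines of Python (a simplified Lesk, not the authors' modification; the sense inventory and glosses below are invented for illustration):

```python
def lesk(context_words, senses):
    """Pick the sense whose gloss shares the most words with the context.

    `senses` maps sense ids to gloss strings. This is the classic
    bag-of-words overlap; real systems normalize, remove stopwords, etc.
    """
    best_sense, best_score = None, -1
    context = set(w.lower() for w in context_words)
    for sense, gloss in senses.items():
        overlap = len(context & set(gloss.lower().split()))
        if overlap > best_score:
            best_sense, best_score = sense, overlap
    return best_sense

# Hypothetical senses for the ambiguous word "bank".
senses = {
    "bank_finance": "an institution that accepts deposits and lends money",
    "bank_river": "the sloping land alongside a river or stream",
}
print(lesk(["he", "deposited", "money", "at", "the", "institution"], senses))
# → bank_finance
```

When the "context" is itself a dictionary definition, as in this paper, the same overlap is computed between two glosses rather than between a gloss and running text.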
Lingvisticae Investigationes, 2001
We examine various issues faced during the elaboration of lexical disambiguators, e.g. issues related with linguistic analyses underlying disambiguators, and we exemplify these issues with grammatical constraints. We also examine computational problems and show how they are connected with linguistic problems: the influence of the granularity of tagsets, the definition of realistic and useful objectives, and the construction of the data required for the reduction of ambiguity. We show why a formalism is required for automatic ambiguity reduction, we analyse its function and we present a typology of such formalisms.
The paper is an attempt to utilize componential analysis theory, distinctive feature matrices and conceptual frames as tools for representing human knowledge. It also presents an automatic disambiguation system (ADS) that accounts for textual ambiguities at all linguistic levels. The traditional methods of tackling ambiguity are not adequate since they specify only a few constraints assisted by clues derived from the textual information. The proposed ADS makes extensive use of linguistic constraints (LCs) in the disambiguation process (DP). It also utilizes a statistical guide which is, in essence, a combination of the traditional method and recent statistical methods. It has been found that this marriage between the old (rule-based) and the new (statistical) can help solve many textual analysis problems, such as those related to error detection and knowledge inference. Furthermore, the model can serve other linguistic applications such as translation.
Computational and Corpus-Based Phraseology, 2019
Multi-word terms pose many challenges in Natural Language Processing (NLP) because of their structural ambiguity. Although the structural disambiguation of multi-word expressions, also known as bracketing, has been widely studied, no definitive solution has yet been found. Linguists, terminologists, and translators must deal with bracketing problems, yet they generally have to resolve them without advanced NLP systems. This paper describes a series of manual steps for the bracketing of multi-word terms (MWTs) based on their linguistic properties and recent advances in NLP. After analyzing 100 three- and four-term combinations, a set of criteria for MWT bracketing was devised and arranged in a step-by-step protocol based on frequency and reliability. Also presented is a case study that illustrates the procedure.
SSRN, 2022
In the present era, Natural Language Processing (NLP) is critical for improving human-machine communication. There is broad interest in processing textual data and extracting valuable, precise information from it. NLP compiles the text and sends the data to a computer for further processing. Current mathematical models in NLP lack a clear account of word meaning: the meaning of a word in context is often unclear, evoking multiple senses. Ambiguity in interpreting the precise meaning of texts hampers the spread and improvement of NLP applications such as machine translation (MT) and human-machine interfaces. The task of discovering the correct interpretation of an ambiguous word in a given sentence is known as Word Sense Disambiguation (WSD). WSD is recognized as one of natural language processing's more challenging and unsolved problems. Many ambiguities in natural languages are apparent, and researchers are working on the problem in a variety of languages to achieve good disambiguation. These ambiguities must be resolved in order to make sense of texts and advance NLP processing and applications. WSD matters for a number of NLP applications, such as Machine Translation (MT), Information Retrieval (IR), dialogue systems, Speech Synthesis (SS), and Question Answering (QA). This study compares the effectiveness of the main strategies applied to WSD: dictionary- and knowledge-based, supervised, semi-supervised, and unsupervised approaches.
1997
Abstract Word sense disambiguation has developed as a sub-area of natural language processing, as if, like parsing, it was a well-defined task which was a pre-requisite to a wide range of language-understanding applications. First, I review earlier work which shows that a set of senses for a word is only ever defined relative to a particular human purpose, and that a view of word senses as part of the linguistic furniture lacks theoretical underpinnings.
Proceedings of the Second Conference on Applied Natural Language Processing, 1988
2002
Abstract This paper explores the role of domain information in word sense disambiguation. The underlying hypothesis is that domain labels, such as MEDICINE, ARCHITECTURE and SPORT, provide a useful way to establish semantic relations among word senses, which can be profitably used during the disambiguation process. Results obtained at the SENSEVAL-2 initiative confirm that for a significant subset of words domain information can be used to disambiguate with a very high level of precision.
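The use of domain labels as a disambiguation signal can be sketched as follows (a toy illustration of the hypothesis, not the SENSEVAL-2 system; the sense inventory and domain vocabularies are invented for illustration):

```python
# Each sense of the ambiguous word "court" carries a domain label;
# the sense whose domain best matches the context wins.
SENSE_DOMAINS = {
    "court_law": "LAW",
    "court_sport": "SPORT",
}
# Hypothetical domain vocabularies standing in for real domain resources.
DOMAIN_WORDS = {
    "LAW": {"judge", "trial", "verdict", "lawyer"},
    "SPORT": {"tennis", "match", "player", "net"},
}

def disambiguate(context_words, sense_domains=SENSE_DOMAINS):
    """Score each sense by how many context words belong to its domain."""
    context = set(w.lower() for w in context_words)
    scores = {
        sense: len(context & DOMAIN_WORDS[domain])
        for sense, domain in sense_domains.items()
    }
    return max(scores, key=scores.get)

print(disambiguate(
    ["the", "judge", "entered", "the", "court", "before", "the", "trial"]))
# → court_law
```

The appeal of the approach, as the abstract notes, is precision: when the context clearly signals a domain, the domain label alone often suffices to pick the sense.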
1995
We present an approach to lexical ambiguity where regularities about sense/usage extensibility are represented by underspecifying word entries through lexical polymorphism. Word disambiguation is carried out using contextual information gathered during language processing to ground polymorphic lexical entries.
ITAT, 2017
We describe an annotation experiment combining topics from lexicography and Word Sense Disambiguation. It involves a lexicon (Pattern Dictionary of English Verbs, PDEV), an existing data set (VPS-GradeUp), and an unpublished data set (RTE in PDEV Implicatures). The aim of the experiment was twofold: a pilot annotation of Recognizing Textual Entailment (RTE) on PDEV implicatures (lexicon glosses) on the one hand, and, on the other hand, an analysis of the effect of Textual Entailment between lexicon glosses on annotators' Word-Sense-Disambiguation decisions, compared to other predictors, such as finiteness of the target verb, the explicit presence of its relevant arguments, and the semantic distance between corresponding syntactic arguments in two different patterns (dictionary senses).
Proceedings of the 16th Conference on Computational Linguistics, 1996
In this paper we sketch a decidable inference-based procedure for lexical disambiguation which operates on semantic representations of discourse and conceptual knowledge. In contrast to other approaches which use a classical logic for the disambiguating inferences and run into decidability problems, we argue on the basis of empirical evidence that the underlying inference mechanism has to be essentially incomplete in order to be (cognitively) adequate. Since our conceptual knowledge can be represented in a rather restricted representation language, it is then possible to show that the restrictions satisfied by the conceptual knowledge and the inferences ensure, in an empirically adequate way, the decidability of the problem, although a fully expressive language is used to represent discourse.
Computational Linguistics, 2001
Word sense disambiguation (WSD) is a computational linguistics task likely to benefit from the tradition of combining different knowledge sources in artificial intelligence research. An important step in the exploration of this hypothesis is to determine which linguistic knowledge sources are most useful and whether their combination leads to improved results. We present a sense tagger which uses several knowledge sources. Tested accuracy exceeds 94% on our evaluation corpus. Our system attempts to disambiguate all content words in running text rather than limiting itself to treating a restricted vocabulary of words. It is argued that this approach is more likely to assist the creation of practical systems.
2018
In natural language processing (NLP), word sense disambiguation (WSD) is an automatic process carried out by a machine to identify the appropriate meaning of a word in a particular context or discourse. Natural language is ambiguous, so many words may be interpreted in multiple ways depending on the context in which they occur. In this paper, we discuss the ambiguity of words in natural languages and the essential measures for dealing with ambiguous words.