2002
Abstract This paper explores the contribution of a broad range of syntactic features to WSD: grammatical relations coded as the presence of adjuncts/arguments in isolation or as subcategorization frames, and instantiated grammatical relations between words. We have tested the performance of syntactic features using two different ML algorithms (Decision Lists and AdaBoost) on the Senseval-2 data. Adding syntactic features to a basic set of traditional features improves performance, especially for AdaBoost.
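To make the feature combination concrete, here is a minimal sketch, not the authors' code, of how grammatical-relation features could be added to a basic bag-of-words representation and fed to AdaBoost; the instance fields and toy data are assumptions of this illustration:

```python
# Hedged sketch: traditional bag-of-words features plus syntactic features
# (relation presence and instantiated relations), trained with AdaBoost.
# The instance fields "context" and "deps" and the toy data are assumptions.
from sklearn.ensemble import AdaBoostClassifier
from sklearn.feature_extraction import DictVectorizer

def features(instance):
    feats = {f"bow={w}": 1 for w in instance["context"]}            # traditional
    feats.update({f"rel={r}": 1 for r, _ in instance["deps"]})      # relation present
    feats.update({f"rel={r}:{l}": 1 for r, l in instance["deps"]})  # instantiated relation
    return feats

train = [({"context": ["river", "water"], "deps": [("nmod", "river")]}, "bank/shore"),
         ({"context": ["loan", "money"], "deps": [("nmod", "loan")]}, "bank/finance")]

vec = DictVectorizer()
X = vec.fit_transform([features(i) for i, _ in train])
y = [sense for _, sense in train]
clf = AdaBoostClassifier(n_estimators=50).fit(X, y)
```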
2004
Although syntactic features offer more specific information about the context surrounding a target word in a Word Sense Disambiguation (WSD) task, in general they have not distinguished themselves much above positional features such as bag-of-words. In this paper we offer two methods for increasing the recall rate when using syntactic features on the WSD task: 1) using an algorithm for discovering in the corpus every possible syntactic feature involving a target word, and 2) using wildcards in place of the lemmas in the templates of the syntactic features. In the best experimental results on the SENSEVAL-2 data we achieved an F-measure of 53.1%, well above the 44.2% mean F-measure of the official SENSEVAL-2 entries. These results are encouraging considering that only one kind of feature is used and only a simple Support Vector Machine (SVM) running with the defaults is used for the machine learning.
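The wildcard idea lends itself to a compact illustration. The following sketch, under the assumption that each instance comes with dependency triples involving the target word, emits both the lexicalized feature and its wildcard variants, then trains a default SVM as the abstract describes; all names and data are illustrative:

```python
# Hedged sketch of wildcard syntactic-feature templates feeding a default SVM.
# Dependency triples and toy data are assumptions of this illustration.
from sklearn.feature_extraction import DictVectorizer
from sklearn.svm import SVC

def syntactic_features(deps):
    """deps: (relation, head_lemma, dependent_lemma) triples for the target."""
    feats = {}
    for rel, head, dep in deps:
        feats[f"{rel}({head},{dep})"] = 1  # fully lexicalized feature
        feats[f"{rel}({head},*)"] = 1      # wildcard over the dependent
        feats[f"{rel}(*,{dep})"] = 1       # wildcard over the head
    return feats

train = [([("obj", "catch", "bass")], "bass/fish"),
         ([("obj", "play", "bass")], "bass/instrument")]

vec = DictVectorizer()
X = vec.fit_transform([syntactic_features(d) for d, _ in train])
y = [s for _, s in train]
clf = SVC().fit(X, y)  # SVM with the defaults, as in the abstract
```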
2006
In the Natural Language Processing (NLP) community, Word Sense Disambiguation (WSD) has been described as the task of selecting the appropriate meaning (sense) for a given word in a text or discourse, where this meaning is distinguishable from other senses potentially attributable to that word. These senses can be seen as the target labels of a classification problem; that is, Machine Learning (ML) offers a natural way to tackle it.
2004
The success of supervised learning approaches to word sense disambiguation is largely dependent on the features used to represent the context in which an ambiguous word occurs. Previous work has reached mixed conclusions; some suggest that combinations of syntactic and lexical features will perform most effectively. However, others have shown that simple lexical features perform well on their own. This paper evaluates the effect of using different lexical and syntactic features both individually and in combination. We show that it is possible for a very simple ensemble that utilizes a single lexical feature and a sequence of part of speech features to result in disambiguation accuracy that is near state of the art.
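A minimal sketch of this kind of simple ensemble, assuming one classifier over co-occurring words and one over a window of part-of-speech tags, combined by summing their predicted probabilities (the member models and toy data are assumptions, not the paper's exact setup):

```python
# Hedged sketch: a two-member ensemble, one view per classifier, combined
# by summing predicted probabilities. Member models and data are assumptions.
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.naive_bayes import MultinomialNB

train = [(["river", "fish"], ["DT", "NN", "VBD"], "bass/fish"),
         (["guitar", "play"], ["DT", "NN", "VBZ"], "bass/instrument")]

lex_vec, pos_vec = DictVectorizer(), DictVectorizer()
X_lex = lex_vec.fit_transform([{w: 1 for w in lex} for lex, _, _ in train])
X_pos = pos_vec.fit_transform([{f"pos{i}={t}": 1 for i, t in enumerate(pos)}
                               for _, pos, _ in train])
y = [s for _, _, s in train]

lex_clf = MultinomialNB().fit(X_lex, y)
pos_clf = MultinomialNB().fit(X_pos, y)

def ensemble_predict(X_lex, X_pos):
    # Sum the members' probabilities and take the argmax.
    probs = lex_clf.predict_proba(X_lex) + pos_clf.predict_proba(X_pos)
    return lex_clf.classes_[np.argmax(probs, axis=1)]
```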
Information, 2019
The paper presents a flexible system for extracting features and creating training and test examples for solving the all-words word sense disambiguation (WSD) task. The system allows integrating word and sense embeddings as part of an example description. The system possesses two unique features distinguishing it from all similar WSD systems: the ability to construct a special compressed representation for word embeddings and the ability to construct training and test sets of examples with different data granularity. The first feature allows generation of data sets with quite small dimensionality, which can be used for training highly accurate classifiers of different types. The second feature allows generating sets of examples that can be used for training classifiers specialized in disambiguating a concrete word, words belonging to the same part-of-speech (POS) category, or all open-class words. Intensive experimentation has shown that classifiers trained on examples created by the system outperform...
This paper shows that our WSD system using rich linguistic features achieved high accuracy in the classification of English SENSEVAL-2 verbs for both fine-grained (64.6%) and coarse-grained (73.7%) senses. We describe three specific enhancements to our treatment of rich linguistic features and present their separate and combined contributions to our system's performance. Further experiments showed that our system had robust performance on test data without high-quality rich features.
The Proceedings of the Second International Workshop on Evaluating Word Sense Disambiguation Systems, 2001
This paper describes a descriptive semantic-primitive-based method for word sense disambiguation (WSD) with a machine-tractable dictionary and conceptual distance data among primitives. The approach uses an unsupervised learning algorithm and focuses only on the immediately surrounding words and the base morphological form to disambiguate a word sense. This agrees with past observations that humans require only a small window of a few words to perform WSD (Choueka & Lusignan, 1985). In addition, the paper describes our experience in the English all-words task in SENSEVAL-2 and discusses the results of the SENSEVAL-2 evaluation. Apart from the description of the current system, possibilities for future work are explored. The system consists of three components: a machine-tractable dictionary, conceptual distance data, and a sense tagger that uses a simple summation algorithm.
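The simple summation algorithm can be sketched directly: pick the sense whose defining primitives are, in total, conceptually closest to the primitives found in the context. The distance table and sense inventory below are toy stand-ins for the machine-tractable dictionary and conceptual distance data:

```python
# Hedged sketch of the simple summation algorithm: choose the sense whose
# defining primitives are, in total, closest to the context's primitives.
# The distance table and sense inventory are toy assumptions.
DIST = {("animate", "natural"): 1.0, ("artifact", "natural"): 3.0,
        ("animate", "artifact"): 3.0}

def distance(p, q):
    if p == q:
        return 0.0
    return DIST.get((p, q)) or DIST.get((q, p), 5.0)

def best_sense(senses, context_primitives):
    """senses: {sense_id: [defining primitives]}."""
    def score(prims):
        return sum(distance(p, q) for p in prims for q in context_primitives)
    return min(senses, key=lambda s: score(senses[s]))

senses = {"crane/bird": ["animate"], "crane/machine": ["artifact"]}
print(best_sense(senses, ["natural", "animate"]))  # -> crane/bird
```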
Lecture Notes in Computer Science, 2000
In this article we compare the performance of various machine learning algorithms on the task of constructing word-sense disambiguation rules from data. The distinguishing characteristic of our work from most of the related work in the field is that we aim at the disambiguation of all content words in the text, rather than focusing on a small number of words. In an earlier study we showed that a decision tree induction algorithm performs well on this task. This study compares decision tree induction with other popular learning methods and discusses their advantages and disadvantages. Our results confirm the good performance of decision tree induction, which outperforms the other algorithms owing to its ability to order the features used for disambiguation according to their contribution in assigning the correct sense.
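The feature-ordering property the authors credit for the good performance is easy to see in a trained tree: the most discriminative feature ends up at the root. A small illustrative sketch with scikit-learn (data and feature names are assumptions):

```python
# Hedged sketch of the feature-ordering property: the induced tree places
# the most discriminative feature at the root, which export_text makes
# visible. Data and feature names are toy assumptions.
from sklearn.feature_extraction import DictVectorizer
from sklearn.tree import DecisionTreeClassifier, export_text

train = [({"left=river": 1}, "bank/shore"),
         ({"left=money": 1}, "bank/finance"),
         ({"left=river": 1, "pos=NN": 1}, "bank/shore"),
         ({"left=loan": 1, "pos=NN": 1}, "bank/finance")]

vec = DictVectorizer(sparse=False)
X = vec.fit_transform([f for f, _ in train])
y = [s for _, s in train]

tree = DecisionTreeClassifier().fit(X, y)
print(export_text(tree, feature_names=list(vec.get_feature_names_out())))
```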
Computing Research Repository, 2000
The most effective paradigm for word sense disambiguation, supervised learning, seems to be stuck because of the knowledge acquisition bottleneck. In this paper we present an in-depth study of the performance of decision lists on two publicly available corpora and an additional corpus automatically acquired from the Web, using the fine-grained, highly polysemous senses in WordNet. Decision lists are shown to be a versatile state-of-the-art technique. The experiments reveal, among other facts, that SemCor can be an acceptable (0.7 precision for polysemous words) starting point for an all-words system. The results on the DSO corpus show that for some highly polysemous words 0.7 precision seems to be the current state-of-the-art limit. On the other hand, independently constructed hand-tagged corpora are not mutually useful, and a corpus automatically acquired from the Web is shown to fail.
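For readers unfamiliar with the technique, here is a minimal Yarowsky-style decision list, assuming binary features and smoothed log-likelihood-ratio ranking; it is a generic sketch, not the paper's implementation:

```python
# Hedged sketch of a Yarowsky-style decision list: rules ranked by smoothed
# log-likelihood ratio; the first matching rule decides. Generic sketch.
import math
from collections import defaultdict

def train_decision_list(instances, alpha=0.1):
    """instances: list of (set_of_features, sense) pairs."""
    counts, senses = defaultdict(lambda: defaultdict(float)), set()
    for feats, sense in instances:
        senses.add(sense)
        for f in feats:
            counts[f][sense] += 1
    rules = []
    for f, by_sense in counts.items():
        for s in senses:
            p = by_sense[s] + alpha
            q = sum(by_sense[t] for t in senses if t != s) + alpha
            rules.append((math.log(p / q), f, s))
    return sorted(rules, reverse=True)  # strongest evidence first

def classify(rules, feats, default):
    for _, f, s in rules:
        if f in feats:
            return s
    return default

rules = train_decision_list([({"fish", "river"}, "bass/fish"),
                             ({"guitar"}, "bass/instrument")])
print(classify(rules, {"river"}, default="bass/fish"))
```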
Congreso de la SEPLN, 2004
Abstract: The paper addresses the issue of how to use linguistic information in Word Sense Disambiguation (WSD). We introduce a knowledge-driven and unsupervised WSD method that requires only a large corpus previously tagged with POS and very little grammatical knowledge. The WSD process is performed taking into account the syntactic patterns in which the ambiguous occurrence appears, relying on the hypothesis of “almost one sense per syntactic pattern”. This integration allows us to obtain, from corpora, paradigmatic and ...
Abstract Word Sense Disambiguation (WSD) is one of the most important open problems in Natural Language Processing. One of the most successful current lines of research in WSD is the corpus-based approach, in which machine learning algorithms are applied to learn statistical models or classifiers from corpora. When a machine learning approach learns from previously semantically annotated corpora it is said to be supervised, whereas when it does not use sense tagged data during training it is called unsupervised.
In this paper, we describe our experiments on statistical word sense disambiguation (WSD) using two systems based on different approaches: Naïve Bayes on word tokens and Maximum Entropy on local syntactic and semantic features. In the first approach, we consider a context window and a sub-window within it around the word to disambiguate. Within the outer window, only content words are considered, but within the sub-window, all words are taken into account. Both window sizes are tuned by the system for each word to disambiguate, and accuracies of 75% and 67% were obtained for coarse- and fine-grained evaluations, respectively. In the second system, sense resolution is done using an approximate syntactic structure as well as the semantics of neighboring nouns as features to a Maximum Entropy learner. Accuracies of 70% and 63% were obtained for coarse- and fine-grained evaluations.
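The two-window scheme of the first system can be sketched as follows, assuming fixed window sizes and a toy stopword list (the paper tunes both sizes per word):

```python
# Hedged sketch of the two-window context extraction: all words inside the
# sub-window, content words only in the outer window. Window sizes and the
# stopword list are toy assumptions; the paper tunes the sizes per word.
STOPWORDS = frozenset({"the", "a", "of", "at", "on"})

def context_features(tokens, i, outer=10, inner=3):
    feats = []
    for j, w in enumerate(tokens):
        if j == i:
            continue
        d = abs(j - i)
        if d <= inner:
            feats.append(w)                       # every word in the sub-window
        elif d <= outer and w not in STOPWORDS:
            feats.append(w)                       # content words only outside it
    return feats

tokens = "the interest on the loan grew at the bank".split()
print(context_features(tokens, tokens.index("bank")))
```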
2015
In this paper we present an approach for the enrichment of WSD knowledge bases with data-driven relations from a gold standard corpus (annotated with word senses, valency information, syntactic analyses, etc.). We focus on Bulgarian as a use case, but our approach is scalable to other languages as well. For the purpose of exploring such methods, the Personalized PageRank algorithm was used. The reported results show that the addition of new knowledge improves the accuracy of WSD by approximately 10.5%.
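As a rough illustration of how Personalized PageRank scores candidate senses over such a knowledge base, here is a sketch with networkx; the graph, node names, and seeding scheme are toy assumptions:

```python
# Hedged sketch of Personalized PageRank over a sense graph with networkx:
# context senses seed the personalization vector, and candidate senses are
# scored by their resulting rank. Graph and node names are toy assumptions.
import networkx as nx

G = nx.Graph()
G.add_edges_from([("bank#1", "river#1"), ("river#1", "water#1"),
                  ("bank#2", "money#1"), ("money#1", "loan#1")])

def disambiguate(graph, candidate_senses, context_senses):
    seed = {n: (1.0 if n in context_senses else 0.0) for n in graph}
    pr = nx.pagerank(graph, personalization=seed)
    return max(candidate_senses, key=lambda s: pr.get(s, 0.0))

print(disambiguate(G, ["bank#1", "bank#2"], {"river#1", "water#1"}))  # -> bank#1
```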
Journal of Natural Language Processing, 2009
Traditionally, many researchers have addressed word sense disambiguation (WSD) as an independent classification problem for each word in a sentence. The problem with these approaches is that they disregard the interdependencies of word senses. Additionally, since they construct an individual sense classifier for each word, their applicability is limited to the word senses for which training instances are available. In this paper, we propose a supervised WSD model based on the syntactic dependencies of word senses. In particular, we assume that strong dependencies exist between the sense of a syntactic head and those of its dependents. We model these dependencies with tree-structured conditional random fields (T-CRFs), and obtain the most appropriate assignment of senses optimized over the sentence. Furthermore, we incorporate these sense dependencies in combination with various coarse-grained sense tag sets, which are expected to relieve the data sparseness problem and enable our model to work even for words that do not appear in the training data. In experiments, we demonstrate the appropriateness of considering the syntactic dependencies of senses, as well as the improvements gained by the use of coarse-grained tag sets. The performance of our model is shown to be comparable to those of state-of-the-art WSD systems. We also present an in-depth analysis of the effectiveness of the sense dependency features through intuitive examples.
Proceedings of the 2nd workshop on Learning language in logic and the 4th conference on Computational natural language learning, 2000
This paper describes a set of comparative experiments, including cross-corpus evaluation, between five alternative algorithms for supervised Word Sense Disambiguation (WSD), namely Naive Bayes, Exemplar-based learning, SNoW, Decision Lists, and Boosting. Two main conclusions can be drawn: 1) The LazyBoosting algorithm outperforms the other four state-of-the-art algorithms in terms of accuracy and ability to tune to new domains; 2) The domain dependence of WSD systems seems very strong and suggests that some kind of adaptation or tuning is required for cross-corpus application.
Revista Espanola De Linguistica Aplicada, 2009
This paper presents an algorithm based on collocational data for word sense disambiguation (WSD). The aim of this algorithm is to maximize efficiency by minimizing (1) computational costs and (2) linguistic tagging/annotation. The formalization of our WSD algorithm is based on discriminant function analysis (DFA). This statistical technique allows us to parameterize each collocational item with its meaning, using just bare text. The parameterized data allow us to classify cases (sentences with an ambiguous word) into the values of a categorical dependent variable (each of the meanings of the ambiguous word). To evaluate the validity and efficiency of our WSD algorithm, we first hand sense-tagged all the sentences containing ambiguous words and then cross-validated the hand sense-tagged data against the automatic WSD performance. Finally, we present the global results of our algorithm after applying it to a limited set of words in two languages, Spanish and English, highlighting the points...
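The DFA formalization maps naturally onto scikit-learn's LinearDiscriminantAnalysis: collocational counts act as predictors and the sense is the categorical dependent variable. A toy sketch, not the authors' implementation:

```python
# Hedged sketch of the DFA formalization with scikit-learn: collocational
# counts are the predictors, the sense is the categorical dependent
# variable. Data are toy assumptions.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

# rows: sentences; columns: counts of two collocates of the ambiguous word
X = np.array([[2, 0], [1, 0], [0, 2], [0, 1]], dtype=float)
y = ["sense_a", "sense_a", "sense_b", "sense_b"]

lda = LinearDiscriminantAnalysis().fit(X, y)
print(lda.predict([[1, 0]]))  # -> ['sense_a']
```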
Workshop Programme, 2004
Word Sense Disambiguation confronts the lack of syntagmatic information associated with word senses: the “gap” between the lexicon (here EuroWordNet, EWN) and the corpus. In the present work we propose to fill this gap by applying different strategies: on one side, we extract paradigmatic information related to the ambiguous occurrence in a syntactic pattern from corpora and incorporate it into the WSD process; on the other side, we derive discriminatory sets of senses from EWN for the ambiguous word and so ...
Natural Language Engineering, 2002
Has system performance on Word Sense Disambiguation (WSD) reached a limit? Automatic systems don't perform nearly as well as humans on the task, and from the results of the SENSEVAL exercises, recent improvements in system performance appear negligible or even negative. Still, systems do perform much better than the baselines, so something is being done right. System evaluation is crucial to explain these results and to show the way forward. Indeed, the success of any project in WSD is tied to the evaluation methodology used, and especially to the formalization of the task that the systems perform. The evaluation of WSD has turned out to be as difficult as designing the systems in the first place.
Meeting of the Association for Computational Linguistics, 2004
Supervised learning methods for WSD yield better performance than unsupervised methods. Yet the availability of clean training data for the former is still a severe challenge. In this paper, we present an unsupervised bootstrapping approach for WSD which exploits huge amounts of automatically generated noisy data for training within a supervised learning framework. The method is evaluated using the ...
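The general shape of such a bootstrapping loop, training on noisy seed data and folding in only confident automatic labels, might look like the following sketch; the classifier choice, confidence threshold, and data are assumptions:

```python
# Hedged sketch of the bootstrapping loop: fit on noisy seed data, label raw
# examples, keep only confident predictions as new training material.
# Threshold, classifier, and data are assumptions of this illustration.
import numpy as np
from sklearn.naive_bayes import MultinomialNB

def bootstrap(clf, X_seed, y_seed, X_raw, threshold=0.9, rounds=3):
    X, y = X_seed, np.asarray(y_seed)
    for _ in range(rounds):
        clf.fit(X, y)
        if len(X_raw) == 0:
            break
        probs = clf.predict_proba(X_raw)
        confident = probs.max(axis=1) >= threshold
        if not confident.any():
            break
        X = np.vstack([X, X_raw[confident]])
        y = np.concatenate([y, clf.classes_[probs.argmax(axis=1)][confident]])
        X_raw = X_raw[~confident]  # don't relabel the same examples
    return clf

clf = bootstrap(MultinomialNB(),
                X_seed=np.array([[3, 0], [0, 2]]), y_seed=["sense_a", "sense_b"],
                X_raw=np.array([[2, 0], [0, 3], [1, 1]]))
```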
2018
The task of Word Sense Disambiguation (WSD) is to determine the meaning of an ambiguous word in a given context. In spite of its importance for most NLP pipelines, WSD can still be seen to be unsolved. The reason is that we currently lack tools for WSD that handle big data – “big” in terms of the number of ambiguous words and in terms of the overall number of senses to be distinguished. This desideratum is exactly the objective of fastSense, an efficient neural network-based tool for word sense disambiguation introduced in this paper. We train and test fastSense by means of the disambiguation pages of the German Wikipedia. In addition, we evaluate fastSense in the context of Senseval and SemEval. By reference to Senseval and SemEval we additionally perform a parameter study. We show that fastSense can process huge amounts of data quickly and also surpasses state-of-the-art tools in terms of F-measure.
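fastSense itself is not reproduced here, but the general recipe, a feedforward network over sparse context features trained per ambiguous word, can be sketched briefly; every name and parameter below is an illustrative assumption:

```python
# Hedged sketch, not fastSense itself: a feedforward network over hashed
# sparse context features, one classifier per ambiguous word. All names,
# parameters, and data are illustrative assumptions.
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.neural_network import MLPClassifier

contexts = ["the bass swam upstream", "he tuned the bass before the gig"]
senses = ["bass/fish", "bass/instrument"]

vec = HashingVectorizer(n_features=2**16)   # fixed-size sparse input layer
X = vec.transform(contexts)
clf = MLPClassifier(hidden_layer_sizes=(100,), max_iter=500).fit(X, senses)
print(clf.predict(vec.transform(["a bass jumped out of the water"])))
```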
2007
We describe two systems participating in the English Lexical Sample task in SemEval-2007. The systems make use of Inductive Logic Programming for supervised learning in two different ways: (a) to build Word Sense Disambiguation (WSD) models from a rich set of background knowledge sources; and (b) to build interesting features from the same knowledge sources, which are then used by a standard model-builder for WSD, namely Support Vector Machines. Both systems achieved comparable accuracy (0.851 and 0.857), which considerably outperforms the most frequent sense baseline (0.787).