2008
This paper describes the building of a valency lexicon of Arabic verbs using a morphologically and syntactically annotated corpus, the Prague Arabic Dependency Treebank, as its primary source. We present the theoretical account of valency developed within the Functional Generative Description theory. We apply the framework to Arabic and discuss various valency-related phenomena with respect to examples from the corpus. We then outline the methodology and the linguistic and technical resources used in building the lexicon. Valency lexicons can find application in automatic parsing as well as in language generation.
In this paper, we present a model of the syntactic lexicon for Arabic verbs based on the Lexical Markup Framework. This ISO standard lets us describe lexical information in a simple way using general guidelines and enables the sharing of resources that follow the standard. We discuss the syntactic information associated with verbs and the model we propose to structure and represent the entries within the lexicon. To study the usability of the model, we implemented a rule-based system that translates an LMF syntactic resource into Type Description Language. The generated lexicon is used as input for a previously written HPSG grammar for Arabic built within the Linguistic Knowledge Builder platform. Finally, we discuss improvements in parsing results and possible perspectives of this work.
Empirical pedagogical dictionaries aim at defining words in their context and presenting corpus-based evidence for each word. They are meant to teach language learners how to use a word correctly. Valency, which describes the arguments of a verb syntactically and semantically, is of particular importance to pedagogical dictionaries. Unfortunately, Arabic lacks corpus-based valency resources. This paper therefore proposes a monolingual corpus-based valency dictionary for Arabic learners, covering fighting verbs. The dictionary explores the valency of fighting verbs in the Arabic TenTen corpus uploaded to Sketch Engine. The compilation method relies both on the automatic word sketch function, to identify the lexico-syntactic patterns of verbs, and on a three-layer manual annotation of corpus-driven examples, to consolidate the results. Each verb entry in the dictionary displays (a) the number, (b) the phrase type, (c) the semantic role, and (d) the grammatical function of its arguments, together with (e) a definition of each of its senses. At least three annotated examples are provided for each verb sense to illustrate its authentic usage. By integrating semantic and syntactic information, the dictionary facilitates effective learning of new Arabic vocabulary.
Valency lexicons are valuable resources for natural language processing. The need for new language resources encourages researchers to collect new datasets, and valency lexicons are among the most important of these. In a valency lexicon, information about the obligatory and optional complements of words is annotated at the syntactic and semantic levels. In this paper, we report the development of the first syntactic valency lexicon of Persian verbs. This lexicon is part of the Persian Dependency Treebank Project. The lexicon consists of 4282 distinct verb lemmas and 5429 distinct verb-valency pairs.
Lexical Functional Grammar (LFG) plays a vital role in the area of Natural Language Processing (NLP). LFG is considered a constraint-based philosophy of grammar. C-structure and F-structure are its two basic forms. We have found from the existing literature that LFG has not been studied in detail, which encouraged us to undertake this study. This study gives a brief history of LFG along with its architecture. The Arabic language, along with its parsing techniques, is presented. Moreover, this study addresses the role that LFG has played in resolving various NLP issues. New trends identified while conducting this survey are presented as directions for further research.
2010
MAGEAD is a morphological analyzer and generator for Modern Standard Arabic (MSA) and its dialects. We introduced MAGEAD in previous work with an implementation of MSA and Levantine Arabic verbs. In this paper, we port that system to MSA nominals (nouns and adjectives), which are far more complex to model than verbs. Our system is a functional morphological analyzer and generator, i.e., it analyzes to and generates from a representation consisting of a lexeme and linguistic feature-value pairs, where the features are syntactically (and perhaps semantically) meaningful, rather than just morphologically. A detailed evaluation of the current implementation comparing it to a commonly used morphological analyzer shows that it has good morphological coverage with precision and recall scores in the 90s. An error analysis reveals that the majority of recall and precision errors are problems in the gold standard or a result of the discrepancy between different models of form-based/functional morphology.
Proceedings of the Seventh conference on …, 2010
In this article I present a lexicon for Arabic verbs which exploits Levin's verb classes (Levin, 1993) and the basic development procedure used by Schuler (2005). The verb lexicon in its current state has 173 classes which contain 4392 verbs and 498 frames, providing information about the verb root, the deverbal form of the verb, the participle, thematic roles, subcategorisation frames, and syntactic and semantic descriptions of each verb. The taxonomy is available in XML format. It can be ported to MySQL, YAML or JSON and accessed either in Arabic characters or in the Buckwalter transliteration.
WoLeR 2011 at ESSLLI International Workshop on Lexical Resources – Ljubljana, Slovenia, 2011
We describe a lexicon of Arabic verbs constructed on the basis of Semitic patterns and used in a resource-based method of morphological annotation of written Arabic text. The annotated output is a graph of morphemes with accurate linguistic information. An enhanced FST implementation for Semitic languages was created. This system is also adapted for generating inflected forms. The language resources can be easily updated. The lexicon comprises 15 400 verbal entries. We propose an inflectional taxonomy that increases the readability and maintainability of the lexicon for Arabic speakers and linguists. Traditional grammar defines inflectional verbal classes by using verbal pattern-classes and root-classes, related to the nature of each of the triliteral root-consonants. Verbal pattern-classes are clearly defined but root-classes are complex. In our taxonomy, traditional pattern-classes are reused and root-classes are simply redefined. Our taxonomy provides a straightforward encoding scheme for inflectional variations and orthographic adjustments due to assimilation and agglutination. We have tested and evaluated our resource against 10 000 diacriticized verb occurrences in the Nemlar corpus and compared it to Buckwalter resources. The lexical coverage is 99.9 %, and a laptop needs two minutes to generate and compress the inflected lexicon of 2.5 million forms into 4 megabytes.
2016
Idafa in traditional Arabic grammar is an umbrella construction that covers several phenomena, including what is expressed in English as noun-noun compounds and Saxon and Norman genitives. Additionally, Idafa participates in some other constructions, such as quantifiers, quasi-prepositions, and adjectives. Identifying the various types of the Idafa construction (IC) is of importance to Natural Language Processing (NLP) applications. Noun-noun compounds exhibit special behavior in most languages, impacting their semantic interpretation. Hence, distinguishing them could have an impact on downstream NLP applications. The most comprehensive syntactic representation of the Arabic language is the LDC Arabic Treebank (ATB). In the ATB, ICs are not explicitly labeled and, furthermore, there is no distinction between ICs of noun-noun relations and other traditional ICs. Hence, we devise a detailed syntactic and semantic typification process of the IC phenomenon in Arabic. We target the ATB as a ...
Journal of Applied Language and Culture Studies, 2018
HAL (Le Centre pour la Communication Scientifique Directe), 2014
الفعل في العربية بين الصرف والدلالة والتركيب [The Verb in Arabic between Morphology, Semantics, and Syntax], 2024