2014, Proceedings of the EMNLP 2014 Workshop on Arabic Natural Language Processing (ANLP)
…
6 pages
Traditional keyword-based search has limitations, such as word sense ambiguity and query intent ambiguity, which can hurt precision. Semantic search uses the contextual meaning of terms, together with semantic matching techniques, to overcome these limitations. This paper introduces a query expansion approach that uses an ontology built from Wikipedia pages, in addition to other thesauri, to improve search accuracy for the Arabic language. Our approach outperformed the traditional keyword-based approach in terms of both F-score and NDCG measures.
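For readers unfamiliar with the two measures named in the abstract, the following is a minimal sketch of how an F-score and NDCG can be computed for a single query. It assumes set-based precision and recall and a DCG with linear gains and a log2 rank discount; the abstract does not say which variants the authors used, and the function names (`f_score`, `dcg`, `ndcg`) are illustrative only.

```python
import math


def f_score(retrieved, relevant, beta=1.0):
    """F-beta score from retrieved and relevant document ids (assumed set-based)."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def dcg(gains):
    """Discounted cumulative gain: graded relevance discounted by log2(rank + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains):
    """DCG of the ranking divided by the DCG of the ideal (sorted) ranking."""
    ideal = dcg(sorted(gains, reverse=True))
    return dcg(gains) / ideal if ideal > 0 else 0.0
```

Here `gains` is the list of graded relevance judgments in the order the system returned the documents, so a ranking that is already sorted by relevance scores 1.0.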
International Journal of Advanced Computer Science and Applications
Semantic resources are important components of Information Retrieval (IR) applications such as search engines and Question Answering (QA); these resources should be available, readable, and understandable. In the semantic web, the ontology plays a central role in information retrieval, where it is used to retrieve more relevant information from unstructured information. This paper presents a semantic-based retrieval system for Arabic text that expands the input query semantically using an Arabic domain ontology. In the proposed approach, the search engine index is represented using the Vector Space Model (VSM), and an Arabic place-nouns domain ontology is used, constructed from an Arabic corpus and implemented in the Web Ontology Language (OWL). The approach was evaluated on the Arabic Quran corpus, and the experiments show that it outperforms traditional keyword-based methods in terms of both precision and recall.
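The combination of a vector space index with ontology-driven query expansion can be illustrated roughly as follows. This is a sketch under stated assumptions, not the system described in the abstract: the TF-IDF weighting, the cosine ranking, the dictionary standing in for the OWL ontology, and all function names are hypothetical choices made for illustration.

```python
import math
from collections import Counter


def tf_idf_vectors(docs):
    """Build TF-IDF vectors (as dicts) for tokenized documents; returns (vectors, idf)."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log(n / df[t]) + 1.0 for t in df}
    vectors = [{t: c * idf[t] for t, c in Counter(doc).items()} for doc in docs]
    return vectors, idf


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def expand(query_terms, ontology):
    """Append the terms that the (toy) ontology relates to each query term."""
    expanded = list(query_terms)
    for term in query_terms:
        for related in ontology.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def search(query_terms, docs, ontology=None):
    """Rank documents by cosine similarity; returns indices of matching documents."""
    vectors, idf = tf_idf_vectors(docs)
    terms = expand(query_terms, ontology) if ontology else list(query_terms)
    query = {t: c * idf.get(t, 0.0) for t, c in Counter(terms).items()}
    scores = [(cosine(query, vec), i) for i, vec in enumerate(vectors)]
    return [i for s, i in sorted(scores, key=lambda x: (-x[0], x[1])) if s > 0]
```

With a toy mapping such as `{"makkah": ["bakkah"]}` (invented for this example), a query for `makkah` also matches documents that contain only `bakkah`, which is the recall gain that expansion is meant to provide.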
Journal of Big Data
Introduction. Information retrieval (IR) is an active research field that aims to extract the most relevant documents from large datasets. The user query plays an important role in this process. Numerous efforts have been made to retrieve relevant documents written in English. Nevertheless, the Arabic language has not received the effort it deserves, owing to some inherent difficulties with the language itself. In fact, Arabic is one of the richest human languages in its terms, its variety of sentence constructions, and its diversity of meaning [1]. A sentence in Arabic is made up of interconnected terms based on grammatical relations [2-4]. In most cases the user query is too short, and may be neither sufficient nor effective enough to express what the user needs [2]. Vocabulary mismatch, where the user and the indexer use different terms, is one of the most critical issues in IR [5, 6]. Consequently, IR systems may fail to retrieve the documents that match the user's needs. A well-known and effective strategy for resolving this issue is query expansion (QE).
IFIP Advances in Information and Communication Technology, 2012
This research suggests a method for query expansion in Arabic Information Retrieval using Expectation Maximization (EM). We employ the EM algorithm to select relevant terms for expanding the query and to weed out unrelated terms. We tested our algorithm on the INFILE test collection of CLEF 2009, and the experiments show that query expansion that considers the similarity of terms both improves precision and retrieves more relevant documents. The main finding of this research is that this method can increase recall while keeping precision at the same level.
2013
Millions of users search daily for what they need on the internet and in other information stores by writing queries. Unfortunately, these queries may fail to meet their needs; this failure is known as word mismatch. One way of handling word mismatch is to use a thesaurus (usually semantic) that shows the relationships between terms. The main goal of this study is to design and build an automatic Arabic thesaurus using the Local Context Analysis technique, which can be used in any special field or domain to improve the expansion process and to retrieve more relevant documents for the user's query. The results of this study were compared with a classical information retrieval system. Two hundred and forty-two Arabic documents and 59 Arabic queries were used for building the requirements of the thesaurus, such as inverted Fil...
2018
Information retrieval aims to find all relevant documents responding to a query from textual data. A good information retrieval system should retrieve only those documents that satisfy the user query. Although several models were developed, most Arabic information retrieval models do not satisfy the user needs. This is because the Arabic language is more powerful and has complex morphology as well as high polysemy. This paper first investigates the most recent Arabic information retrieval model and then presents two different approaches to enhance the effectiveness of the adopted model. The main idea of the proposed approaches is to modify and/or expand the user query. The first approach expands the user query by using the semantics of words according to an Arabic dictionary. The second approach modifies and/or expands the user query by adding some useful information from the pseudo relevance feedback. In other words, the query is modified by selecting relevant textual keywords for expanding the que...
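As a rough illustration of expansion from pseudo relevance feedback, the sketch below treats the top-ranked documents of an initial retrieval as if they were relevant and adds their most frequent new terms to the query. The frequency-based term selection, the tie-breaking rule, and the name `prf_expand` are assumptions made here; the truncated abstract does not describe how the paper selects its keywords.

```python
from collections import Counter


def prf_expand(query_terms, ranked_docs, top_k=2, n_terms=2, stopwords=frozenset()):
    """Expand a query with the most frequent new terms from the top_k ranked documents.

    ranked_docs is a list of tokenized documents, best match first, as returned
    by some initial retrieval run (not shown here).
    """
    counts = Counter()
    for doc in ranked_docs[:top_k]:
        counts.update(
            t for t in doc if t not in query_terms and t not in stopwords
        )
    # Most frequent first; ties are broken alphabetically for determinism.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return list(query_terms) + [term for term, _ in ranked[:n_terms]]
```

For example, if the two top-ranked documents both contain `stemming`, that term is added to a query for `arabic` before terms that occur only once.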
IRAQI JOURNAL OF SCIENCE, 2017
Query expansion (QE) is a successful idea for overcoming weaknesses in information retrieval performance. QE requires finding appropriate synonyms for the query words, in a process that can be carried out automatically without any user intervention. The candidate synonyms should be associated with the accurate meaning (sense) of the original word. The Arabic language is rich in words with multiple meanings, and this requires the use of word sense disambiguation (WSD). WSD in general is the task of discovering the correct sense of a word within its context. To disambiguate the word sense, three different traditional semantic measures are tested in this work: lch, wup, and path. The proposed system uses these measures along with an automatic synonym selection method to expand the query. The proposed system outperforms the traditional baseline system, which has no query expansion technique, by 10% to 18%, and reduces the latency by approximately 0.232 to 0.283 seconds per query.
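The three measures named above are commonly defined over an is-a taxonomy such as WordNet. The sketch below computes them on a tiny hand-made tree so the formulas are visible. Everything in it is an assumption for illustration: the taxonomy is invented, depth is counted in nodes with the root at depth 1, and the lch variant uses -log((edges + 1) / (2 * max_depth)); the abstract does not state which definitions or which lexical resource the authors used.

```python
import math

# Toy is-a tree invented for illustration: child -> parent (root has no parent).
PARENT = {
    "entity": None,
    "place": "entity",
    "city": "place",
    "mountain": "place",
    "person": "entity",
}


def ancestors(node):
    """Path from a node up to the root, starting with the node itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(ancestors(node))


def lcs(a, b):
    """Lowest common subsumer: the deepest ancestor shared by a and b."""
    seen = set(ancestors(a))
    return next(n for n in ancestors(b) if n in seen)


def edge_distance(a, b):
    """Number of edges on the shortest path between a and b through their LCS."""
    common = lcs(a, b)
    return (depth(a) - depth(common)) + (depth(b) - depth(common))


def path_sim(a, b):
    """Path similarity: inverse of the shortest path length plus one."""
    return 1.0 / (edge_distance(a, b) + 1)


def wup_sim(a, b):
    """Wu-Palmer similarity: depth of the LCS relative to the two node depths."""
    return 2.0 * depth(lcs(a, b)) / (depth(a) + depth(b))


def lch_sim(a, b):
    """Leacock-Chodorow similarity: path length scaled by the taxonomy depth."""
    max_depth = max(depth(n) for n in PARENT)
    return -math.log((edge_distance(a, b) + 1) / (2.0 * max_depth))


def best_sense(candidate_senses, context_sense, sim=wup_sim):
    """Pick the candidate sense most similar to a sense taken from the context."""
    return max(candidate_senses, key=lambda s: sim(s, context_sense))
```

In this toy tree, `city` is closer to `mountain` than to `person` under all three measures, so `best_sense(["mountain", "person"], "city")` selects `mountain`.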
American Journal of Applied Sciences, 2013
The word mismatch problem is fundamental to information retrieval, and query expansion helps to overcome it. Using Arabic corpora, two kinds of query expansion techniques (global and local) were compared to determine their effect on query effectiveness. The first is local context analysis, which is a local method; the second is a global method, represented by the association and similarity thesauri. These techniques can be used in any special field or domain to improve the expansion process and to retrieve more relevant documents for the user's query. This study compares these approaches and shows their effectiveness. Although local context analysis has some advantages over the similarity thesaurus, the association thesaurus, which is global, is generally the most effective.
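A global association thesaurus is typically built from term co-occurrence across the whole collection. The following sketch uses one common textbook formulation, in which the raw association between two terms is the sum over documents of the product of their frequencies, normalized as c(u,v) / (c(u,u) + c(v,v) - c(u,v)). Whether the study used this exact normalization is not stated in the abstract, and the function names are hypothetical.

```python
from collections import Counter


def association_matrix(docs):
    """c[u][v] = sum over documents of tf(u, doc) * tf(v, doc)."""
    c = {}
    for doc in docs:
        tf = Counter(doc)
        for u, fu in tf.items():
            row = c.setdefault(u, Counter())
            for v, fv in tf.items():
                row[v] += fu * fv
    return c


def normalized_association(c, u, v):
    """Normalized association strength between u and v, in the range [0, 1]."""
    cuv = c.get(u, {}).get(v, 0)
    denom = c[u][u] + c[v][v] - cuv
    return cuv / denom if denom else 0.0


def expansion_terms(c, term, k=2):
    """Return the k terms most strongly associated with the given term."""
    if term not in c:
        return []
    scored = [(normalized_association(c, term, v), v) for v in c[term] if v != term]
    scored.sort(key=lambda x: (-x[0], x[1]))  # strongest first, ties alphabetical
    return [v for _, v in scored[:k]]
```

Because the matrix is computed once over the full collection, the same thesaurus serves every query, which is what distinguishes this global method from local context analysis, where candidate terms come from the documents retrieved for each individual query.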
2006
In both writing and conversation, different people may use different terminology for the same concept. The same holds when issuing queries to search engines and digital libraries. Arabic Information Retrieval (IR) systems are still based on word-matching rather than word-sense approaches. This paper addresses some characteristics of Arabic text and its computer processing, and gives a general overview of the synonyms facility and its current fields of application in IT. The study presents an implementation model for a new IR system that uses additional components, such as Arabic light stemmers and a word synonyms structure, to help overcome some of the limitations of today's Arabic IR systems. The study recommends the use of word stemming and wildcard search modules to solve the word script mismatch problem that arises with the word-matching approach. In addition, it uses the synonyms facility to expand queries in the word-sense approach.
2015
The availability of accurate information is a key factor in acquiring knowledge without wading through extraneous information. Understanding searcher intent and the contextual meaning of terms as they appear in the searchable dataspace is a challenge that has been addressed by many semantic search engines. Because meaning is encoded separately from data in semantic technology, adding, changing, and implementing new relationships can be done easily. The evolution of semantic search has added a new dimension of challenge due to a lack of support for the Arabic language. In this paper, we identify the problem and implement a Semantic Search Engine (CASEng) for the College of Applied Sciences, Oman. CASEng supports both Arabic and English search. It uses Resource Description Framework (RDF) data and Lucene for indexing and searching to move from keyword-based search via Google and other engines to semantics-based search. The experiments show that both the spell-checker and the search engine perfo...
One of the major problems of modern Information Retrieval (IR) systems is the vocabulary problem, which concerns the discrepancies between the terms used for describing documents and the terms used by the researchers to describe their information need. One way of handling the vocabulary problem is to use a thesaurus (usually semantic) that shows the relationships between terms.
Business Intelligence: Concepts, Methodologies, Tools, and Applications, 2000
Arabian Journal for Science and Engineering, 2018
Lecture Notes in Computer Science, 2015
International Florida Artificial Intelligence Research Society Conference (FLAIRS), 2015
Journal of Information Science Theory and Practice, 2020
World Applied Sciences …, 2009