2008, Proceedings of the American Society for Information Science and Technology
Web search engines are now major tools that people use to find information on the Web. Few studies have examined the extent and functionality of Web searching tools that support Spanish-speaking users. Our study identified 72 publicly available Web search engines that support Spanish. The research questions are the following: (a) What Web searching tools are available to Spanish-speaking users? (b) What are the features of these tools? (c) What are the current Web search engine restrictions? (d) What are the functionalities of multimedia Web search engines that allow searching in Spanish? The Web functionality methodology developed by Tjondronegoro and Spink was adapted to examine the functionality of Spanish Web search engines. The findings include: (1) a limited set of Web search features for Spanish-speaking users, (2) a lack of Spanish language interfaces, (3) a lack of functionality in 18% of the Web searching tools examined, and (4) due to the high interest in Spanish language media, a need for multimedia Web search tools to support Spanish language users. Further research is needed to examine the needs of Spanish language users and the development of Spanish language Web search tools.
gandalf.aksis.uib.no
This work describes a proposal to improve web document retrieval by addressing two main problems in document searching. First, traditional web search engines miss documents that are relevant to the user query and retrieve many that are not. Second, query formulation is not as accessible as it could be, and some users have difficulty expressing Boolean queries. To improve the quality of Internet search engines, two main approaches have typically been adopted: one is the creation of a metasearch engine that makes use of multiple search engines by unifying both the query language and the type of results returned by the different search engines; the other involves applying NLP techniques for query extension in order to handle morphological, lexical, semantic and syntactic variations. Focusing on the second approach, we present the research project MESIA (project CAM 07T/0017/1998) for the Madrid Local Government web site (www.comadrid.es). Its main goal is to exploit general-purpose linguistic resources to extend user queries in order to enhance the answers provided by the AltaVista search engine.
ra.ethz.ch
In this paper we present a new online search engine developed in Chile: Inquiro.CL. This new search engine lets users search the Latin-American web (specifically, ten Spanish-speaking countries) by specifying their search domain (.cl, .com.ar, .com.mx, ...
Purpose: To test the ability of the major search engines Google, Yahoo, MSN, and Ask to distinguish between German-language and English-language documents. Design/methodology/approach: 50 queries, using words common in German and in English, were posed to the engines. The advanced search option of language restriction was used, once in German and once in English. The first 20 results per engine in each language were investigated. Findings: While none of the search engines faces problems in providing results in the language of the interface that is used, both Google and MSN face problems when the results are restricted to a foreign language. Research limitations/implications: Search engines were only tested in German and in English. We have only anecdotal evidence that the problems are the same with other languages. Practical implications: Searchers should not use the language restriction in Google and MSN when searching for foreign-language documents. Instead, searchers should use Yahoo or Ask. If searching for foreign-language documents in Google or MSN, the interface in the target language/country should be used. Value of paper: Demonstrates a problem with search engines that has not been previously investigated. Keywords: World Wide Web, search engines, advanced search options, language restriction
2011
This study examines the information-seeking behavior that bilingual users (specifically, native Chinese speakers whose second language is English) exhibit when performing an online search.
2003
This paper presents the case for a search engine for the Portuguese community. Preserving publications of historical interest for future access, learning about the preferences and interests of our society in the information age, and gathering intelligence for security and protection are examples of national interests addressed by such a system. This paper also introduces the architecture and design of tumba!, a new Internet search engine for the Portuguese Web. Tumba!
Language Resources and Evaluation, 2008
Morphological query expansion and language-filtering words have proved to be valid methods when searching the web for content in Basque via the APIs of commercial search engines, as the implementation of these methods in recent IR and web-as-corpus tools shows. However, no real analysis has been carried out to ascertain the degree of improvement, apart from a comparison of recall and precision using a classical web search engine and measured in terms of hit counts. This paper presents a more theoretical study that confirms the validity of combining both methods. We have measured the increase in recall obtained by morphological query expansion, and the increase in precision and loss in recall produced by language-filtering words, not only by searching the web directly and looking at the hit counts (which are not considered to be very reliable at best), but also by using both a Basque web corpus and a classical lemmatised corpus, thus providing more exact quantitative results. Furthermore, we provide various corpus-extracted data to be used in the aforementioned methods, such as lists of the most frequent inflections and declensions (cases, persons, numbers, tenses, etc.) for each POS (the most interesting word forms for a morphologically expanded query), and a list of the most used Basque words with their frequencies and document frequencies (the ones that should be used as language-filtering words).
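To make the two methods in this abstract concrete, the following is a minimal sketch of how a query string combining them might be assembled. It is not the authors' implementation: the function name `build_query`, the `OR`/`AND` query syntax, and the sample Basque word forms and filter words are all assumptions chosen for illustration, and real search engine APIs differ in the operators they accept.

```python
def build_query(word_forms, filter_words):
    """Build a Boolean query string (hypothetical syntax).

    word_forms:   inflected forms of the search term; joined with OR so
                  that a document matching any form is retrieved
                  (morphological query expansion, aimed at recall).
    filter_words: very frequent words of the target language; each one is
                  required with AND so that documents in other languages
                  are filtered out (language-filtering words, aimed at
                  precision).
    """
    expansion = "(" + " OR ".join(f'"{w}"' for w in word_forms) + ")"
    parts = [expansion] + [f'"{w}"' for w in filter_words]
    return " AND ".join(parts)


# Illustrative input only: a few forms of "etxe" (house) and two common
# Basque function words; these lists are not taken from the paper.
query = build_query(["etxe", "etxea", "etxeak"], ["eta", "da"])
print(query)
```

The trade-off measured in the abstract shows up directly in this shape: each extra form in the OR group can only add matching documents, while each required filter word can only remove them.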
Proceedings from the Corpus Linguistics Conference Series, Vol. 1, no.1, University of Birmingham., 2005
The web has unique potential among corpora to yield large-volume data on up-to-date language use, obvious shortcomings notwithstanding. Since 1998, we have been developing a tool, WebCorp, to allow corpus linguists to retrieve raw and analysed linguistic output from the web. Based on internal trials and user feedback gleaned from our site (http://www.webcorp.org.uk/), we have established a working system which supports thousands of regular users world-wide. Many of the problems associated with the nature of web text have been accommodated, but problems remain, some due to the non-implementation of standards on the Internet, and others to reliance on commercial search engines, whose mediation slows average WebCorp response time and places constraints on linguistic search. To improve WebCorp performance, we are in the process of creating a tailored search engine, an infrastructure in which WebCorp will play an integral and enhanced role.
2008
Web research in Mexico has been addressing issues related mainly to search mechanisms, information extraction, and mediating user interaction and group collaboration. In this paper we provide an overview of representative projects in the area and present a sample of recent advances by research groups in Mexican institutions. These include initiatives aimed at exploring extraction techniques that regard the web as a corpus, indexing and categorizing multimedia web content, and designing user interfaces for visualizing web-accessible collections as well as environments for synchronous and asynchronous web collaboration.
Acta Cybernetica, 2007
In this paper we present the results of our project, which aims to build a categorization-based, topic-oriented Internet search engine. In particular, we focus on economics-related electronic materials available on the Internet in Hungarian. We present our search service, which harvests, stores and makes searchable the publicly available content of the subject domain. The paper describes the search facilities and the structure of the implemented system, with special emphasis on intelligent search algorithms and document processing methods.
Proc. of ACM SIGIR 2007 Workshop on Improving Non-English Web Searching (iNEWS07), Amsterdam, The Netherlands, 2007. ISBN 978-84-690-6978-3 (78 pp.), 2007
Conference (SIGIR'07) aiming at bringing together researchers interested in non-English web searching.
Information Retrieval, 12(3):230-250, 2009. ISSN 1386-4564. DOI 10.1007/s10791-009-9093-0, 2009
With the growing number of non-English-language web searchers, the efficient handling of non-English Web documents and user queries is becoming a major issue for search engines. The main aim of this review paper is to make researchers aware of the existing problems in monolingual non-English Web retrieval by providing an overview of open issues. A significant number of papers are reviewed, and the research issues investigated in these studies are categorized in order to identify the research questions and solutions proposed in these papers. Further research is proposed at the end of each section.
ACM SIGIR Forum, 41(2):72-76, 2007. ISSN 0163-5840., 2007
Conference (SIGIR'07) aiming at bringing together researchers interested in non-English web searching.
2008
07). The workshop aims at bringing together researchers interested in the issues surrounding non-English web searching. Nowadays, over 60% of Internet users are non-English speakers, and the number of non-English-speaking Internet users is growing faster than the number of English-speaking users. Recent studies showed that non-English queries and unclassifiable queries have nearly tripled since 1997. Most search engines were originally engineered for English. These do not take into full account the specifics of non-English languages, such as inflectional semantics, diacritics, or capitalization. The main conclusion from the literature is that searching using non-English and non-Latin-based queries results in lower retrieval success and requires additional user effort to achieve acceptable satisfaction levels. Furthermore, international search engines, like MSN Live, Google and Yahoo, are relatively weaker with monolingual non-English queries. New tools and resources are needed to support researchers in non-English retrieval, new methodologies need to be proposed to help identify problems in existing search engines, and new teaching strategies should be formed to help users become more efficient in formulating their queries. Foremost, research in non-English web search should provide an incentive to search engines to improve their retrieval performance for non-English languages.
Proceedings of the 24th International Conference on World Wide Web - WWW '15 Companion, 2015
Online searching is a central element of internet users' information behaviors. Searching is usually executed in a user's native language, but searching in English as a foreign language is often necessitated by the lack of content in languages that are underrepresented on the Web. This paper reports results from a study of searching in English as a foreign language and aims at understanding this particular group of users' behaviors. Searchers whose native language is not English may have to resort to queries in English in support of their information needs due to the lack or low quality of the web content in their own language. However, when searching for information in a foreign language, users face a unique set of challenges that are not present for native language searching. We studied this problem through qualitative research methods and report results from focus groups in this paper. The results describe typical problems foreign language searchers face, the differences in information-seeking behavior in English and in the participants' native language, and advice and ideas shared by the focus group participants about how to search effectively and efficiently in English.
2006
In the complex decision environments that characterize e-business settings, it is important to permit decision-makers to proactively manage data quality. In this paper we propose a decision-support framework that permits decision-makers to gauge quality both in an objective (context-independent) and in a context-dependent manner. The framework is based on the information product approach and uses the Information Product Map (IPMAP). We illustrate its application in evaluating data quality using completeness, a data quality dimension that is acknowledged as important. A decision-support tool (IPView) for managing data quality that incorporates the proposed framework is also described.
Proceedings of the ASIST Annual Meeting, 1996
Three Web search engines, namely, Alta Vista, Excite, and Lycos, were compared and evaluated in terms of their search capabilities (e.g., Boolean logic, truncation, field search, word and phrase search) and retrieval performances (i.e., precision and response time) using sample queries drawn from real reference questions. Recall, the other evaluation criterion of information retrieval, is deliberately omitted from this study because it is impossible to assume how many relevant items there are for a particular query in the huge and ever changing Web system. The authors of this study found that Alta Vista outperformed Excite and Lycos in both search facilities and retrieval performance although Lycos had the largest coverage of Web resources among the three Web search engines examined. As a result of this research, we also proposed a methodology for evaluating other Web search engines not included in the current study.
Notes: 6. Out of the 10 downloaded Web records, there are 2 duplicates. 7. All the figures were obtained in January 1996.
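The reason precision can be measured when recall cannot is visible in the arithmetic: precision needs judgments only for the results that were actually retrieved, while recall needs the total number of relevant items on the Web, which is unknown. A minimal sketch follows; the function name `precision` and the sample judgments are hypothetical and are not data from the study, which may also have used a graded rather than binary relevance scale.

```python
def precision(judgments):
    """Fraction of retrieved results judged relevant.

    judgments: one boolean per retrieved result, True if an assessor
               judged that result relevant to the query.
    Returns 0.0 for an empty result list to avoid dividing by zero.
    """
    if not judgments:
        return 0.0
    return sum(1 for relevant in judgments if relevant) / len(judgments)


# Hypothetical judgments for the first four results of one query.
sample = [True, False, True, True]
print(precision(sample))  # 3 relevant out of 4 retrieved
```

Recall would divide the same numerator by the count of all relevant documents in the collection, a denominator nobody can supply for the open Web.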
2010
Search engines are competing to be seen as universal, consistent and language independent. In principle, users searching for information through the Internet should get consistent information regardless of the language of the words they are searching for and regardless of the language of the matching or relevant documents. Nevertheless, the language should affect the sequence or the ranking of the retrieved results. In this project, several tools are built to evaluate words and statements in several languages. Results are evaluated and compared for possible correlation. Another tool is built to crawl websites in different languages and locations in order to measure several aspects of those websites. Results from both studies showed that, while popular search engines seem to be making very good progress toward being language and location independent, there are some limitations and situations in which search results can be biased toward the popularity of the website's language and/or location.
Text Retrieval Systems (TRS) are a well-known type of program in the field of information and documentation, especially as they are designed for the retrieval of text and cognitive documents. Their main characteristics can be summarized as follows: they have a flexible record model (variable-length fields, multiple-value fields, etc.), they facilitate access to records through inverted indexing, they contain a varied set of data-retrieval features, and they are provided with diverse instruments for terminology control. Some of the best known and most prevalent systems are CDS/ISIS, FileMaker, Knosys, and Inmagic DB/Text. Global theories of approach have been developed about them, among which we can highlight that of Sieverts and other Belgian researchers (1991-93), authors of a series of very complete and exhaustive articles that describe the characteristics of these types of programs, elaborating a typology and presenting a very detailed evaluation of some thirty products. Subsequently, William Saffad...