1998, IEEE Transactions on Knowledge and Data …
Accessing many data sources aggravates problems for users of heterogeneous distributed databases. Database administrators must deal with fragile mediators, that is, mediators with schemas and views that must be significantly changed to incorporate a new data source. When implementing translators of queries from mediators to data sources, database implementors must deal with data sources that do not support all the functionality required by mediators. Application programmers must deal with graceless failures for unavailable data sources: queries simply return failure and no further information when data sources are unavailable for query processing. The Distributed Information Search COmponent (DISCO) addresses these problems. Data modeling techniques manage the connections to data sources, and sources can be added transparently to users and applications. The interface between mediators and data sources flexibly handles different query languages and different data source functionality. Query rewriting and optimization techniques rewrite queries so they are efficiently evaluated by sources. Query processing and evaluation semantics are developed to process queries over unavailable data sources. In this article, we describe 1) the distributed mediator architecture of DISCO; 2) the data model and its modeling of data source connections; 3) the interface to underlying data sources and the query rewriting process; and 4) the query processing semantics. We conclude with several advantages of our system.
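To make the graceful-failure idea concrete, here is a minimal Python sketch (the paper itself gives no code; `Wrapper`, `mediate`, and the placeholder subquery string are hypothetical names): instead of failing outright, the mediator returns the answers it could obtain plus the subqueries left unresolved by unavailable sources.

```python
# Sketch of graceful failure: unavailable sources yield unresolved
# subqueries rather than a global query failure. All names are hypothetical.

class SourceUnavailable(Exception):
    pass

class Wrapper:
    def __init__(self, name, rows, available=True):
        self.name, self.rows, self.available = name, rows, available

    def evaluate(self, predicate):
        if not self.available:
            raise SourceUnavailable(self.name)
        return [r for r in self.rows if predicate(r)]

def mediate(wrappers, predicate):
    """Evaluate the query over every source; collect partial answers
    and record the subqueries that could not be evaluated."""
    answers, unresolved = [], []
    for w in wrappers:
        try:
            answers.extend(w.evaluate(predicate))
        except SourceUnavailable as e:
            unresolved.append(f"subquery pending at source {e}")
    return answers, unresolved

if __name__ == "__main__":
    sources = [
        Wrapper("flights_db", [{"dest": "Paris", "price": 420}]),
        Wrapper("hotels_db", [{"dest": "Paris", "price": 150}], available=False),
    ]
    rows, pending = mediate(sources, lambda r: r["dest"] == "Paris")
    print(rows)     # partial answer from the available source
    print(pending)  # work left for the unavailable source
```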
Knowledge Representation Meets Databases, 1997
For advanced data-oriented applications in distributed environments, effective information is frequently obtained by integrating or fusing various autonomous information sources. There are many problems: how to resolve their heterogeneity, how to integrate target sources, how to represent information sources with a common protocol, and how to process queries. In this paper, we propose a new language, QUIK, as an extension of a deductive object-oriented database (DOOD) language, QUIXOTE, and...
Proceedings 11th Australasian Database Conference. ADC 2000 (Cat. No.PR00528), 1999
The need to transform queries between heterogeneous databases arises in many circumstances, including query transformation between a global and a local query. This paper describes a query mediation approach to the interoperability of heterogeneous databases. We develop a query mediation architecture that facilitates automated query transformation and is applicable in many settings, such as query transformation in a federated database environment. Different application languages are transformed into intermediate representations that can be processed by the query mediator. The main characteristic of our system is that the query transformation agent, which acts as a mediator, supports automated query transformation and shields users from the troublesome task of resolving semantic and representational discrepancies. Two query languages, ODMG-OQL and SQL, have been used to investigate the query mediation architecture.
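As an illustration of the intermediate-representation step, here is a hedged Python sketch; the `IRQuery` shape and the two emitters are assumptions, not the paper's actual mediator format:

```python
# One neutral intermediate form that can be rendered back into either
# SQL or ODMG-OQL. The IR layout and emitters are illustrative only.

from dataclasses import dataclass

@dataclass
class IRQuery:                      # language-neutral intermediate form
    projection: list
    source: str
    condition: str

def from_sql(select_cols, from_table, where_clause):
    return IRQuery(projection=select_cols, source=from_table,
                   condition=where_clause)

def to_oql(q: IRQuery):
    cols = ", ".join(f"x.{c}" for c in q.projection)
    return f"select {cols} from x in {q.source} where x.{q.condition}"

def to_sql(q: IRQuery):
    return f"SELECT {', '.join(q.projection)} FROM {q.source} WHERE {q.condition}"

ir = from_sql(["name", "salary"], "Employee", "salary > 50000")
print(to_oql(ir))   # the same IR rendered for an OQL source
print(to_sql(ir))   # ... and for a SQL source
```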
9th Asia-Pacific Conference on Communications (IEEE Cat. No.03EX732), 2003
In this paper, we propose an intelligent mediator, based on XML technology, for heterogeneous data sources. We call it an XML Mediator, or XMed for short. XMed is expected to reduce the overhead that may arise when adding a new data source and to allow uniform processing of data received from different data sources. It also allows dynamic connectivity to accessible data sources, provided that the connection protocol is either already available to it or can be uploaded dynamically. XMed receives requests from the client, analyzes them, and forwards them to their corresponding destinations. Results are wrapped in XML and sent back to the corresponding client. It manages users' accounts, authorization, and security to facilitate connection to the desired data source remotely. It is presumed that XMed will fulfill the role of an open system by exposing its services to any developed application. XMed maintains a knowledge base of user histories and metadata about database connections and queries. The system uses this knowledge base to assist users when accessing desired data or running queries.
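A small sketch of the result-wrapping step described above, assuming a dictionary-per-row result model; the XML element names are illustrative, not XMed's actual format:

```python
# Rows from any source are serialized into one XML envelope before being
# returned to the client. Element names are assumptions for illustration.

import xml.etree.ElementTree as ET

def wrap_results(source_name, rows):
    root = ET.Element("result", {"source": source_name})
    for row in rows:
        rec = ET.SubElement(root, "record")
        for field, value in row.items():
            ET.SubElement(rec, field).text = str(value)
    return ET.tostring(root, encoding="unicode")

print(wrap_results("customers_db", [{"id": 1, "name": "Ada"}]))
# <result source="customers_db"><record><id>1</id><name>Ada</name></record></result>
```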
… -Based Planning and …, 1994
A critical problem in building an information mediator is how to translate a domain-level query into an efficient query plan for accessing the required data. We have built a flexible and efficient information mediator, called SIMS. SIMS takes a domain-level query and dynamically selects the appropriate information sources based on their content and availability, generates a query access plan that specifies the operations and their order for processing the data, and then performs semantic query reformulation to minimize the overall execution time. This paper describes these three basic components of query processing in SIMS. [Figure 1: The SIMS Architecture]
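The source-selection step can be pictured with a toy Python sketch; the catalog structure is a simplifying assumption, and real SIMS also weighs cost and semantics, not just list order:

```python
# Toy SIMS-style source selection: per queried relation, pick a source
# whose advertised content covers the query and which is currently up.

catalog = {
    "airports": [
        {"name": "geo_db", "covers": {"airports", "cities"}, "up": True},
        {"name": "faa_db", "covers": {"airports"},           "up": False},
    ],
}

def select_source(relation):
    candidates = [s for s in catalog.get(relation, [])
                  if relation in s["covers"] and s["up"]]
    if not candidates:
        raise LookupError(f"no available source for {relation}")
    return candidates[0]["name"]   # real SIMS also weighs cost, not just order

print(select_source("airports"))   # -> geo_db (faa_db is down)
```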
Proceedings of …, 2002
Mediator systems aim to provide a unified global data schema over distributed, heterogeneous, structured and semi-structured data sources. These systems must deal with limitations on the query capabilities of the sources. This paper introduces a new framework for representing source query capabilities, along with the algorithms needed to compute the query capabilities of the global schema from the sources. Our approach to computing query capabilities supports a richer capabilities-representation framework than those previously presented in the literature. We show that those approaches are insufficient to properly represent many real sources, and how our approach overcomes those limitations.
Distributed and Parallel Databases, 1998
Users today are struggling to integrate a broad range of information sources providing different levels of query capabilities. Currently, data sources with different and limited capabilities are accessed either by writing rich functional wrappers for the more primitive sources, or by dealing with all sources at a “lowest common denominator”. This paper explores a third approach, in which a mediator ensures that sources receive queries they can handle, while still taking advantage of all of the query power of the source. We propose an architecture that enables this, and identify a key component of that architecture, the Capabilities-Based Rewriter (CBR). The CBR takes as input a description of the capabilities of a data source, and a query targeted for that data source. From these, the CBR determines component queries to be sent to the sources, commensurate with their abilities, and computes a plan for combining their results using joins, unions, selections, and projections. We provide a language to describe the query capability of data sources and a plan generation algorithm. Our description language and plan generation algorithm are schema independent and handle SPJ queries. We also extend CBR with a cost-based optimizer. The net effect is that we prune without losing completeness. Finally we compare the implementation of a CBR for the Garlic project with the algorithms proposed in this paper.
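The heart of the CBR, splitting a query into a part the source can handle and a residual part evaluated at the mediator, can be sketched as follows; the capability description is deliberately reduced to "which attributes the source can select on", far simpler than the paper's description language:

```python
# Minimal capabilities-based rewriting: push the conditions a source can
# evaluate down to it, keep the rest as a mediator-side filter.

def rewrite(conditions, source_selectable_attrs):
    """conditions: list of (attr, op, value); returns (pushed, residual)."""
    pushed   = [c for c in conditions if c[0] in source_selectable_attrs]
    residual = [c for c in conditions if c[0] not in source_selectable_attrs]
    return pushed, residual

conds = [("year", "=", 1998), ("title", "contains", "mediator")]
pushed, residual = rewrite(conds, source_selectable_attrs={"year"})
print("send to source:   ", pushed)    # the source evaluates year = 1998
print("apply at mediator:", residual)  # mediator filters on title itself
```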
Lecture Notes in Computer Science, 2002
Proceedings of the IEEE, 1987
Mermaid is a system that allows the user of multiple databases stored under various relational DBMSs running on different machines to manipulate the data using a common language, either ARIEL or SQL. It makes the complexity of this distributed, heterogeneous data processing transparent to the user. In this paper, we describe the architecture, system control, user interface, language and schema translation, query optimization, and network operation of the Mermaid system. Future research issues are also addressed.
Nigerian Annals of Pure and Applied Sciences, 2018
This study developed and implemented an improved mediator-wrapper approach to addressing the challenges of integrating semantically heterogeneous databases. It employed the Local-as-View (LAV) paradigm of database integration so as to reduce cost and offer the local sources a degree of independence. The developed model was implemented as a web service using JEE and other software development tools, with a number of heterogeneous databases as a case study.
Proceedings of the IEEE, 2000
Future Generation Computer Systems, 2005
Across a wide variety of fields, huge datasets are being collected and accumulated at a dramatic pace. The datasets addressed by individual applications are very often heterogeneous and geographically distributed. An important factor in the strength of a modern enterprise is its capability to effectively store and process information. As a legacy of the computing trends of recent decades, large enterprises often have many isolated data repositories used only within portions of the organization. For organizational and technical reasons, such isolated systems still emerge within different parts of enterprises. While these systems support the individual enterprise units, their inability to interoperate hinders the evolution of a system that provides the user with a unified information model of the data available in the whole organization. The problem is compounded when carried over to the emerging Grid concept, where a uniform information model must be provided over the shared resources of the enterprises participating in a virtual organization. Several technical obstacles arise in the design and implementation of a system for integrating such data sources, most notably distribution, autonomy, and data heterogeneity. This technical report presents a data integration system based on the wrapper-mediator approach, the Grid Data Mediation Service. The alternative of forcing each grid application to interface directly with a set of databases and resolve federation problems internally would lead to application complexity and duplication of effort. We describe our extensions and improvements to the reference implementation of the OGSA-DAI Grid Data Service prototype, an infrastructure that allows remote access to Grid databases, in order to provide a Virtual Data Source: a clean abstraction of heterogeneous, distributed data for users and applications. The open architecture (in terms of integrable data sources and operations) is implemented in a first prototype to show the feasibility of the developed concepts as well as their applicability to real needs and problems. We developed a flexible mapping schema to describe the process of building a virtual data source, with operators to query data sources and combine the results via join and union. Integrable data sources include relational databases, native XML databases, and files in the comma-separated-value format. Usability and integrability were major concerns during the design phase, so we decided to support a subset of the well-known Structured Query Language (SQL) for formulating queries against our global schema. By describing generally applicable access scenarios, we show the need for what is, to the best of our knowledge, the first Grid data mediation service, as well as its compliance with important requirements of virtual data sources.
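The mapping schema's operator model (query each source, then combine via union and join) might look like the following Python sketch, with in-memory lists standing in for relational, XML, and CSV sources:

```python
# A virtual data source built from operators: leaf scans over wrapped
# sources, combined via union and join. Source contents are stand-ins.

def scan(rows):                         # leaf operator: one wrapped source
    return list(rows)

def union(left, right):                 # combine same-shaped sources
    return left + right

def join(left, right, key):             # combine sources on a shared key
    index = {r[key]: r for r in right}
    return [{**l, **index[l[key]]} for l in left if l[key] in index]

rdbms = scan([{"id": 1, "name": "Ada"}])
csv   = scan([{"id": 2, "name": "Grace"}])
xmldb = scan([{"id": 1, "dept": "CS"}, {"id": 2, "dept": "EE"}])

people = union(rdbms, csv)              # one virtual relation over two sources
print(join(people, xmldb, "id"))        # then joined with a third
```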
Montreal, Canada, 1996
2014
In the past, to answer a user query, we generally extracted data from one centralized database or from multiple sources with the same structure. Things have since changed, and in some cases it is necessary to use a set of data sources to provide complete information. These sources are physically separated but are logically seen as a single component by the final user. Besides structural heterogeneity, another important point for which specialists are trying to find a solution is the semantic heterogeneity of data sources. In this paper we provide a list of different approaches that have treated the query processing problem on heterogeneous data sources from different angles.
2010
Distributed heterogeneous data sources need to be queried uniformly using a global schema. A query on the global schema is reformulated so that it can be executed on local data sources. Constraints in the global schema and mappings are used for source selection, query optimization, and querying partitioned and replicated data sources. The system is entirely XML-based: it poses queries in XML form and transforms and integrates local results into an XML document. Contributions include the use of constraints in our existing global schema, which help in source selection and query optimization, and a global query-distribution framework for querying distributed heterogeneous data sources.
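A minimal sketch of constraint-driven source selection for partitioned data, assuming each source advertises a range constraint on a partitioning attribute (the mapping format is illustrative):

```python
# Each source carries a range constraint on a partitioning attribute; the
# mediator skips sources whose constraint contradicts the query.

sources = {
    "orders_2008": {"year_min": 2008, "year_max": 2008},
    "orders_2009": {"year_min": 2009, "year_max": 2009},
}

def relevant_sources(query_year):
    return [name for name, c in sources.items()
            if c["year_min"] <= query_year <= c["year_max"]]

print(relevant_sources(2009))   # ['orders_2009']; orders_2008 is pruned
```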
As the number of databases accessible on the Web grows, the ability to execute queries spanning multiple heterogeneous queryable sources is becoming increasingly important. To date, research in this area has focused on providing semantic completeness, and has generated solutions that work well when querying over a relatively small number of databases that have static and well-defined schemas. Unfortunately, these solutions do not extend to the scale of the present Internet, let alone the Internet of the future. In this paper, we present an approach that makes the opposite tradeoff: it provides a scalable, unified view over large numbers of queryable information sources by sacrificing some expressive power in the set of queries supported. We have developed a prototype system, IDB, which implements this approach. The IDB system provides scalability through three main techniques. First, it uses a collection of ontologies organized into hierarchical namespaces as a medium for expressing data semantics. Second, it employs a declarative query language to describe information sources so that source descriptions can be "executed" at run time instead of being pre-compiled into the system. Third, it utilizes inverted-index style operations to identify the subset of information sources that are relevant to a particular user query. We describe the design, architecture, and implementation of IDB and illustrate its use through case examples.
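The inverted-index style source identification can be sketched directly; the ontology terms and source descriptions below are invented examples:

```python
# Source descriptions indexed by the ontology terms they mention, so the
# sources relevant to a query are found by intersecting posting lists
# instead of scanning every description.

from collections import defaultdict

descriptions = {
    "weather_src": {"geo:city", "weather:forecast"},
    "flight_src":  {"geo:city", "travel:flight"},
    "stock_src":   {"finance:ticker"},
}

index = defaultdict(set)                 # term -> sources mentioning it
for src, terms in descriptions.items():
    for t in terms:
        index[t].add(src)

def sources_for(query_terms):
    postings = [index[t] for t in query_terms]
    return set.intersection(*postings) if postings else set()

print(sources_for({"geo:city", "travel:flight"}))   # {'flight_src'}
```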
1997
Semantic query optimization is the process of transforming a query issued by a user into a different query which, because of the semantics of the application, is guaranteed to yield the correct answer for all states of the database. While this process has been successfully applied in centralised databases, its potential for distributed and heterogeneous systems is enormous, since it can eliminate inter-site joins, which are the single biggest cost factor in query processing.
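A toy example of the idea: a constraint known to hold at one site lets the optimizer drop a predicate that would otherwise require an inter-site join (the constraint encoding is illustrative):

```python
# Semantic query optimization in miniature: a known integrity constraint
# ("dept = 'sales' implies bonus_eligible = True", where bonus_eligible
# lives in a remote table) makes one predicate redundant, so the remote
# join it would require can be eliminated.

constraints = {("dept", "sales"): [("bonus_eligible", True)]}

def optimize(predicates):
    """Drop predicates implied by a constraint triggered by another predicate."""
    implied = set()
    for attr, val in predicates:
        implied.update(constraints.get((attr, val), []))
    return [p for p in predicates if p not in implied]

query = [("dept", "sales"), ("bonus_eligible", True)]
print(optimize(query))  # [('dept', 'sales')] -- remote join no longer needed
```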
2004
A facility that allows users to query heterogeneous and distributed data repositories is needed by the majority of users. Moreover, that facility should hide all the intrinsic problems of the context, such as the location, organization/structure, query language, and semantics of the data in the various repositories. In this paper, a query processing strategy that focuses on information content is presented. For that purpose, domain-specific ontologies that capture the information content of data repositories are used. Those ontologies are described using a system based on Description Logics. We also explain the different steps followed to provide incremental answers to user queries by navigating across ontologies, using pre-defined semantic inter-ontology relationships. The capabilities of Description Logics systems are exploited to optimize queries and guide query processing.
Proceedings of the …, 1997
2009
In the Web environment, rich, diverse sources of heterogeneous and distributed data are ubiquitous. In fact, even the information characterizing a single entity, such as the information related to a Web service, is normally scattered over various data sources using various languages such as XML, RDF, and OWL. Hence, there is a strong need for Web applications to handle queries over heterogeneous, autonomous, and distributed data sources. However, existing techniques do not provide sufficient support for this task. In this paper we present DeXIN, an extensible framework for integrated access over heterogeneous, autonomous, and distributed Web data sources, which can be utilized for data integration in modern Web applications and service-oriented architectures. DeXIN extends the XQuery language by supporting SPARQL queries inside XQuery, thus facilitating queries over data modeled in XML, RDF, and OWL. DeXIN facilitates data integration in a distributed Web and service-oriented environment by avoiding both the transfer of large amounts of data to a central server for centralized integration and the transformation of huge amounts of data into a common format for integrated access.
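A hedged sketch of the dispatch-and-merge pattern this implies, with Python callables standing in for real XQuery and SPARQL engines (all names, URLs, and query fragments are placeholders):

```python
# Each query fragment runs at its own source; only the (small) results
# travel to the mediator, never the raw data sets.

def run_xquery(fragment):               # placeholder for an XML source engine
    return [{"service": "geocode", "wsdl": "http://example.org/geo?wsdl"}]

def run_sparql(fragment):               # placeholder for an RDF source engine
    return [{"service": "geocode", "rating": 4.5}]

def integrated_query():
    xml_part = run_xquery("for $s in doc('services.xml')//service ...")
    rdf_part = run_sparql("SELECT ?service ?rating WHERE { ... }")
    ratings = {r["service"]: r["rating"] for r in rdf_part}
    # merge per service name at the mediator, not at a central data store
    return [{**x, "rating": ratings.get(x["service"])} for x in xml_part]

print(integrated_query())
```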