2008, Proceedings of the twenty-seventh ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems - PODS '08
A schema mapping is a high-level specification that describes the relationship between two database schemas. As schema mappings constitute the essential building blocks of data exchange and data integration, an extensive investigation of the foundations of schema mappings has been carried out in recent years. Even though several different aspects of schema mappings have been explored in considerable depth, the study of schema-mapping optimization remains largely uncharted territory to date.
2009
We present a new method for computing core universal solutions in data exchange settings specified by source-to-target dependencies, by means of SQL queries. Unlike previously known algorithms, which are recursive in nature, our method can be implemented directly on top of any DBMS. Our method is based on the new notion of a laconic schema mapping. A laconic schema mapping is a schema mapping for which the canonical universal solution is the core universal solution. We give a procedure by which every schema mapping specified by FO s-t tgds can be turned into a laconic schema mapping specified by FO s-t tgds that may refer to a linear order on the domain of the source instance. We show that our results are optimal, in the sense that the linear order is necessary and the method cannot be extended to schema mappings involving target constraints.
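To make the canonical-versus-core contrast concrete, the following is a minimal Python sketch (ours, not the paper's SQL-based method; the relations and tgds are illustrative): it chases a toy source instance with two s-t tgds and then shrinks the canonical universal solution to its core by brute-force remapping of labeled nulls.

    from itertools import product

    # Source instance: a single unary relation P.
    source_P = ["a"]

    # tgd1: P(x) -> exists y. Q(x, y)   (each chase step invents a labeled null)
    # tgd2: P(x) -> Q(x, x)
    canonical = set()
    null_count = 0
    for x in source_P:
        null_count += 1
        canonical.add((x, f"_N{null_count}"))  # chase step for tgd1
        canonical.add((x, x))                  # chase step for tgd2

    def is_null(v):
        return v.startswith("_N")

    def core(facts):
        # Brute force: the core is a smallest homomorphic image of `facts`
        # within itself, where constants stay fixed and nulls may be remapped.
        facts = set(facts)
        nulls = sorted({v for f in facts for v in f if is_null(v)})
        values = sorted({v for f in facts for v in f})
        best = facts
        for assignment in product(values, repeat=len(nulls)):
            h = dict(zip(nulls, assignment))
            image = {tuple(h.get(v, v) for v in f) for f in facts}
            if image <= facts and len(image) < len(best):
                best = image
        return best

    print("canonical universal solution:", canonical)       # {('a', '_N1'), ('a', 'a')}
    print("core universal solution:", core(canonical))      # {('a', 'a')}

A laconic rewriting of this mapping would, by definition, produce {('a', 'a')} directly as its canonical universal solution, with no post-hoc core computation.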
The VLDB Journal, 2011
Schema mappings are high-level specifications that describe the relationship between two database schemas. They are an important tool in several areas of database research, notably in data integration and data exchange. However, a concrete theory of schema mapping optimization, including the formulation of optimality criteria and the construction of algorithms for computing optimal schema mappings, is completely lacking to date. The goal of this work is to fill this gap. We start by presenting a system of rewrite rules to minimize sets of source-to-target tuple-generating dependencies (st-tgds, for short). Moreover, we show that the result of this minimization is unique up to variable renaming. Hence, our optimization also yields a schema mapping normalization. By appropriately extending our rewrite rule system, we also provide a normalization of schema mappings containing equality-generating target dependencies (egds). An important application of such a normalization is in the area of defining the semantics of query answering in data exchange, since several definitions in this area depend on the concrete syntactic representation of the st-tgds. This is, in particular, the case for queries with negated atoms and for aggregate queries. The normalization of schema mappings allows us to eliminate the effect of the concrete syntactic representation of the st-tgds from the semantics of query answering. We discuss in detail how our results can be fruitfully applied to aggregate queries.
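As a rough illustration of the kind of rewriting such a system performs (the example is ours, not taken from the paper): a single st-tgd with a duplicated atom and a conjunct that does not mention the existential variable, such as

    \forall x\, \big( P(x) \rightarrow \exists y\, ( Q(x,y) \wedge Q(x,y) \wedge R(x) ) \big)

is logically equivalent to the smaller, split pair

    \forall x\, \big( P(x) \rightarrow \exists y\, Q(x,y) \big)
    \forall x\, \big( P(x) \rightarrow R(x) \big)

Exhaustive application of such rules yields the normal form, and the uniqueness result mentioned above guarantees the outcome is the same up to variable renaming.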
2013
A schema mapping is a formal specification of the relationship holding between the databases conforming to two given schemas, called source and target, respectively. While in the general case a schema mapping is specified in terms of assertions relating two queries in some given language, various simplified forms of mappings, in particular LAV and GAV, have been considered, based on desirable properties that these forms enjoy.
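For reference, the standard shapes of the two simplified forms (a textbook illustration with hypothetical relation names): a GAV assertion maps a query over the sources to a single atom of the global schema, while a LAV assertion maps a single source atom to a query over the global schema:

    GAV:  \forall x\, \forall y\, \big( S_1(x,y) \wedge S_2(y) \rightarrow G(x,y) \big)
    LAV:  \forall x\, \big( S_3(x) \rightarrow \exists y\, ( G(x,y) \wedge G'(y) ) \big)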
The VLDB Journal, 2012
The inversion of schema mappings has been identified as one of the fundamental operators for the development of a general framework for metadata management. During the last few years, three alternative notions of inversion for schema mappings have been proposed: the Fagin-inverse (Fagin, TODS 32(4), 25:1-25:53, 2007), the quasi-inverse (Fagin et al., TODS 33(2), 11:1-11:52, 2008), and the maximum recovery (Arenas et al., TODS 34(4), 22:1-22:48, 2009). However, these notions lack some fundamental properties that limit their practical applicability: most of them are expressed in languages including features that are difficult to use in practice, some of these inverses are not guaranteed to exist for mappings specified with source-to-target tuple-generating dependencies (st-tgds), and it has been futile to search for a meaningful mapping language that is closed under any of these notions of inverse. In this paper, we develop a framework for the inversion of schema mappings that fulfills all of the above requirements. It is based on the notion of C-maximum recovery, for a query language C, a notion designed to generate inverse mappings that recover back only the information that can be retrieved with queries in C. By focusing on the language of conjunctive queries (CQ), we are able to find a mapping language that contains the class of st-tgds, is closed under CQ-maximum recovery, and for which the chase procedure can be used to exchange data efficiently. Furthermore, we show that our choices of inverse notion and mapping language are optimal, in the sense that choosing a more expressive inverse operator or mapping language causes the loss of these properties. (A preliminary version of this article appeared in PVLDB [3].)
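A small example of why inversion is delicate (ours, simplified in the spirit of this literature): the mapping specified by the st-tgds P(x) → R(x) and Q(x) → R(x) forgets which source relation each R-tuple came from, so no exact inverse exists; recovering the source would require a disjunctive rule such as

    \forall x\, \big( R(x) \rightarrow P(x) \vee Q(x) \big)

which already leaves the language of st-tgds. Restricting what must be recovered to the information retrievable by conjunctive queries, as the CQ-maximum recovery does, is what makes a well-behaved closed mapping language possible.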
1992
In this paper we introduce some terminology for comparing the expressiveness of conceptual data modelling techniques, such as ER, NIAM, and PM, that are finitely bounded by their underlying domains. Next we consider schema equivalence and discuss the effects of the sizes of the underlying domains. This leads to the introduction of the concept of finite equivalence. We give some examples of finite equivalence and inequivalence in the context of PM.
2013
Embracing Incompleteness in Schema Mappings. Ph.D. thesis, Department of Computer Science, University of Toronto, 2013. Various forms of information integration have become ubiquitous in current Business Intelligence (BI) technologies. In many cases, the semantic relationship between heterogeneous data sources is specified using high-level declarative rules, called schema mappings. For decades, Skolem functions have been regarded as an important tool in schema mappings, as they permit a precise representation of incomplete information. The powerful mapping language of second-order tuple generating dependencies (SO tgds) permits arbitrary Skolem functions and has been proven to be the right class for modeling many integration problems, such as composition and correlation of mappings. This language is strictly more powerful than the languages used in many integration systems, including source-to-target and nested tgds, which are both first-order…
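For illustration, an SO tgd in the style of the classic composition example (relation names hypothetical): the existentially quantified Skolem function f assigns each student name one consistent id across all of that student's tuples, a correlation that no first-order s-t tgd can express:

    \exists f\, \forall n\, \forall c\, \big( Takes(n,c) \rightarrow Student(n, f(n)) \wedge Enrolled(f(n), c) \big)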
ACM Transactions on Database Systems, 2011
Schema mappings are high-level specifications that describe the relationship between two database schemas; they are considered to be the essential building blocks in data exchange and data integration, and have been the object of extensive research investigations. Since in real-life applications schema mappings can be quite complex, it is important to develop methods and tools for understanding, explaining, and refining schema mappings. A promising approach to this effect is to use “good” data examples that illustrate the schema mapping at hand. We develop a foundation for the systematic investigation of data examples and obtain a number of results on both the capabilities and the limitations of data examples in explaining and understanding schema mappings. We focus on schema mappings specified by source-to-target tuple generating dependencies (s-t tgds) and investigate the following problem: which classes of s-t tgds can be “uniquely characterized” by a finite set of data examples?...
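As a small illustration (ours, not the paper's): for the mapping specified by the single s-t tgd P(x) → ∃y Q(x,y), the pair (I, J) with I = {P(a)} and J = {Q(a, ⊥)}, where ⊥ is a labeled null, is a universal example, since J is a universal solution for I under the mapping. Unique characterizability then asks whether finitely many such pairs pin down the mapping within a given class of s-t tgds.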
2011
A schema mapping is a formal specification of the relationship holding between the databases conforming to two given schemas, called source and target, respectively. While in the general case a schema mapping is specified in terms of assertions relating two queries in some given language, various simplified forms of mappings, in particular LAV and GAV, have been considered, based on desirable properties that these forms enjoy.
Theory of Computing Systems
Schema mappings have been extensively studied in the context of data exchange and data integration, where they have turned out to be the right level of abstraction for formalizing data inter-operability tasks. Up to now and for the most part, schema mappings have been studied as static objects, in the sense that each time the focus has been on a single schema mapping of interest or, in the case of composition, on a pair of schema mappings of interest. In this paper, we adopt a dynamic viewpoint and embark on a study of sequences of schema mappings and of the limiting behavior of such sequences. To this effect, we first introduce a natural notion of distance on sets of finite target instances that expresses how "close" two sets of target instances are as regards the certain answers of conjunctive queries on these sets. Using this notion of distance, we investigate pointwise limits and uniform limits of sequences of schema mappings. (This article is part of the Topical Collection on Database Theory.)
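One plausible way to make such a distance precise (a reconstruction for intuition only; the paper's exact definition may differ): for sets X and Y of target instances, let

    d(X, Y) = 2^{-n}, where n is the least size of a conjunctive query q on whose certain answers X and Y disagree,

and d(X, Y) = 0 if no such query exists. Two sets are then close precisely when they agree on the certain answers of all small conjunctive queries.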
Proceedings of the 31st symposium on Principles of Database Systems - PODS '12, 2012
Over the past several decades, the study of conjunctive queries has occupied a central place in the theory and practice of database systems. In recent years, conjunctive queries have played a prominent role in the design and use of schema mappings for data integration and data exchange tasks. In this paper, we investigate several different aspects of conjunctive-query equivalence in the context of schema mappings and data exchange.
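Recall the Chandra-Merlin characterization underlying such questions: two conjunctive queries are equivalent iff homomorphisms exist between them in both directions. A toy example (ours):

    q_1(x) \leftarrow E(x,y), E(y,z)
    q_2(x) \leftarrow E(x,y), E(y,z), E(x,w)

These are equivalent: q_1 maps into q_2 by the identity, and q_2 maps into q_1 by sending w to y. The paper studies how such equivalences behave when relativized to schema mappings and data exchange.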
Proceedings of the twenty- …, 2010
Conquering Complexity, 2012
This chapter introduces procedures to test strict satisfiability and decide logical implication for a family of database schemas called extralite schemas with role hierarchies. Using the OWL jargon, extralite schemas support named classes, datatype and object properties, minCardinalities and maxCardinalities, InverseFunctionalProperties, class subset constraints, and class disjointness constraints. Extralite schemas with role hierarchies also support subset and disjointness constraints defined for datatype and object properties. Strict satisfiability imposes the additional restriction that the constraints of a schema must not force classes, datatypes, or object properties to be always empty, and is therefore more adequate than the traditional notion of satisfiability in the context of database design. The decision procedures outlined in the chapter are based on the satisfiability algorithm for Boolean formulas in conjunctive normal form with at most two literals per clause, and exploit the structure of a set of constraints, captured as a constraint graph. The chapter includes a proof that the procedures work in polynomial time and are sound and complete for a subfamily of the schemas that impose a restriction on the role hierarchy.
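The 2-SAT building block the chapter relies on is the classical implication-graph method: each clause (a ∨ b) contributes the implications ¬a → b and ¬b → a, and the formula is satisfiable iff no variable lies in the same strongly connected component as its negation. A minimal self-contained sketch in Python (the constraint-graph encoding of schemas itself is not shown):

    def two_sat(n_vars, clauses):
        """clauses: list of pairs of literals, +v or -v for variables 1..n_vars.
        Returns a satisfying assignment as a dict, or None if unsatisfiable."""
        def node(l):  # literal -> vertex index in the implication graph
            return 2 * (abs(l) - 1) + (1 if l < 0 else 0)
        N = 2 * n_vars
        g = [[] for _ in range(N)]   # implication graph
        gr = [[] for _ in range(N)]  # reversed graph
        for a, b in clauses:         # (a or b)  ==  (-a -> b) and (-b -> a)
            g[node(-a)].append(node(b)); gr[node(b)].append(node(-a))
            g[node(-b)].append(node(a)); gr[node(a)].append(node(-b))
        # Kosaraju's SCC algorithm, iterative to avoid recursion limits.
        order, seen = [], [False] * N
        for s in range(N):
            if seen[s]:
                continue
            seen[s] = True
            stack = [(s, 0)]
            while stack:
                v, i = stack.pop()
                if i < len(g[v]):
                    stack.append((v, i + 1))
                    w = g[v][i]
                    if not seen[w]:
                        seen[w] = True
                        stack.append((w, 0))
                else:
                    order.append(v)
        comp = [-1] * N
        c = 0
        for s in reversed(order):
            if comp[s] != -1:
                continue
            comp[s] = c
            stack = [s]
            while stack:
                v = stack.pop()
                for w in gr[v]:
                    if comp[w] == -1:
                        comp[w] = c
                        stack.append(w)
            c += 1
        assign = {}
        for v in range(1, n_vars + 1):
            if comp[node(v)] == comp[node(-v)]:
                return None  # v and not-v in one SCC: unsatisfiable
            # Components are numbered in topological order of the condensation,
            # so the literal whose SCC comes later can safely be set to true.
            assign[v] = comp[node(v)] > comp[node(-v)]
        return assign

    # (x1 or x2) and (not x1 or x2): satisfiable, e.g. with x2 = True
    print(two_sat(2, [(1, 2), (-1, 2)]))

The SCC computation runs in time linear in the size of the formula, which is where the polynomial bounds for the chapter's decision procedures come from.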
2003
Typically, data integration systems have significant gaps in coverage over the global (or mediated) schema they purport to cover. Given this reality, users are interested in knowing exactly which part of their query is supported by the available data sources. This report introduces a set of assumptions which enable users to obtain intensional descriptions of the certain, uncertain, and missing answers to their queries given the available data sources. The general assumption is that query and source descriptions are written as tuple relational queries which return only whole schema tuples as answers. More specifically, queries and source descriptions must be within an identified sub-class of these ‘schema tuple queries’ which is closed over syntactic query difference. Because this identified query class is decidable for satisfiability, query containment and equivalence are also decidable. Sidestepping the schema tuple query assumption, the identified query class is more expressive than...
Conceptual Modeling-ER 2007, 2007
Schema mappings come in different flavors: simple correspondences are produced by schema matchers, while intensional mappings are used for schema integration. However, the execution of mappings requires a formalization based on the extensional semantics of models. This problem is aggravated if multiple metamodels are involved. In this paper we present extensional mappings, based on second-order tuple generating dependencies, between models in our Generic Role-based Metamodel GeRoMe. By using a generic metamodel, our mappings support data translation between heterogeneous metamodels. Our mapping representation provides grouping functionalities that allow for complete restructuring of data, which is necessary for handling nested data structures such as XML and object-oriented models. Furthermore, we present an algorithm for mapping composition and optimization of the composition result. To verify the genericness, correctness, and composability of our approach, we implemented a data translation tool and mapping export for several data manipulation languages.
Data & Knowledge Engineering, 2009
In this article we present extensional mappings, which are based on second-order tuple generating dependencies between models in our Generic Role-based Metamodel GeRoMe. Our mappings support data translation between heterogeneous models, such as XML schemas, relational schemas, or OWL ontologies. The mapping language provides grouping functionalities that allow for complete restructuring of data, which is necessary for handling object-oriented models and nested data structures such as XML. Furthermore, we present algorithms for mapping composition and optimization of the composition result. To verify the genericness, correctness, and composability of our approach, we implemented a data translation tool and mapping export for several data manipulation languages. Finally, we address the question of how generic schema mappings can be harnessed for answering queries against an integrated global schema.
1995
We present a formal framework for the combination of schemas. A main problem addressed is that of determining when two schemas can be meaningfully integrated. Another problem is how to merge two schemas into an integrated schema that has the same information capacity as the original ones, i.e., such that the resulting schema can represent as much information as the original schemas. We show that both of these problems can be solved by placing a restriction on the schemas to be integrated.
The Semantic Web: Research and …, 2005
One of the key challenges in the development of open semantic-based systems is enabling the exchange of meaningful information across applications which may use autonomously developed schemata. One typical solution to this problem is the definition of a mapping between pairs of schemas, namely a set of point-to-point relations between the elements of different schemas. Many (semi-)automatic methods for generating such mappings have been proposed. In this paper we provide a preliminary investigation of the notion of correctness for schema matching methods. In particular, we define different notions of soundness, depending strictly on which dimension (syntactic, semantic, or pragmatic) of the language the mappings are defined on. Finally, we discuss some preliminary conditions under which two different notions of soundness (semantic and pragmatic) can be related.