2017
Certain answers are a widely accepted semantics of query answering over incomplete databases. Since their computation is a coNP-hard problem, recent research has focused on developing evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. In this paper, we show how novel evaluation algorithms with correctness guarantees can be developed leveraging conditional tables and the conditional evaluation of queries, while retaining polynomial time data complexity.
Proceedings of the 22nd International Database Engineering & Applications Symposium, 2018
Incomplete information arises in many database applications, such as data integration, data exchange, inconsistency management, data cleaning, ontological reasoning, and many others. A principled way of answering queries over incomplete databases is to compute certain answers, which are query answers that can be obtained from every complete database represented by an incomplete one. For databases containing (labeled) nulls, certain answers to positive queries can be easily computed in polynomial time, but for more general queries with negation the problem becomes coNP-hard. To make query answering feasible in practice, one might resort to SQL's evaluation, but unfortunately, the way SQL behaves in the presence of nulls may result in wrong answers. Thus, on the one hand, SQL's evaluation is efficient but flawed; on the other hand, certain answers are a principled semantics but with high complexity. To deal with this issue, recent research has focused on developing polynomial time ...
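The abstract above notes that SQL's behavior on nulls can miss certain answers. A minimal toy illustration (the `NULL` sentinel, `three_valued_eq`, and `sql_where` are hypothetical helpers, not from any of the papers) shows the classic case: a tautological predicate evaluates to UNKNOWN on a null, so SQL-style filtering drops a tuple that holds in every completion.

```python
# Toy illustration of why SQL-style evaluation can miss certain answers:
# under three-valued logic, "a = 1 OR a <> 1" is UNKNOWN on a null, so
# the row is dropped even though the disjunction is true in every
# completion of the database.

NULL = object()  # sentinel standing in for SQL NULL

def three_valued_eq(a, b):
    if a is NULL or b is NULL:
        return None          # UNKNOWN
    return a == b

def sql_where(rows, pred):
    # SQL keeps a row only when the predicate is TRUE (not UNKNOWN)
    return [r for r in rows if pred(r) is True]

def tautology(r):
    # three-valued evaluation of "r[0] = 1 OR r[0] <> 1"
    eq = three_valued_eq(r[0], 1)
    neq = None if eq is None else not eq
    return True if eq is True or neq is True else eq

rows = [(1,), (NULL,)]
sql_where(rows, tautology)   # the null row is lost, though it is a certain answer
```

Under certain-answer semantics the null row should be returned, since the predicate holds no matter what value the null takes.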
Information Systems, 2019
Certain answers are a widely accepted semantics of query answering over incomplete databases. As their computation is a coNP-hard problem, recent research has focused on developing (polynomial time) evaluation algorithms with correctness guarantees, that is, techniques computing a sound but possibly incomplete set of certain answers. The aim is to make the computation of certain answers feasible in practice, settling for under-approximations. In this paper, we present novel evaluation algorithms with correctness guarantees, which provide better approximations than current techniques, while retaining polynomial time data complexity. The central tools of our approach are conditional tables and the conditional evaluation of queries. We propose different strategies to evaluate conditions, leading to different approximation algorithms: more accurate evaluation strategies have higher running times, but they pay off with more certain answers being returned. Thus, our approach offers a suite of approximation algorithms enabling users to choose the technique that best meets their needs in terms of balance between efficiency and quality of the results.
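The central tools named above, conditional tables and conditional evaluation, can be sketched in a few lines. This is a simplified illustration under assumed encodings (the `Null` class, `select_eq`, and `certain_rows` are hypothetical names, and real c-table conditions are richer than the equality lists used here): each tuple carries a condition over labeled nulls, and selection conjoins its predicate into the condition instead of discarding tuples outright.

```python
# Sketch of a conditional table (c-table): each tuple carries a condition
# over labeled nulls; a selection records constraints on nulls rather
# than dropping tuples whose truth is unknown.

class Null:
    """A labeled null, e.g. Null('x')."""
    def __init__(self, name):
        self.name = name
    def __repr__(self):
        return f"Null({self.name!r})"

def select_eq(ctable, col, const):
    """Conditional selection col = const.

    ctable: list of (row, condition) pairs; a condition is a list of
    (null name, constant) equalities that must all hold.
    """
    out = []
    for row, cond in ctable:
        v = row[col]
        if isinstance(v, Null):
            # unknown value: keep the tuple under an extra constraint
            out.append((row, cond + [(v.name, const)]))
        elif v == const:
            out.append((row, cond))      # predicate certainly true
        # otherwise the predicate is certainly false: drop the tuple
    return out

def certain_rows(ctable):
    """Rows whose condition is trivially valid hold in every completion."""
    return [row for row, cond in ctable if not cond]

t = [(("a", 1), []), (("b", Null("x")), [])]
r = select_eq(t, 1, 1)
# ("a", 1) survives unconditionally; ("b", x) survives under x = 1.
```

More accurate evaluation strategies, in the spirit of the abstract, would spend more time deciding whether a non-trivial condition is in fact valid, returning additional certain answers at higher cost.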
2019
Many database applications face the problem of querying incomplete data. In such scenarios, certain answers are a principled semantics of query answering. Unfortunately, the computation of certain query answers is a coNP-hard problem. To make query answering feasible in practice, recent research has focused on developing polynomial time algorithms computing a sound (but possibly incomplete) set of certain answers. In this paper we present a system prototype implementing a suite of algorithms to compute sound sets of certain answers. The central tools used by our system are conditional tables and the conditional evaluation of relational algebra. Different evaluation strategies can be applied, with more accurate ones having higher complexity, but returning more certain answers, thereby enabling users to choose the technique that best meets their needs in terms of balance between efficiency and quality of the results.
Proceedings of the VLDB Endowment, 2013
We present a system that computes, for a query whose answers may be incomplete, complete approximations from above and from below. We assume a setting where queries are posed over a partially complete database, that is, a database that is generally incomplete but is known to contain complete information about specific aspects of its application domain. Which parts are complete is described by a set of so-called table-completeness statements. Previous work led to a theoretical framework and an implementation that allowed one to determine whether in such a scenario a given conjunctive query is guaranteed to return a complete set of answers or not. With the present demonstrator we show how to reformulate the original query in such a way that answers are guaranteed to be complete. If there exists a more general complete query, there is a unique most specific one, which we find. If there exists a more specific complete query, there may even be infinitely many. In this case, we find the least specific specializations whose size is bounded by a threshold provided by the user. Generalizations are computed by a fixpoint iteration, employing an answer set programming engine. Specializations are found leveraging unification from logic programming.
ACM Transactions on Database Systems, 2014
The term naïve evaluation refers to evaluating queries over incomplete databases as if nulls were usual data values, i.e., to using the standard database query evaluation engine. Since the semantics of query answering over incomplete databases is that of certain answers, we would like to know when naïve evaluation computes them: i.e., when certain answers can be found without inventing new specialized algorithms. For relational databases it is well known that unions of conjunctive queries possess this desirable property, and results on preservation of formulae under homomorphisms tell us that within relational calculus, this class cannot be extended under the open-world assumption. Our goal here is twofold. First, we develop a general framework that allows us to determine, for a given semantics of incompleteness, classes of queries for which naïve evaluation computes certain answers. Second, we apply this approach to a variety of semantics, showing that for many classes of queries beyond unions of conjunctive queries, naïve evaluation makes perfect sense under assumptions different from open-world. Our key observations are: (1) naïve evaluation is equivalent to monotonicity of queries with respect to a semantics-induced ordering, and (2) for most reasonable semantics of incompleteness, such monotonicity is captured by preservation under various types of homomorphisms. Using these results we find classes of queries for which naïve evaluation works, e.g., positive first-order formulae for the closed-world semantics. Moreover, we introduce a general relation-based framework for defining semantics of incompleteness, show how it can be used to capture many known semantics and to introduce new ones, and describe classes of first-order queries for which naïve evaluation works under such semantics.
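Naïve evaluation, as described above, simply feeds labeled nulls through the ordinary query engine. A minimal sketch under an assumed toy encoding (strings prefixed with "n" stand in for labeled nulls; `join` is a hypothetical helper, not any system's API) shows why this works for positive queries: the standard join machinery applies unchanged, and null-free tuples in the naive result are certain.

```python
# Minimal illustration of naive evaluation: labeled nulls are treated as
# ordinary constants, so the standard join applies unchanged. For unions
# of conjunctive queries, null-free tuples in the naive answer are
# certain answers.

def join(r, s):
    """Natural join on the last column of r and the first column of s."""
    return [a + b[1:] for a in r for b in s if a[-1] == b[0]]

R = [("a", "n1"), ("b", "c")]          # "n1" encodes a labeled null
S = [("n1", "d"), ("c", "e")]

naive = join(R, S)                      # nulls join like any other value
# Null-free tuples in the naive answer are certain (for positive queries).
certain = [t for t in naive if not any(v.startswith("n") for v in t)]
```

The paper's key observation, that naive evaluation coincides with certain answers exactly when the query is monotone under the semantics-induced ordering, explains why this recipe is safe for positive queries but fails once negation enters.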
2018
Consistent query answering is a principled approach for querying inconsistent knowledge bases. It relies on the notion of a repair, that is, a maximal consistent subset of the facts in the knowledge base. One drawback of this approach is that entire facts are deleted to resolve inconsistency, even if they may still contain useful "reliable" information. To overcome this limitation, we propose a new notion of repair allowing values within facts to be updated for restoring consistency. This more fine-grained repair primitive allows us to preserve more information in the knowledge base. We also introduce the notion of a universal repair, which is a compact representation of all repairs. Then, we show that consistent query answering in our framework is intractable (coNP-complete). In light of this result, we develop a polynomial time approximation algorithm for computing a sound (but possibly incomplete) set of consistent query answers.
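The value-level repair idea above can be illustrated with a small sketch. This is not the paper's exact construction (the `value_repair` function and its null-naming scheme are hypothetical): where a key constraint is violated, the conflicting dependent value is replaced by a placeholder null, so the rest of each fact is preserved instead of the whole fact being deleted.

```python
# Illustrative sketch of value-based repair: resolve a key violation by
# updating the conflicting value to a fresh placeholder null, preserving
# the rest of the fact, rather than deleting whole facts.

from collections import defaultdict

def value_repair(facts, key_idx, val_idx):
    """Group facts by key; where the dependent value disagrees,
    replace it with a placeholder null instead of dropping facts."""
    groups = defaultdict(list)
    for f in facts:
        groups[f[key_idx]].append(f)
    repaired, fresh = [], 0
    for key, fs in groups.items():
        vals = {f[val_idx] for f in fs}
        if len(vals) == 1:
            repaired.extend(fs)          # no conflict on this key
        else:
            fresh += 1
            null = f"_n{fresh}"          # fresh placeholder null
            for f in fs:
                g = list(f)
                g[val_idx] = null
                repaired.append(tuple(g))
    return repaired

facts = [("alice", "rome"), ("alice", "paris"), ("bob", "oslo")]
rep = value_repair(facts, 0, 1)
# alice's conflicting cities become a shared null; bob is untouched.
```

A universal repair in the paper's sense plays a similar role at a more principled level: the null compactly represents all concrete ways of restoring consistency.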
Proceedings of the 2015 ACM SIGMOD International Conference on Management of Data, 2015
In many applications including loosely coupled cloud databases, collaborative editing and network monitoring, data from multiple sources is regularly used for query answering. For reasons such as system failures, insufficient author knowledge or network issues, data may be temporarily unavailable or generally nonexistent. Hence, not all data needed for query answering may be available. In this paper, we propose a natural class of completeness patterns, expressed by selections on database tables, to specify complete parts of database tables. We then show how to adapt the operators of relational algebra so that they manipulate these completeness patterns to compute completeness patterns pertaining to query answers. Our proposed algebra is computationally sound and complete with respect to the information that the patterns provide. We show that stronger completeness patterns can be obtained by considering not only the schema but also the database instance and we extend the algebra to take into account this additional information. We develop novel techniques to efficiently implement the computation of completeness patterns on query answers and demonstrate their scalability on real data.
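The pattern algebra described above can be sketched for a single operator. This is a hedged toy version (the dict encoding and `propagate_selection` are assumptions, not the paper's formalism): a completeness pattern is a partial assignment of columns to constants stating that the matching slice of a table is complete, and a selection keeps the patterns consistent with its predicate while strengthening them with it.

```python
# Sketch of completeness-pattern propagation through a selection
# sigma_{col = const}: a pattern {col: c, ...} asserts that the slice of
# the table matching it is complete. Patterns disjoint from the
# selection predicate are dropped; the rest are refined with it.

def propagate_selection(patterns, col, const):
    out = []
    for p in patterns:
        if col in p and p[col] != const:
            continue          # pattern's slice is disjoint from the selection
        q = dict(p)
        q[col] = const        # the answer is complete on this refined slice
        out.append(q)
    return out

pats = [{"city": "rome"}, {"year": 2015}]
res = propagate_selection(pats, "city", "oslo")
# {"city": "rome"} is disjoint from city = "oslo"; {"year": 2015}
# refines to the complete slice {city: oslo, year: 2015}.
```

The paper's algebra handles the full operator set (joins, projections, and so on) and additionally strengthens patterns using the database instance, which this schema-only sketch omits.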
Proceedings of the VLDB Endowment, 2014
A database is called uncertain if two or more tuples of the same relation are allowed to agree on their primary key. Intuitively, such tuples act as alternatives for each other. A repair (or possible world) of such an uncertain database is obtained by selecting a maximal number of tuples without ever selecting two tuples of the same relation that agree on their primary key. For a Boolean query q, the problem CERTAINTY(q) takes as input an uncertain database db and asks whether q evaluates to true on every repair of db. In recent years, the complexity of CERTAINTY(q) has been studied under different restrictions on q. These complexity studies have assumed no restrictions on the uncertain databases that are input to CERTAINTY(q). In practice, however, it may be known that these input databases are partially consistent, in the sense that they satisfy some dependencies (e.g., functional dependencies). In this article, we introduce the problem CERTAINTY(q) in the presence of a set...
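The definition of CERTAINTY(q) above admits a direct, if exponential, sketch (the `certainty` function is a hypothetical brute-force illustration, not an algorithm from the paper): key-equal tuples are alternatives, a repair picks exactly one per key group, and q is certain iff it holds in every repair.

```python
# Brute-force sketch of CERTAINTY(q) for a single relation with one key
# column: tuples sharing a primary key are alternatives; a repair picks
# exactly one tuple per key group; q is certain iff it holds in every
# repair. Exponential in the number of key groups, for illustration only.

from itertools import product
from collections import defaultdict

def certainty(tuples, key_idx, q):
    groups = defaultdict(list)
    for t in tuples:
        groups[t[key_idx]].append(t)
    # every choice of one tuple per key group is a repair
    return all(q(set(repair)) for repair in product(*groups.values()))

db = [("k1", "a"), ("k1", "b"), ("k2", "a")]
q = lambda rep: any(t[1] == "a" for t in rep)   # Boolean query: some value is 'a'
certainty(db, 0, q)
```

The complexity results discussed in the article concern when this exponential enumeration can be avoided, i.e., for which queries q the problem CERTAINTY(q) is tractable.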
SIAM Journal on Computing, 2014
When finding exact answers to a query over a large database is infeasible, it is natural to approximate the query by a more efficient one that comes from a class with good bounds on the complexity of query evaluation. In this paper we study such approximations for conjunctive queries. These queries are of special importance in databases, and we have a very good understanding of the classes that admit fast query evaluation, such as acyclic, or bounded (hyper)treewidth queries. We define approximations of a given query Q as queries from one of those classes that disagree with Q as little as possible. We concentrate on approximations that are guaranteed to return correct answers. We prove that for the above classes of tractable conjunctive queries, approximations always exist, and are at most polynomial in the size of the original query. This follows from general results we establish that relate closure properties of classes of conjunctive queries to the existence of approximations. We also show that in many cases, the size of approximations is bounded by the size of the query they approximate. We establish a number of results showing how combinatorial properties of queries affect properties of their approximations, study bounds on the number of approximations, as well as the complexity of finding and identifying approximations. The technical toolkit of the paper comes from the theory of graph homomorphisms, as we mainly work with tableaux of queries and characterize approximations via preorders based on the existence of homomorphisms. In particular, most of our results can be also interpreted as approximation or complexity results for directed graphs.
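The homomorphism machinery underlying the paper's characterization can be sketched concretely (the atom encoding and `homomorphism_exists` are assumed illustrations): a homomorphism from the tableau of Q into the tableau of an approximation Q' witnesses Q' ⊆ Q, so every answer Q' returns is a correct answer to Q.

```python
# Sketch of a tableau homomorphism check: atoms are tuples
# (relname, term, ...), variables are strings starting with '?', and
# everything else is a constant. A homomorphism from t1 into t2 maps
# variables to terms of t2 (fixing constants) so that every atom of t1
# lands in t2; its existence witnesses query containment.

from itertools import product

def homomorphism_exists(t1, t2):
    vars1 = sorted({x for atom in t1 for x in atom[1:] if x.startswith("?")})
    terms2 = sorted({x for atom in t2 for x in atom[1:]})
    for image in product(terms2, repeat=len(vars1)):
        h = dict(zip(vars1, image))
        mapped = {(a[0],) + tuple(h.get(x, x) for x in a[1:]) for a in t1}
        if mapped <= set(t2):
            return True
    return False

# A "triangle" query and an acyclic (here: single self-loop) pattern.
T1 = {("E", "?x", "?y"), ("E", "?y", "?z"), ("E", "?z", "?x")}
T2 = {("E", "?u", "?u")}
homomorphism_exists(T1, T2)   # the triangle collapses onto the self-loop
```

In the paper's terms, an underapproximation of Q from a tractable class is a query Q' in that class admitting such a homomorphism from Q's tableau while disagreeing with Q as little as possible.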