2006, Database Theory - ICDT 2007
We study containment of conjunctive queries that are evaluated over databases that may contain tuples with null values. We assume the semantics of SQL for single block queries with a SELECT DISTINCT clause. This problem ("null containment" for short) is different from containment over databases without null values and sometimes more difficult. We show that null-containment for boolean conjunctive queries is NP-complete while it is Π^P_2-complete for queries with distinguished variables. However, if no relation symbol is allowed to appear more than twice, then null-containment is polynomial, as it is for databases without nulls. If we add a unary test predicate IS NULL, as it is available in SQL, then containment becomes Π^P_2-hard for boolean queries, while it remains in Π^P_2 for arbitrary queries.
Lecture Notes in Computer Science, 2006
Query containment is a fundamental problem of databases. Given two queries q1 and q2, it asks whether the set of answers to q1 is included in the set of answers to q2 for any database. In this paper, we investigate this problem for conjunctive queries with negated subgoals. We use graph homomorphism as the core notion, which leads us to extend the results presented in [Ull97] and [WL03]. First, we exhibit sufficient (but not necessary) conditions for query containment based on special subgraphs of q2, which generalize that proposed in . As a corollary, we obtain a case where the time complexity of the problem decreases. From a practical viewpoint, these properties can be exploited in algorithms, as shown in the paper. Second, we propose an algorithm based on the exploration of a space of graphs, which improves existing algorithms.
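As a point of reference for the homomorphism notion mentioned above, here is a minimal sketch of the classical containment test for conjunctive queries without negation: q1 is contained in q2 exactly when some mapping of q2's variables into q1's terms sends the head of q2 to the head of q1 and every body atom of q2 to a body atom of q1. It does not handle negated subgoals, which is the case the abstract actually studies. The query encoding (a head tuple plus a list of `(relation, args)` atoms), the convention that variables start with an uppercase letter, and the names `is_variable` and `contained_in` are all assumptions made for illustration.

```python
from itertools import product


def is_variable(term):
    # Assumption for this sketch: variables are strings starting with an
    # uppercase letter; every other term is treated as a constant.
    return isinstance(term, str) and term[:1].isupper()


def contained_in(q1, q2):
    """Return True if q1 is contained in q2 (positive conjunctive queries only).

    Each query is (head, body): head is a tuple of terms, body is a list of
    (relation_name, args_tuple) atoms. The test searches by brute force for a
    homomorphism from q2 to q1, treating the terms of q1 as frozen constants.
    """
    head1, body1 = q1
    head2, body2 = q2
    if len(head1) != len(head2):
        return False
    vars2 = sorted({t for _, args in body2 for t in args if is_variable(t)}
                   | {t for t in head2 if is_variable(t)})
    terms1 = sorted({t for _, args in body1 for t in args} | set(head1))
    atoms1 = set(body1)
    for image in product(terms1, repeat=len(vars2)):
        h = dict(zip(vars2, image))
        # Constants of q2 are not keys of h, so they map to themselves.
        if tuple(h.get(t, t) for t in head2) != tuple(head1):
            continue
        if all((rel, tuple(h.get(t, t) for t in args)) in atoms1
               for rel, args in body2):
            return True
    return False


# q_cycle: ans(X) :- E(X, Y), E(Y, X)      q_edge: ans(X) :- E(X, Y)
q_cycle = (("X",), [("E", ("X", "Y")), ("E", ("Y", "X"))])
q_edge = (("X",), [("E", ("X", "Y"))])
```

With these two hypothetical queries, `contained_in(q_cycle, q_edge)` holds while `contained_in(q_edge, q_cycle)` does not. The search is exponential in the number of variables of q2, consistent with the NP-completeness of the problem.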
Lecture Notes in Computer Science, 2010
We consider the containment problem for conjunctive queries with atomic negation. Firstly, we refine an existing algorithm based on homomorphism checks, which itself improves other known algorithms in databases, and analyze it experimentally. Secondly, we present a new algorithm based on the translation of the containment problem into the problem of checking the unsatisfiability of a propositional logical formula, which allows us to use a SAT solver, and we experimentally compare both algorithms.
Information Processing Letters, 1996
2003
In this paper we study the following problem: how to test whether Q2 is contained in Q1, where Q1 and Q2 are conjunctive queries with arithmetic comparisons? This problem is fundamental in a large variety of database applications. Existing algorithms first normalize the queries, then test a logical implication using multiple containment mappings from Q1 to Q2. We are interested in cases where the containment can be tested more efficiently. This work is mainly motivated by (1) reducing the problem complexity from Π^P_2-completeness to NP-completeness in these cases; and (2) utilizing the advantages of the homomorphism property (i.e., the containment test is based on a single containment mapping), in applications such as those of answering queries using views. The following are our results. We show several cases where the normalization step is not needed, thus reducing the size of the queries and, more importantly, reducing the number of containment mappings.
A relational database is said to be uncertain if primary key constraints can possibly be violated. A repair (or possible world) of an uncertain database is obtained by selecting a maximal number of tuples without ever selecting two distinct tuples with the same primary key value. For any Boolean query q, CERTAINTY(q) is the problem that takes an uncertain database db on input, and asks whether q is true in every repair of db. The complexity of this problem has been particularly studied for q ranging over the class of self-join-free Boolean conjunctive queries. A research challenge is to determine, given q, whether CERTAINTY(q) belongs to complexity classes FO, P, or coNP-complete. In this paper, we combine existing techniques for studying the above complexity classification task. We show that for any self-join-free Boolean conjunctive query q, it can be decided whether or not CERTAINTY(q) is in FO. Further, for any self-join-free Boolean conjunctive query q, CERTAINTY(q) is either in P or coNP-complete, and the complexity dichotomy is effective. This settles a research question that has been open for ten years, since [9].
Lecture Notes in Computer Science, 1998
In this paper, we consider conjunctive queries with built-in predicates of the form X < Y, X ≤ Y, X = Y, or X ≠ Y, where X and Y are variables or constants from a totally ordered domain. We present a sufficient and necessary condition to test containment among these kinds of queries. Klug [8] left the problem open for the case when the domain is nondense, like the integers. Ullman [11] gave only a sufficient condition for the containment of conjunctive queries with built-in predicates and integer variables. Our test is based on a method that uses a new idea: the representation of an infinite number of databases by a finite set of what we call canonical databases, which use variables that denote uninterpreted constants.
IEEE Transactions on Knowledge and Data Engineering, 2000
It is striking that the optimization of disjunctive queries (i.e., those which contain at least one or-connective in the query predicate) has been vastly neglected in the literature, as well as in commercial systems. In this paper, we propose a novel technique, called bypass processing, for evaluating such disjunctive queries. The bypass processing technique is based on new selection and join operators that produce two output streams: the true-stream with tuples satisfying the selection (join) predicate and the false-stream with tuples not satisfying the corresponding predicate. Splitting the tuple streams in this way enables us to "bypass" costly predicates whenever the "fate" of the corresponding tuple (stream) can be determined without evaluating this predicate. In the paper, we show how to systematically generate bypass evaluation plans utilizing a bottom-up building block approach. We show that our evaluation technique allows us to incorporate the standard SQL semantics of null values. For this, we devise two different approaches: one is based on explicitly incorporating three-valued logic into the evaluation plans; the other one relies on two-valued logic by "moving" all negations to atomic conditions of the selection predicate. We describe how to extend an iterator-based query engine to support bypass evaluation with little extra overhead. This query engine was used to quantitatively evaluate the bypass evaluation plans against the traditional evaluation techniques utilizing a CNF- or DNF-based query predicate.
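The two-stream idea described above can be illustrated with a small sketch. It is not the paper's operator or plan generator: it only shows a selection that splits its input into a true-stream and a false-stream, and a disjunction `cheap OR costly` evaluated so that the costly predicate is applied only to rows the cheap predicate rejected. The function names and the toy predicates are hypothetical, and the sketch assumes plain two-valued logic with no null values.

```python
def bypass_select(rows, predicate):
    """Split rows into (true_stream, false_stream) according to predicate."""
    true_stream, false_stream = [], []
    for row in rows:
        (true_stream if predicate(row) else false_stream).append(row)
    return true_stream, false_stream


def select_disjunction(rows, cheap, costly):
    """Evaluate `cheap OR costly`, bypassing `costly` where `cheap` holds."""
    cheap_true, cheap_false = bypass_select(rows, cheap)
    # Rows in cheap_true already satisfy the disjunction, so their fate is
    # known and they skip the costly predicate entirely.
    costly_true, _ = bypass_select(cheap_false, costly)
    return cheap_true + costly_true


costly_calls = []  # records which rows reached the costly predicate


def is_even(x):
    return x % 2 == 0


def costly_greater_than_three(x):
    costly_calls.append(x)
    return x > 3


result = select_disjunction([1, 2, 3, 4, 5], is_even, costly_greater_than_three)
```

In this toy run only the rows 1, 3, and 5 reach the costly predicate. Note that the output is the concatenation of the two true-streams, so input order is not preserved.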
2012
We study the complexity of computing a query on a probabilistic database. We consider unions of conjunctive queries, UCQ, which are equivalent to positive, existential First Order Logic sentences, and also to non-recursive datalog programs. The tuples in the database are independent random events. We prove the following dichotomy theorem. For every UCQ query, either its probability can be computed in polynomial time in the size of the database, or it is #P-hard. Our result also has applications to the problem of computing the probability of positive, Boolean expressions, and establishes a dichotomy for such classes based on their structure. For the tractable case, we give a very simple algorithm that alternates between two steps: applying the inclusion/exclusion formula, and removing one existential variable. A key and novel feature of this algorithm is that it avoids computing terms that cancel out in the inclusion/exclusion formula; in other words, it only computes those terms whose Möbius function in an appropriate lattice is non-zero. We show that this simple feature is a key ingredient needed to ensure completeness. For the hardness proof, we give a reduction from the counting problem for positive, partitioned 2CNF, which is known to be #P-complete. The hardness proof is non-trivial, and combines techniques from logic, classical algebra, and analysis.
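To make the semantics concrete, the following sketch computes the probability of a Boolean query on a tuple-independent database by enumerating every possible world. This is a naive exponential baseline, not the polynomial-time algorithm of the abstract; it only illustrates what "the probability of a query" means when tuples are independent events. The fact encoding, the function name `query_probability`, and the example queries are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def query_probability(tuple_probs, query):
    """Sum the weights of all possible worlds in which `query` is true.

    tuple_probs maps each fact to its marginal probability; facts are
    assumed independent. `query` takes a set of facts and returns a bool.
    """
    facts = list(tuple_probs)
    total = Fraction(0)
    for present in product([False, True], repeat=len(facts)):
        world = {f for f, keep in zip(facts, present) if keep}
        weight = Fraction(1)
        for f, keep in zip(facts, present):
            p = Fraction(tuple_probs[f])
            weight *= p if keep else 1 - p
        if query(world):
            total += weight
    return total


# Hypothetical database with two independent facts, each present with 1/2.
probs = {("R", "a"): Fraction(1, 2), ("S", "a"): Fraction(1, 2)}


def q_and(world):
    return ("R", "a") in world and ("S", "a") in world


def q_or(world):
    return ("R", "a") in world or ("S", "a") in world


p_and = query_probability(probs, q_and)  # 1/4
p_or = query_probability(probs, q_or)    # 3/4
```

The second value agrees with inclusion/exclusion on this example: 1/2 + 1/2 - 1/4 = 3/4.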
2010
A natural way of capturing uncertainty in the relational data model is by allowing relations that violate their primary key. A repair of such a relation is obtained by selecting a maximal number of tuples without ever selecting two tuples that agree on their primary key. Given a Boolean query q, CERTAINTY(q) is the problem that takes as input a relational database and asks whether q evaluates to true on every repair of that database. In recent years, CERTAINTY(q) has been studied primarily for conjunctive queries. Conditions have been determined under which CERTAINTY(q) is coNP-complete, first-order expressible, or not first-order expressible. A remaining open question was whether there exist conjunctive queries q without self-join such that CERTAINTY(q) is in PTIME but not first-order expressible. We answer this question affirmatively.
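The definitions of repair and CERTAINTY(q) above can be sketched by brute force: group tuples into blocks that agree on their primary key, pick exactly one tuple per block to form each repair, and check the query on every repair. The number of repairs is exponential in general, so this is only an illustration of the definitions, not a practical algorithm. The database encoding, the `key_len` convention (the first k positions of each relation form its key), and the example query are hypothetical.

```python
from itertools import product


def repairs(db, key_len):
    """Yield every repair of db.

    db maps a relation name to a list of tuples; key_len maps a relation
    name to the number of leading positions that form its primary key.
    """
    blocks = {}
    for rel, rows in db.items():
        for row in rows:
            blocks.setdefault((rel, row[:key_len[rel]]), []).append((rel, row))
    # A repair keeps exactly one tuple from every block of key-equal tuples.
    for choice in product(*blocks.values()):
        repair = {rel: set() for rel in db}
        for rel, row in choice:
            repair[rel].add(row)
        yield repair


def certain(db, key_len, query):
    """Return True if the Boolean query holds in every repair of db."""
    return all(query(r) for r in repairs(db, key_len))


def q(repair):
    # Boolean conjunctive query: exists x, y such that R(x, y) and S(y).
    return any((y,) in repair["S"] for (_, y) in repair["R"])


key_len = {"R": 1, "S": 1}
db_certain = {"R": [("a", 1), ("a", 2), ("b", 1)], "S": [(1,)]}
db_uncertain = {"R": [("a", 1), ("a", 2)], "S": [(1,)]}
```

In `db_certain` the tuple `("b", 1)` survives in both repairs and joins with `S(1)`, so the query is certain; in `db_uncertain` the repair that keeps `("a", 2)` falsifies it.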
1999
We present a new method that checks Query Containment for queries with negated derived atoms and/or integrity constraints. Existing methods for Query Containment checking that deal with these cases do not actually check containment but another related property called uniform containment, which is a sufficient but not necessary condition for containment. Our method can be seen as an extension of the canonical databases approach beyond the class of conjunctive queries.
Lecture Notes in Computer Science, 2004
In this paper we study the following problem: how to test whether Q2 is contained in Q1, where Q1 and Q2 are conjunctive queries with arithmetic comparisons? This problem is fundamental in a large variety of database applications. Existing algorithms first normalize the queries, then test a logical implication using multiple containment mappings from Q1 to Q2. We are interested in cases where the containment can be tested more efficiently. This work is mainly motivated by (1) reducing the problem complexity from Π^P_2-completeness to NP-completeness in these cases; and (2) utilizing the advantages of the homomorphism property (i.e., the containment test is based on a single containment mapping), in applications such as those of answering queries using views. The following are our results. We show several cases where the normalization step is not needed, thus reducing the size of the queries and, more importantly, reducing the number of containment mappings.
Journal of the ACM, 2001
This paper deals with the evaluation of acyclic Boolean conjunctive queries in relational databases. By well-known results of Yannakakis [1981], this problem is solvable in polynomial time; its precise complexity, however, has not been pinpointed so far. We show that the problem of evaluating acyclic Boolean conjunctive queries is complete for LOGCFL, the class of decision problems that are logspace-reducible to a context-free language. Since LOGCFL is contained in AC^1 and NC^2, the evaluation problem of acyclic Boolean conjunctive queries is highly parallelizable. We present a parallel database algorithm solving this problem with a logarithmic number of parallel join operations. The algorithm is generalized to computing the output of relevant classes of non-Boolean queries. We also show that the acyclic versions of the following well-known database and AI problems are all LOGCFL-complete: The Query Output Tuple problem for conjunctive queries, Conjunctive Query Containment, Clause Subsumption, and Constraint Satisfaction. The LOGCFL-completeness result is extended to the class of queries of bounded treewidth and to other relevant query classes which are more general than the acyclic queries.
Lecture Notes in Computer Science, 2005
Kolaitis and Vardi pointed out that constraint satisfaction and conjunctive query containment are essentially the same problem. We study the Boolean conjunctive queries under a more detailed scope, where we investigate their counting problem by means of the algebraic approach through Galois theory, taking advantage of Post's lattice. We prove a trichotomy theorem for the generalized conjunctive query counting problem, showing this way that, contrary to the corresponding decision problems, constraint satisfaction and conjunctive-query containment differ for other computational goals. We also study the audit problem for conjunctive queries asking whether there exists a frozen variable in a given query. This problem is important in databases supporting statistical queries. We derive a dichotomy theorem for this audit problem that sheds more light on audit applicability within database systems.
Acta Informatica, 2002
In this paper, we present a general procedure to test conjunctive query containment. We divide the containment problem into four categories, taking into account the underlying semantics (set or bag theoretic) and the presence or absence of built-in predicates in the queries.
1977
We define the class of conjunctive queries in relational data bases, and the generalized join operator on relations.
Proceedings of the eighth ACM SIGACT-SIGMOD-SIGART symposium on Principles of database systems - PODS '89, 1989
This paper presents an algorithm that decides whether two conjunctive query expressions always describe disjoint sets of tuples. The decision procedure solves an open problem identified by Blakeley, Coburn, and Larson: how to check whether an explicitly stored view relation must be recomputed after an update, taking into account functional dependencies. For nonconjunctive queries, the disjointness problem is NP-hard. For conjunctive queries, the time complexity of the algorithm given cannot be improved unless the reachability problem for directed graphs can be solved in sublinear time. The algorithm is novel in that it combines separate decision procedures for the theory of functional dependencies and for the theory of dense orders. Also, it uses tableaux that are capable of representing all six comparison operators <, ≤, =, ≥, >, and ≠.
Proceedings of the 34th ACM Symposium on Principles of Database Systems - PODS '15, 2015
A relational database is said to be uncertain if primary key constraints can possibly be violated. A repair (or possible world) of an uncertain database is obtained by selecting a maximal number of tuples without ever selecting two distinct tuples with the same primary key value. For any Boolean query q, CERTAINTY(q) is the problem that takes an uncertain database db as input, and asks whether q is true in every repair of db. The complexity of this problem has been particularly studied for q ranging over the class of self-join-free Boolean conjunctive queries. A research challenge is to determine, given q, whether CERTAINTY(q) belongs to complexity classes FO, P, or coNP-complete. In this paper, we combine existing techniques for studying the above complexity classification task. We show that for any self-join-free Boolean conjunctive query q, it can be decided whether or not CERTAINTY(q) is in FO. Further, for any self-join-free Boolean conjunctive query q, CERTAINTY(q) is either in P or coNP-complete, and the complexity dichotomy is effective. This settles a research question that has been open for ten years.
Proceedings of the 33rd ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, 2014
While checking containment of Datalog programs is undecidable, checking whether a Datalog program is contained in a union of conjunctive queries (UCQ), in the context of relational databases, or a union of conjunctive 2-way regular path queries (UC2RPQ), in the context of graph databases, is decidable. The complexity of these problems is, however, prohibitive: 2EXPTIME-complete. We investigate to what extent restrictions on UCQs and UC2RPQs, which have been known to reduce the complexity of query containment for these classes, yield a more "manageable" single-exponential time bound, which is the norm for several static analysis and verification tasks. Checking containment of a UCQ Θ′ in a UCQ Θ is NP-hard, in general, but better bounds can be obtained if Θ is restricted to belong to a tractable class of UCQs, e.g., a class of bounded treewidth or hypertreewidth. Also, each Datalog program Π is equivalent to an infinite union of CQs. This motivated us to study the question of whether restricting Θ to belong to a tractable class also helps alleviate the complexity of checking whether Π is contained in Θ. We study this question in detail and show that the situation is much more delicate than expected: First, tractability of UCQs does not help in general, but further restricting Θ to be acyclic and have a bounded number of shared variables between atoms yields better complexity bounds. As corollaries, we obtain that checking containment of Π in Θ is in EXPTIME if Θ is of treewidth one, or it is acyclic and the arity of the schema is fixed. In the case of UC2RPQs we show an EXPTIME bound when queries are acyclic and have a bounded number of edges connecting pairs of variables. As a corollary, we obtain that checking whether Π is contained in UC2RPQ Γ is in EXPTIME if Γ is a strongly acyclic UC2RPQ. Our positive results for UCQs and UC2RPQs are optimal, in a sense, since slightly extending the conditions turns the problem 2EXPTIME-complete.
Lecture Notes in Computer Science, 2011
Ontology-based data access (OBDA) is widely accepted as an important ingredient of the new generation of information systems. In the OBDA paradigm, potentially incomplete relational data is enriched by means of ontologies, representing intensional knowledge of the application domain. We consider the problem of conjunctive query answering in OBDA. Certain ontology languages have been identified as FO-rewritable (e.g., DL-Lite and sticky-join sets of TGDs), which means that the ontology can be incorporated into the user's query, thus reducing OBDA to standard relational query evaluation. However, all known query rewriting techniques produce queries that are exponentially large in the size of the user's query, which can be a serious issue for standard relational database engines. In this paper, we present a polynomial query rewriting for conjunctive queries under unary inclusion dependencies. On the other hand, we show that binary inclusion dependencies do not admit polynomial query rewriting algorithms.