1997, Proceedings of the …
The paper discusses the implementation of a view mechanism for semistructured data within the Lore database management system, highlighting its potential to enhance data adaptability for users and applications. It identifies unique motivations for employing views, such as enabling logical structuring of disparate data and creating standalone databases, while also addressing challenges like the absence of schemas and the integration of queries with objects. The work explores view specifications and query processing strategies, alongside future considerations for richer view definitions.
Proceedings of the 1999 ACM …, 1999
2002
The basic querying mechanism over semistructured data, namely regular path queries, asks for all pairs of objects that are connected by a path conforming to a regular expression. We consider conjunctive two-way regular path queries (C2RPQc's), which extend regular path queries with two features. First, they add the inverse operator, which allows for expressing navigations in the database that traverse the edges both backward and forward.
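To make the notion of a two-way regular path query concrete, the following is a minimal sketch, not taken from the paper. It assumes the database is a list of labeled edges and that the regular expression has already been compiled by hand into a small NFA. The function name `eval_rpq`, the `^-` suffix used to mark the inverse of a label, and the example data are all illustrative assumptions. Evaluation is a breadth-first search over pairs of (graph node, automaton state).

```python
from collections import deque


def eval_rpq(edges, nfa, start, accepting):
    """Return all pairs (x, y) joined by a path whose labels the NFA accepts.

    edges: iterable of (source, label, target) triples.
    nfa: dict mapping (state, symbol) to a set of next states; the symbol
         "a" follows an a-edge forward and "a^-" follows it backward
         (the "^-" suffix is a convention assumed for this sketch).
    """
    adjacency = {}
    nodes = set()
    for source, label, target in edges:
        nodes.update((source, target))
        adjacency.setdefault(source, []).append((label, target))
        # Register the inverse direction so two-way queries can use it.
        adjacency.setdefault(target, []).append((label + "^-", source))

    answers = set()
    for x in nodes:
        seen = {(x, start)}
        queue = deque(seen)
        while queue:
            node, state = queue.popleft()
            if state in accepting:
                answers.add((x, node))
            for symbol, neighbor in adjacency.get(node, []):
                for next_state in nfa.get((state, symbol), ()):
                    if (neighbor, next_state) not in seen:
                        seen.add((neighbor, next_state))
                        queue.append((neighbor, next_state))
    return answers


# Hypothetical example data: a and d both have child b; b has child c.
EDGES = [("a", "child", "b"), ("b", "child", "c"), ("d", "child", "b")]
# NFA for the expression child+ (one or more forward child edges).
CHILD_PLUS = {(0, "child"): {1}, (1, "child"): {1}}
# NFA for child . child^- (go down to a child, then back up to a parent).
SHARE_CHILD = {(0, "child"): {1}, (1, "child^-"): {2}}

if __name__ == "__main__":
    print(sorted(eval_rpq(EDGES, CHILD_PLUS, 0, {1})))
    print(sorted(eval_rpq(EDGES, SHARE_CHILD, 0, {2})))
```

The second query uses the inverse operator and so cannot be expressed as a one-way regular path query over this graph; it pairs up nodes that share a child.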
The most promising and dominant data format for data processing and representation on the Internet is the semistructured data form termed XML. XML data has no fixed schema; it evolves and is self-describing, which results in management difficulties compared to, for example, relational data. It is therefore a major challenge for the database community to design query processing techniques and storage methods that can retrieve semistructured data efficiently. In this paper, we present a querying scheme for semistructured data views of relational form. The proposed technique stores element paths, attributes, contents of the element paths and attributes, and XML processing instructions in a dynamic relational structure termed Multi-XML-Data-Structure (MXDS).
Semi-structured data is defined as irregular data with structure that may change rapidly or unpredictably. An example of such data can be found inside the World-Wide Web. Since the data is irregular, the user may not know the complete structure of the database. Thus, querying such data becomes a difficult issue. In order to write meaningful queries on semistructured data, there is a need for a query language that will support the features that are presented by this data. Standard query languages, such as SQL for relational databases and OQL for object databases, are too constraining for querying semi-structured data, because they require data to conform to a fixed schema before any data is stored into the database. This paper introduces Lorel, a query language developed particularly for querying semistructured data. Furthermore, it investigates if the standard query languages support any of the criteria presented for semi-structured data. The result is an evaluation of three query languages, SQL, OQL and Lorel against these criteria.
Lecture Notes in Computer Science, 1998
Several researchers have considered integrating multiple unstructured, semi-structured, and structured data sources by modeling all sources as edge-labeled graphs. Data in this model is self-describing and dynamically typed, and captures both schema and data information. The labels are arbitrary atomic values, such as strings, integers, reals, etc., and the integrated data graph is stored in a unique data repository, as a relation of edges. The relation is dynamically typed, i.e., each edge label is tagged with its type. Although the unique, labeled graph repository is flexible, it loses all static type information, and results in severe efficiency penalties compared to querying structured databases, such as relational or object-oriented databases. In this paper we propose an alternative method of storing and querying semi-structured data, using storage schemas, which are closely related to recently introduced graph schemas [BDFS97]. A storage schema splits the graph's edges into several relations, some of which may have labels of known types (such as strings or integers) while others may be still dynamically typed. We show here that all positive queries in UnQL, a query language for semistructured data, can be translated into conjunctive queries against the relations in the storage schema. This result may be surprising, because UnQL is a powerful language, featuring regular path expressions, restructuring queries, joins, and unions. We use this technique in order to translate queries on the integrated, semi-structured data into queries on the external sources. In this setting the integrated semi-structured data is not materialized but virtual, and the problem is to translate a query against the integrated view, possibly involving regular path expressions and restructuring, into queries which can be answered by the external sources. Here we use again the storage schema in order to split the graph into relations according to their sources. Any positive UnQL query is decomposed based on these relations and translated into queries on the external sources.
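The idea of splitting one dynamically typed edge relation into several relations can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy, not the paper's storage-schema construction: it partitions edges purely by the Python type of the label, whereas the abstract ties storage schemas to graph schemas, and it handles no regular path expressions. The names `split_by_label_type`, `names_of_entries`, and the relation names are hypothetical.

```python
def split_by_label_type(edges):
    """Partition (source, label, target) edges into per-type relations.

    String- and integer-labeled edges go to statically typed relations;
    everything else stays in a relation that carries an explicit type tag.
    """
    relations = {"str_edges": [], "int_edges": [], "dynamic_edges": []}
    for source, label, target in edges:
        if isinstance(label, str):
            relations["str_edges"].append((source, label, target))
        elif isinstance(label, int) and not isinstance(label, bool):
            relations["int_edges"].append((source, label, target))
        else:
            # Keep the type tag alongside the label, as in the single
            # dynamically typed edge relation described in the abstract.
            tagged = (source, type(label).__name__, label, target)
            relations["dynamic_edges"].append(tagged)
    return relations


def names_of_entries(relations):
    """Conjunctive query: x -entry-> y and y -name-> z; return (y, z) pairs.

    Only the string-typed relation needs to be scanned, which is the kind
    of saving a typed split is meant to allow.
    """
    str_edges = relations["str_edges"]
    return {
        (y, z)
        for (_, label1, y) in str_edges
        if label1 == "entry"
        for (y2, label2, z) in str_edges
        if label2 == "name" and y2 == y
    }


# Hypothetical example graph with string, integer, and real labels.
EXAMPLE_EDGES = [
    ("root", "entry", "e1"),
    ("e1", "name", "n1"),
    ("e1", 1997, "y1"),
    ("e1", 3.5, "r1"),
    ("root", "entry", "e2"),
    ("e2", "name", "n2"),
]
RELATIONS = split_by_label_type(EXAMPLE_EDGES)

if __name__ == "__main__":
    print(sorted(names_of_entries(RELATIONS)))
```

In this toy setting the join touches only one typed relation and never inspects a type tag.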
PROCEEDINGS OF THE INTERNATIONAL CONFERENCE ON VERY LARGE DATA BASES, 1999
Motivated to a large extent by the substantial and growing prominence of the World-Wide Web and the potential benefits that may be obtained by applying database concepts and techniques to web data management, new data models and query languages have emerged that contend with the semistructured nature of web data. These models organize data in graphs. The nodes in a graph denote objects or values, and each edge is labeled with a single word or phrase. Nodes are described by the labels of the paths that lead to ...
International journal on …, 1997
Abstract. We present the Lorel language, designed for querying semistructured data. Semistructured data is becoming more and more prevalent, e.g., in structured documents such as HTML and when performing simple integration of data from multiple sources. Traditional data models and ...
The most promising and dominant data format for data processing and representation on the Internet is the semistructured data form termed XML. XML data has no fixed schema; it evolves and is self-describing, which results in management difficulties compared to, for example, relational data. XML queries differ from relational queries in that the former are expressed as path expressions. The efficient handling of structural relationships has become a key factor in XML query processing. It is therefore a major challenge for the database community to design query processing techniques and storage methods that can manage semistructured data efficiently. The main contribution of this paper is querying semistructured data using a bitmap to represent path-value relationships and compressing the bitmap to save space. The presented bitmap indexing and querying scheme, termed BIQS, stores the element path, token of the word, attribute and document number in a dynamically created matrix structure. We use word, attribute and path dictionaries for the construction of a bitmap structure. This paper describes an algorithm to query semistructured data in a more time-efficient way than is provided by other relational and semistructured query processing techniques. The presented BIQS structure provides storage and query performance improvement due to the compression of semistructured data.
Motivated to a large extent by the substantial and growing prominence of the World-Wide Web and the potential benefits that may be obtained by applying database concepts and techniques to web data management, new data models and query languages have emerged that contend with web data. These models organize data in graphs where nodes denote objects or values and edges are labeled with single words or phrases. Nodes are described by the labels of the paths that lead to them, and these descriptions serve as the basis for querying. This paper proposes an extensible framework for capturing and querying meta-data properties in a semistructured data model. Properties such as temporal aspects of data, prices associated with data access, quality ratings associated with the data, and access restrictions on the data are considered. Specifically, the paper defines an extensible data model and an accompanying query language that provides new facilities for matching, slicing, collapsing, and coalescing properties. It also briefly introduces an implemented, SQL-like query language for the extended data model that includes additional constructs for the effective querying of graphs with properties.
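The statement that nodes are described by the labels of the paths leading to them can be sketched in a few lines. The sketch below is an illustration under stated assumptions and is not the paper's data model or query language: it ignores the meta-data properties the paper adds, bounds path length so that cyclic graphs terminate, and uses the hypothetical names `path_descriptions` and `nodes_with_path_suffix`.

```python
def path_descriptions(edges, root, max_len=3):
    """Map each node to the set of label paths (tuples) reaching it from root.

    edges: iterable of (source, label, target) triples.
    max_len: longest path considered; the bound is an assumption of this
             sketch that keeps the enumeration finite on cyclic graphs.
    """
    outgoing = {}
    for source, label, target in edges:
        outgoing.setdefault(source, []).append((label, target))

    descriptions = {root: {()}}  # the root is reached by the empty path
    frontier = [(root, ())]
    for _ in range(max_len):
        next_frontier = []
        for node, path in frontier:
            for label, target in outgoing.get(node, []):
                extended = path + (label,)
                descriptions.setdefault(target, set()).add(extended)
                next_frontier.append((target, extended))
        frontier = next_frontier
    return descriptions


def nodes_with_path_suffix(descriptions, suffix):
    """Select nodes that have at least one describing path ending in suffix."""
    suffix = tuple(suffix)
    n = len(suffix)
    return {
        node
        for node, paths in descriptions.items()
        if any(len(p) >= n and p[len(p) - n:] == suffix for p in paths)
    }


# Hypothetical example graph.
GUIDE = [
    ("root", "restaurant", "r1"),
    ("r1", "name", "n1"),
    ("r1", "price", "p1"),
    ("root", "cafe", "c1"),
    ("c1", "name", "n2"),
]
DESCRIPTIONS = path_descriptions(GUIDE, "root")

if __name__ == "__main__":
    print(sorted(nodes_with_path_suffix(DESCRIPTIONS, ["name"])))
    print(sorted(nodes_with_path_suffix(DESCRIPTIONS, ["restaurant", "name"])))
```

Querying then amounts to matching a pattern against these descriptions; here the pattern is a fixed suffix, which is far simpler than the matching facilities the abstract mentions.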
Proc. 5th Int. Conf. on Deductive and …, 1997
We show how the mechanisms of an active/deductive object-oriented model are capable of modeling the information available on the World Wide Web, the paradigmatic example of a large, distributed collection of poorly structured, heterogeneous information, consisting of documents connected by hypertext links. We exhibit the basic features of the object model, including the schema definition language, the query language with multiple roles, the basic update operations, and a form of active rules, and a compilation of the mentioned features into a fragment of LDL++, which is essentially Datalog extended with non-determinism and a form of stratified negation. The proposed compilation has a twofold aim. On one side, it should provide an abstract implementation level, where the object model is declaratively reconstructed and its semantics clarified. On the other side, the proposed compilation should form the basis of a realistic implementation, as LDL++ can be efficiently executed, and supports real side effects.
ACM Transactions on …, 2005
Proceedings of the 2002 …, 2002
LICS, 2000
View-based query processing requires answering a query posed to a database only on the basis of the information on a set of views, which are again queries over the same database. This problem is relevant in many aspects of database management, and has been addressed by means of two basic approaches, namely, query rewriting and query answering. In the former approach, one tries to compute a rewriting of the query in terms of the views, whereas in the latter, one aims at directly answering the query based on the view extensions. We study view-based query processing for the case of regular-path queries, which are the basic querying mechanisms for the emergent field of semistructured data. Based on recent results, we first show that a rewriting is in general a co-NP function with respect to the size of view extensions. Hence, the problem arises of characterizing which instances of the problem admit a rewriting that is PTIME. A second contribution of the work is to establish a tight connection between view-based query answering and constraint-satisfaction problems, which allows us to show that the above characterization is going to be difficult. As a third contribution of our work, we present two methods for computing PTIME rewritings of specific forms. The first method, which is based on the established connection with constraint-satisfaction problems, gives us rewritings expressed in Datalog with a fixed number of variables. The second method, based on automata-theoretic techniques, gives us rewritings that are formulated as unions of conjunctive regular-path queries with a fixed number of variables.
2007
As a result of the extensive research in view-based query processing, three notions have been identified as fundamental, namely rewriting, answering, and losslessness. Answering amounts to computing the tuples satisfying the query in all databases consistent with the views. Rewriting consists in first reformulating the query in terms of the views and then evaluating the rewriting over the view extensions. Losslessness holds if we can answer the query by solely relying on the content of the views.
1999
ABSTRACT. We extend the model for semi-structured data proposed in [BUN 97], where both databases and schemas are represented as graphs, with the possibility of expressing different types of constraints on the nodes of the graphs. We discuss how the expressive power of the constraint language may influence the complexity of checking subsumption between schemas, and devise a polynomial algorithm for an interesting class of constraints.
1999
Some aggregate and grouping queries are conceptually simple, but difficult to express in SQL. This difficulty causes both conceptual and implementation problems for the SQL-based database system. Complicated queries and views are hard to understand and maintain. Further, the code produced is sometimes unnecessarily inefficient, as we demonstrate experimentally using a commercial database system. In this paper, we examine a class of queries involving (potentially repeated) selection, grouping and aggregation over the same groups, and propose an extension of SQL syntax that allows the succinct representation of these queries. We propose a new relational algebra operation that represents several levels of aggregation over the same groups in an operand relation. We demonstrate that the extended relational operator can be evaluated using efficient algorithms. We describe a translation from the extended SQL language into our algebraic language. We have implemented a preprocessor that evaluates our extended language on top of a commercial
XML queries are based on path expressions which are composed of some elements connected to each other in a tree pattern structure, called Query Tree Pattern (QTP). Thus, the core operation of XML query processing is finding all instances of the QTP in the XML document. A number of methods have been offered for QTP matching, but they process too many elements in the XML document, most of which have no opportunity to participate in the final result. The existing techniques have many limitations and disadvantages, which are illustrated in detail in Chapter III. In this thesis, the author proposes a novel method which does not blindly process elements of the document. The author abstracts structural relationships inside the XML documents to evaluate the XML queries. In contrast to the existing methods, in the proposed method only elements which have a chance to produce a result are processed and those which are definitely not part of any final result are ignored. An XML query is either a cha...
1999
We extend the model for semi-structured data proposed in [4], where both databases and schemas are represented as graphs, with the possibility of expressing different types of constraints on the nodes of the graphs, and defining queries which are used to select graphs from a database. We show that reasoning tasks at the basis of query optimization, such as schema subsumption, query-schema comparison, query containment, and query satisfiability, are decidable.
Journal of Computing Science and Engineering, 2007
Existing systems that support semistructured views do not maintain semantics during the process of designing the views. Thus, these systems do not guarantee the validity and reversibility of the views. In this paper, we propose an approach to address the issue of valid and reversible semistructured views. We design a set of view operators for designing semistructured views. These operators are select, drop, join and swap. For each operator, we develop a complete set of rules to maintain the semantics of the views. In particular, we maintain the evolution and integrity of relationships once an operator is applied. We also examine the reversible view problem under our operators and develop rules to guarantee that the designed views are reversible. Finally, we examine the changes in the participation constraints of relationship types during the view design process, and develop rules to ensure the correctness of the participation constraints.
1998
We address the problem of null values and other forms of semi-structured data in object-oriented databases. Various aspects and issues concerning semi-structured data that are currently presented in the literature are discussed in the paper. We propose a new universal approach to semi-structured data based on the idea of absent objects. The idea covers null values and union types and can be smoothly combined with the idea of default values. We introduce a simple model of object store that is similar to the Tsimmis model. In contrast to the main stream of the research, our basic assumption is that semi-structured data are not only to be queried but also processed by an integrated query/programming language. To this end we discuss query language constructs that are relevant to query semi-structured data and corresponding issues in programming languages. The idea follows the stack-based approach to integrated query/programming languages that we have implemented in the LOQIS system. ...