XML is helping to advance the notion of using the web as a large database. XML can help view, store, manipulate, and transfer semi-structured data that exist in files, often webpages. XML is being developed in part because of the recognized need for a less ad hoc method of handling data than HTML allows. In support of the idea that the web is to be treated like a database, we introduce constraints to XML. Constraints allow for more specific data definitions than XML currently supports, and these definitions may span multiple webpages across the web. We also define more data types, so that constraints can be declared more easily.
2000
We present a constraint-based extension of XML for specifying the structure and semantic coherence of websites and their data. This extension is motivated by the fact that many websites, especially organizational websites and corporate intranets, largely contain structured information. Such websites can be regarded as databases. XML can help view, store, manipulate, and transfer semi-structured data that exists in files (often webpages) and facilitates a less ad hoc method of handling data than HTML allows. In support of the view that a website is a database, we introduce CobWeb, a constraint-based extension of XML. CobWeb allows developers to express the concept of the semantic integrity of a database. Constraints may be used in a document type definition (DTD) in order to place restrictions on the values of both elements and attributes in the DTD. These constraints can effectively govern the contents of otherwise disparate webpages in a website, thereby ensuring both the structural and the semantic integrity of the site as a whole. We provide unary (or domain) constraints, binary constraints (including various comparison operations), as well as aggregation constraints (sum, average, etc.). We also define data types not included in the XML 1.0 specification, so that constraints can be declared more easily. By taking advantage of XML's modular design, we can create a parser that works in conjunction with and extends existing XML parsers. This will allow developers to continue using their own parsers while taking advantage of CobWeb's constraint features.
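The three kinds of constraint named in this abstract (unary, binary, aggregation) can be illustrated with a minimal Python sketch. This is not CobWeb syntax and not the authors' parser: the element and attribute names (`order`, `item`, `price`, `discounted`, `total`) and the function `check_order` are assumptions made up for the example, and the checks run after parsing rather than inside a DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; all element and attribute names are illustrative only.
DOC = """<order>
  <item price="10.0" discounted="8.0"/>
  <item price="5.0" discounted="5.0"/>
  <total>13.0</total>
</order>"""


def check_order(xml_text):
    """Return a list of violated-constraint messages (empty if all hold)."""
    root = ET.fromstring(xml_text)
    errors = []
    items = root.findall("item")
    for i, item in enumerate(items):
        price = float(item.get("price"))
        discounted = float(item.get("discounted"))
        # Unary (domain) constraint: a single value must lie in a domain.
        if price < 0:
            errors.append(f"item {i}: price < 0")
        # Binary constraint: two values related by a comparison operator.
        if discounted > price:
            errors.append(f"item {i}: discounted > price")
    # Aggregation constraint: the sum over all items must equal <total>.
    total = float(root.findtext("total"))
    item_sum = sum(float(it.get("discounted")) for it in items)
    if abs(item_sum - total) > 1e-9:
        errors.append("sum(discounted) != total")
    return errors
```

Calling `check_order(DOC)` on the sample document returns an empty list; changing `<total>` to a value other than the sum of the discounted prices makes the aggregation check fail.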
Proceedings of the twelfth international conference on World Wide Web - WWW '03, 2003
Although originally designed for large-scale electronic publishing, XML plays an increasingly important role in the exchange of data on the Web. In fact, it is expected that XML will become the lingua franca of the Web, eventually replacing HTML. Not surprisingly, there has been a great deal of interest in XML both in industry and in academia. Nevertheless, to date no comprehensive study of the XML Web (i.e., the subset of the Web made of XML documents only) or of its contents has been made. This paper is the first attempt at describing the XML Web and the documents contained in it. Our results are drawn from a sample of a repository of the publicly available XML documents on the Web, consisting of about 200,000 documents. Our results show that, despite its short history, XML already permeates the Web, both in terms of generic domains and geographically. Also, our results about the contents of the XML Web provide valuable input for the design of algorithms, tools and systems that use XML in one form or another.
While we can take as a fact "the Web changes everything", we argue that "XML is the means" for such a change to make a significant step forward. We therefore regard XML-related research as the most promising and challenging direction for the community of database researchers. In this paper, we approach XML-related research by taking three progressive perspectives. We first consider XML as a data representation standard (in the small), then as a data interchange standard (in the large), and finally as a basis for building a new repository technology. After a broad and necessarily coarse-grain analysis, we turn our focus to three specific research projects which are currently ongoing at the Politecnico di Milano, concerned with XML query languages, with active document management, and with XML-based specifications of Web sites.
ACM SIGMOD Record, 2004
The EDBT'04 Workshop on Database Technologies for Handling XML Information on the Web (DataX'04) was held in Heraklion, Crete, on Sunday 14 March, 2004, and attracted approximately 30 participants from different countries.
Extensible Markup Language (XML) is a meta-language for defining new languages. Its impact on modern and emerging web technologies has been (and will be) considerable, and it has served as the foundation of a multitude of applications. This chapter is devoted to the presentation of XML and its applications. It provides an introduction to this wide topic, covering the principal arguments and providing references and examples.
Designed for librarians and library staff, this workshop introduces participants to the extensible markup language (XML) through numerous library examples, demonstrations, and structured hands-on exercises. Through this process you will be able to evaluate the uses of XML for making your library's data and information more accessible to people as well as computers. Examples include adding value to electronic texts, creating archival finding aids, and implementing standards-compliant Web pages.
2001
MSL (Model Schema Language) is an attempt to formalize some of the core ideas in XML Schema. The benefit of a formal description is that it is both concise and precise. MSL has already proved helpful in work on the design of XML Query. We expect that similar techniques can be used to extend MSL to include most or all of XML Schema.
Logics for Emerging Applications of Databases, 2004
Lecture Notes in Computer Science, 2002
Bulletin of the Technical Committee on, 2001
Sigmod Record, 2009
XML has established itself as the most widely used standard for representing Web data. An XML schema may be employed for purposes similar to those of database schemas. There are different languages for writing an XML schema, such as DTD and XSD. In this paper, we provide a general view, an X-Ray, of Web-available XSD files by identifying which XSD constructs are more and less frequently used. Furthermore, we provide an evolutionary perspective, showing results from XSD files collected in 2005 and 2008. Hence, we can also draw some conclusions about what trends seem to exist in XSD usage. The results of this study provide relevant information for developers of XML applications, tools and algorithms in which the schema has a distinguished role.
Proceedings of the International Conference on …, 2000
The amount of useful semi-structured data on the web continues to grow at a stunning pace. Often interesting web data are not in database systems but in HTML pages, XML pages, or text files. Data in these formats is not directly usable by standard SQL-like query processing ...
Proceedings of the 2006 ACM symposium on …, 2006
StandardView, 1998
2004
In the past few years, a number of constraint languages for XML documents have been proposed. They are collectively called schema languages or validation languages and they comprise, among others, DTD, XML Schema, RELAX NG, Schematron, DSD, and xlinkit. One major point of discrimination among schema languages is the support of co-constraints, or co-occurrence constraints, e.g., requiring that attribute A is present if and only if attribute B is (or is not) present in the same element.
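The "A is present if and only if B is present" co-constraint mentioned above can be sketched as a post-parse check in Python. This is an illustrative assumption, not how any of the listed schema languages implements it; the function name `violates_iff` and the sample attribute names are hypothetical.

```python
import xml.etree.ElementTree as ET


def violates_iff(xml_text, attr_a, attr_b):
    """Return the tags of elements where exactly one of attr_a / attr_b is present.

    An element satisfies the co-occurrence constraint when both attributes
    are present or both are absent.
    """
    root = ET.fromstring(xml_text)
    return [
        el.tag
        for el in root.iter()  # the root itself and all descendants
        if (attr_a in el.attrib) != (attr_b in el.attrib)
    ]
```

For instance, `violates_iff('<r><p a="1" b="2"/><q a="1"/><s/></r>', "a", "b")` reports only `q`, the one element that carries `a` without `b`.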
2007
XML Schema supports the specification of occurrence constraints by declaring values for its minOccurs and maxOccurs attributes. These constraints are structural in the sense that they restrict the cardinality of child elements independently of the data that is present in their subtrees. Thus, occurrence constraints do not address any data semantics and are therefore only of limited interest to data modelling.
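A small Python sketch, under stated assumptions, shows why such a constraint is purely structural: the check counts child elements and never looks at their content. The helper `check_occurs` and the sample element names are hypothetical; the defaults of 1 mirror the XML Schema defaults for minOccurs and maxOccurs, and `None` stands in for `maxOccurs="unbounded"`.

```python
import xml.etree.ElementTree as ET


def check_occurs(parent, child_tag, min_occurs=1, max_occurs=1):
    """True if the number of direct children named child_tag is within bounds.

    max_occurs=None plays the role of maxOccurs="unbounded".
    """
    n = len(parent.findall(child_tag))  # findall with a bare tag matches direct children only
    return n >= min_occurs and (max_occurs is None or n <= max_occurs)


# The member values are nonsensical as data, yet the cardinality check passes:
# occurrence constraints say nothing about what the children contain.
team = ET.fromstring("<team><member>-5</member><member>???</member></team>")
within_bounds = check_occurs(team, "member", min_occurs=1, max_occurs=2)
```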
Dke, 2007
XML has evolved as the new standard for the representation and exchange of both structured and semistructured data. XML's ability to succinctly describe complex information can also be used for specifying application meta-data. XML's popularity is evident from its use in a wide spectrum of application domains: from document publication, to computational chemistry, health care and life sciences, multimedia and e-commerce. The increasing popularity of web-based business and the emergence of web services that use XML-based descriptions in WSDL and exchange XML messages with SOAP have led to further acceptance of XML. The purpose of the second XML Data and Schema Management Workshop was to provide a forum for the exchange of ideas and experiences among the theoreticians and practitioners who are involved in the design, management, and implementation of XML data management systems. It was held on 3 April, 2005 in conjunction with the 21st IEEE International Conference on Data Engineering (ICDE 2005). The workshop had a full-day program that included 12 regular papers divided into three sessions. Many interesting discussions arose during the workshop among the presenters and the audience on almost every XML data management topic. This special issue of Data and Knowledge Engineering features four papers selected from the XSDM 2005 workshop based on their merits and relevance. Each of these papers is an extended and revised version of the original workshop paper and has gone through a rigorous reviewing process before being accepted for inclusion in this special issue.
In the first paper, entitled "QMatch-A Hybrid Match Algorithm for XML Schemas", Tansalarak and Claypool propose a new hybrid schema match algorithm, QMatch, that provides a unique path-based framework for harnessing traditional structural and semantic information, while exploiting the constraints inherent in XML documents, such as the order of XML elements, to provide improved levels of matching between two given XML schemata. QMatch is based on the measurement of a unique quality of match metric, QoM, and a set of classifiers. They experimentally demonstrate the benefits of QMatch over existing algorithms such as Cupid. Efficient processing of XML queries is a key area of research in XML data management. The next two papers thus focus on efficient XML query evaluation. Chen et al., in their paper titled "Index Structures for Matching XML Twigs Using Relational Query Processors", address some of the limitations of existing XML path indices. They present a framework defining a family of index structures that includes most existing XML path indices. They also propose two novel index structures with different space-time tradeoffs that are effective for the evaluation of XML twig queries with value conditions. They also show how this family of index structures can be tightly integrated with a relational query processor. The next paper, entitled "Optimization of Nested XQuery Expressions with Orderby Clauses" by Wang, Rundensteiner, and Mani, presents a technique for XQuery optimization. They propose an algebraic rewriting technique for nested XQuery expressions containing explicit ORDERBY clauses. Their technique is based on
CLEI Electronic Journal, 2018
After being able to mark up text and validate its structure according to a document type specification, we may start thinking it would be natural to be able to validate some non-structural issues in the documents. This paper formally discusses semantic-related aspects. In that context, we introduce a domain-specific language developed for such a purpose: XCSL. XCSL is not just a language; it is also a processing model. Furthermore, we discuss the general philosophy underlying the proposed approach, presenting the architecture of our semantic validation system, and we detail the respective processor. To illustrate the use of the XCSL language and the subsequent processing, we present two case studies. Nowadays, we can find some other languages to restrict XML documents to those semantically valid, namely Schematron and XML-Schema. So, before concluding the paper, we compare XCSL to those approaches.
International Journal of Advanced Corporate Learning (iJAC), 2010
Database normalization is a process which eliminates redundancy, organizes data efficiently and improves data consistency. Functional, multivalued, and join dependencies (FDs, MVDs, and JDs) play fundamental roles in relational databases, where they provide semantics for the data and at the same time are the foundations for database design. In this study we investigate the issue of defining functional, multivalued and join dependencies and their normal forms in the XML database model. We show that, like relational databases, XML documents may contain redundant information, and this redundancy may cause update anomalies. Furthermore, such problems are caused by certain dependencies among paths in the document. Our goal is to find a way of converting an arbitrary XML Schema to a well-designed one that avoids these problems. We extend the notion of tuple for relational databases to the XML model. We show that an XML tree can be represented as a set of tree tuples. We introduce the definiti...