Semistructured Data

539 papers
3,167 followers
Semistructured data refers to information that does not conform to a fixed schema but still contains organizational properties, such as tags or markers, to separate semantic elements. This type of data is often found in formats like XML or JSON, allowing for flexibility in data representation while maintaining some level of structure.
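As a minimal sketch of the definition above, the following Python snippet loads a few JSON records that share no fixed schema and reports which field paths occur in how many records. The records, the field names, and the helper functions `field_paths` and `summarize` are hypothetical and chosen only for illustration; they are not taken from any of the papers listed here.

```python
import json

# Hypothetical records: one collection, but each record carries different fields.
RAW = """
[
  {"title": "Paper A", "authors": ["X. Example"], "year": 1997},
  {"title": "Paper B", "venue": {"name": "Workshop W", "pages": "1-10"}},
  {"title": "Paper C", "authors": ["Y. Example", "Z. Example"], "keywords": ["XML"]}
]
"""


def field_paths(value, prefix=""):
    """Yield a dotted path for every leaf value in a nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from field_paths(child, f"{prefix}.{key}" if prefix else key)
    elif isinstance(value, list):
        # List elements share one path, marked with "[]".
        for child in value:
            yield from field_paths(child, prefix + "[]")
    else:
        yield prefix


def summarize(records):
    """Map each field path to the number of records that contain it."""
    counts = {}
    for record in records:
        for path in set(field_paths(record)):
            counts[path] = counts.get(path, 0) + 1
    return counts


records = json.loads(RAW)
summary = summarize(records)
for path in sorted(summary):
    print(f"{path}: {summary[path]} of {len(records)} records")
```

The keys act as the tags or markers that separate semantic elements, while the per-path counts make the missing fixed schema visible: only `title` appears in every record in this made-up sample.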
We present a Heterogeneous Data Quality Methodology (HDQM) for Data Quality (DQ) assessment and improvement that considers all types of data managed in an organization, namely structured data represented in databases, semistructured data... more
We posit that a semistructured data model offers the right balance of rich structure and flexible (or lack of) schema allowing naive end users to record information in whatever form makes it easy for them to manage. We describe our... more
The ISO 9001 standard is adopted worldwide by organizations from different sectors. The ISO 9001:2015 guidelines for implementing the process approach require not only the identification of the organization processes, but also the... more
The International Journal of Software Engineering & Applications (IJSEA) is a bi-monthly, open-access, peer-reviewed journal that publishes articles contributing new results in all areas of Software Engineering & Applications. The... more
Because of the widespread diffusion of semistructured data in XML format, much research effort is currently devoted to support the storage and retrieval of large collections of such documents. XML documents can be compared as to their... more
Chapter 1 of this report outlines aspects of querying and updating resources on the Web and on the Semantic Web, including the development of query and update languages to be carried out within the Rewerse project.
We often use UML diagrams for our software development projects, and also for modeling XML DTDs and Schemas, finding that although UML diagrams can effectively be made to represent DTDs and Schemas (either using Class or Component... more
A commercial Web page typically contains many information blocks. Apart from the main content blocks, it usually has such blocks as navigation panels, copyright and privacy notices, and advertisements (for business purposes and for easy... more
Recent IR extensions to XML query languages such as XPath 1.0 Full-Text or the NEXI query language of the INEX benchmark series reflect the emerging interest in IR-style ranked retrieval over semistructured data. TopX is a top-k retrieval... more
Existing systems for managing and querying semistructured-data sources store the schema with the data. Lorel [QRS+95] and Tsimmis [PGMW95] store their data as graphs. The schema is stored as attributes labeling the graph's edges.... more
Semistructured data is characterized by the lack of any fixed and rigid schema, although typically the data has some implicit structure.
Many Web data sources and APIs make their data available in XML, JSON, or a domain-specific semi-structured format, with the goal of making the data easily accessible and usable by Web application developers. Although such data formats... more
Data management systems manage data, extract useful information from it, and use that information to support decision making. These systems therefore include database systems, warehouses... more
Abstract. This paper presents structural recursion as the basis of the syntax and semantics of query languages for semistructured data and XML. We describe a simple and powerful query language based on pattern matching and show that it... more
by G. Terracina and 1 more
Nowadays, data can be represented and stored in different formats, ranging from unstructured data, typical of file systems, to semistructured data, typical of Web sources, to highly structured data, typical of relational database... more
Semi-structured data has become prevalent with the growth of the Internet. The data is usually stored in a traditional database system or in a specialized repository. While many information providers have presented their databases on the... more
In this paper we propose the Graphical sEmistructured teMporal data model (GEM), which is based on labeled graphs and allows one to represent in a uniform way semistructured data and their temporal aspects. In particular, we focus on... more
The diversity and availability of information sources on the World Wide Web has set the stage for integration and reuse at an unparalleled scale. There remain significant hurdles to exploiting the extent of the Web's resources in a... more
All query languages proposed for semistructured data share a common characteristic: the ability to traverse arbitrarily long paths in the data by means of regular path expressions. The expressive power of these languages lies in between... more
This work presents TQL, a query language for semistructured data that can be used to query XML files. TQL substitutes the standard path-based pattern-matching mechanism with a logic-based mechanism, where the... more
Recent work on semi-structured data has revitalized the interest in path queries, i.e., queries that ask for all pairs of objects in the database that are connected by a path conforming to a certain specification, in particular to a... more
In this paper we introduce the representative object, which uncovers the inherent schema(s) in semistructured, hierarchical data sources and provides a concise description of the structure of the data. Semistructured data, unlike data... more
The paper addresses the problem of integrating data based on an XML-like semistructured model with a data grid architecture based on an object-oriented model. This work is a continuation of previous work on object-to-relational wrappers... more
XML is emerging as the "universal" language for semistructured data description/exchange, and new issues regarding the management of XML data, both in terms of performance and usability, are becoming critical. The application of... more
We propose combining query approximation and query relaxation techniques in order to support flexible querying of heterogeneous data arising from lifelong learners' educational and work experiences. A key aim of such querying facilities... more
We show how the mechanisms of an active/deductive object-oriented model are capable of modeling the information available on the World Wide Web, the paradigmatic example of a large, distributed collection of poorly structured,... more
We consider semistructured data as rooted edge-labeled directed graphs, and path inclusion constraints on these graphs. In this paper, we show that we can extract from a finite datum D a finite set C_f(D) of word inclusions, which... more
Abstract: Semistructured databases require tailor-made concurrency control mechanisms since traditional solutions for the relational model have been shown to be inadequate. Such mechanisms need to take full advantage of the hierarchical... more
Structured data is easy to parse and process using simple software tools with effective results. However, most valuable information is typically found in unstructured data. The problem is that unstructured data... more