XML COMPACTION IMPROVEMENTS BASED ON BINARY STRING ENCODINGS

Ramez Alkhatib

XML COMPACTION IMPROVEMENTS BASED ON BINARY STRING ENCODINGS

Ramez Alkhatib

Sign up for access to the world's latest research

checkGet notified about relevant papers

checkSave papers to use in your research

checkJoin the discussion with peers

checkTrack your impact

Abstract

Due to the flexibility and the easy use of XML, it is nowadays widely used in a vast number of application areas and new information is increasingly being encoded as XML documents. Therefore, it is important to provide a repository for XML documents, which supports efficient management and storage of XML data. For this purpose, many proposals have been made, the most common ones are node labeling schemes. On the other hand, XML repeatedly uses tags to describe the data itself. This self-describing nature of XML makes it verbose with the result that the storage requirements of XML are often expanded and can be excessive. In addition, the increased size leads to increased costs for data manipulation. Therefore, it also seems natural to use compression techniques to increase the efficiency of storing and querying XML data. In our previous works, we aimed at combining the advantages of both areas (labeling and compaction technologies), Specially, we took advantage of XML structural peculiarities for attempting to reduce storage space requirements and to improve the efficiency of XML query processing using labeling schemes. In this paper, we continue our investigations on variations of binary string encoding forms to decrease the label size. Also We report the experimental results to examine the impact of binary string encoding on the query performance and the storage size needed to store the compacted XML documents.

Computer Science & Information Technology (CS & IT) Computer Science Conference Proceedings (CSCP)

Since Extensible Markup Language abbreviated as XML, became an official World Wide Web Consortium recommendation in 1998, XML has emerged as the predominant mechanism for data storage and exchange, in particular over the World Web. Due to the flexibility and the easy use of XML, it is nowadays widely used in a vast number of application areas and new information is increasingly being encoded as XML documents. Because of the widespread use of XML and the large amounts of data that are represented in XML, it is therefore important to provide a repository for XML documents, which supports efficient management and storage of XML data. Since the logical structure of an XML document is an ordered tree consisting of tree nodes, establishing a relationship between nodes is essential for processing the structural part of the queries. Therefore, tree navigation is essential to answer XML queries. For this purpose, many proposals have been made, the most common ones are node labeling schemes. On the other hand, XML repeatedly uses tags to describe the data itself. This self-describing nature of XML makes it verbose with the result that the storage requirements of XML are often expanded and can be excessive. In addition, the increased size leads to increased costs for data manipulation. Therefore, it also seems natural to use compression techniques to increase the efficiency of storing and querying XML data. In our previous works, we aimed at combining the advantages of both areas (labeling and compaction technologies), Specially, we took advantage of XML structural peculiarities for attempting to reduce storage space requirements and to improve the efficiency of XML query processing using labeling schemes. In this paper, we continue our investigations on variations of binary string encoding forms to decrease the label size. Also We report the experimental results to examine the impact of binary string encoding on reducing the storage size needed to store the compacted XML documents.

Log In

XML COMPACTION IMPROVEMENTS BASED ON BINARY STRING ENCODINGS

Sign up for access to the world's latest research

Abstract

Related papers