Efficient Compression and Storage of XML OLAP Cubes

Doulkifli Boukraa; Mohammed Amin Bouchoukh; Omar Boussaid

2014, International Journal of Data Warehousing and Mining

Abstract

In this paper, the authors present an approach to efficiently compress XML OLAP cubes. They propose a multidimensional snowflake schema of the cube as the basic physical configuration. In this configuration, the cube is composed of one XML fact document and as many XML documents as there are dimension hierarchy members. The basic configuration is then reorganized in two ways that deliberately add data redundancy, in order to achieve a better compression ratio on the one hand and to improve query response time on the other. In the second configuration, all the documents of the cube are merged into a single XML document. In the third configuration, each reference between the fact and the dimensions, or between the members of a dimension hierarchy, is replaced by the whole referenced XML fragment. The authors apply a new compression technique named XCC to all three physical configurations. They demonstrate the efficiency of the third configuration before and after compression, and they show the efficiency of their compression technique when applied to XML OLAP cubes.
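As a rough illustration of the third configuration, the sketch below replaces each reference with a copy of the fragment it points to, using Python's standard `xml.etree.ElementTree`. The element names, the `id` and `ref` attributes, and the sample data are all hypothetical, since the abstract does not show the authors' actual schema. The sketch covers only the reorganization step, not the XCC compression technique.

```python
import copy
import xml.etree.ElementTree as ET

# Hypothetical snowflake-style documents: one fact document and one
# document per dimension hierarchy level. All names are illustrative
# assumptions, not the schema used in the paper.
FACTS_XML = """
<facts>
  <fact amount="120"><cityRef ref="c1"/></fact>
  <fact amount="80"><cityRef ref="c2"/></fact>
  <fact amount="45"><cityRef ref="c1"/></fact>
</facts>
"""

CITIES_XML = """
<cities>
  <city id="c1" name="Lyon"><countryRef ref="k1"/></city>
  <city id="c2" name="Oran"><countryRef ref="k2"/></city>
</cities>
"""

COUNTRIES_XML = """
<countries>
  <country id="k1" name="France"/>
  <country id="k2" name="Algeria"/>
</countries>
"""


def index_by_id(root):
    """Map each direct child's 'id' attribute to the child element."""
    return {child.get("id"): child for child in root}


def embed_references(element, lookup):
    """Replace every child that carries a 'ref' attribute with a deep copy
    of the referenced fragment, resolving nested references recursively."""
    for position, child in enumerate(list(element)):
        ref = child.get("ref")
        if ref is not None:
            fragment = copy.deepcopy(lookup[ref])
            embed_references(fragment, lookup)
            element.remove(child)
            element.insert(position, fragment)
        else:
            embed_references(child, lookup)
    return element


lookup = {
    **index_by_id(ET.fromstring(CITIES_XML)),
    **index_by_id(ET.fromstring(COUNTRIES_XML)),
}
merged = embed_references(ET.fromstring(FACTS_XML), lookup)
print(ET.tostring(merged, encoding="unicode"))
```

In the resulting document, the fragment for city `c1` appears under two facts. This is the kind of deliberate redundancy the abstract describes: the document grows, but a query no longer has to follow references across documents, and repeated fragments are the sort of input that compressors generally handle well.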

Yannis Kouvaras, Nikos Karayannidis

Lecture Notes in Computer Science, 2004

Hierarchical clustering has proven to be an effective means of physically organizing large fact tables, since it significantly reduces the I/O cost during ad hoc OLAP query evaluation. In this paper, we propose a novel multidimensional file structure for organizing the most detailed data of a cube, the CUBE File. The CUBE File achieves hierarchical clustering of the data, enabling fast access via hierarchical restrictions. Moreover, it imposes a low storage cost and adapts well to the extensive sparseness of the data space, achieving a high compression rate. Our results show that the CUBE File outperforms the most effective method proposed to date for hierarchically clustering the cube, resulting in 7-9 times fewer I/Os on average for all workloads tested. Thus, it achieves a higher degree of hierarchical clustering. Moreover, the CUBE File imposes a 2-3 times lower storage cost.
