2004, Proceedings. 20th International Conference on Data Engineering
Real datasets are often large enough to necessitate data compression. Traditional 'syntactic' data compression methods treat the table as a large byte string and operate at the byte level. The tradeoff in such cases is usually between ease of retrieval (how readily a single tuple or attribute value can be retrieved without decompressing a much larger unit) and the effectiveness of the compression. In this regard, the use of semantic compression has generated considerable interest and motivated several recent works.
Journal of Software, 2015
In recent years, the amount of data stored in databases has grown enormously with the widespread use of databases and the rapid adoption of information systems and data warehouse technologies. Storing and retrieving this growing volume of data efficiently is a challenge. Compression is attractive in database systems for two reasons: storage cost reduction and performance improvement. Lossy compression in databases can generally achieve better compression ratios than lossless compression, but it is rarely used because of the concern of losing data. For relational databases, standard compression techniques such as Gzip or Zip do not take advantage of relational properties, since they do not consider the nature of the data. In this paper, we propose a database compression system that takes advantage of attribute semantics and data-mining models to find frequent attribute patterns with maximum gain and compress massive tables. Furthermore, the suggested system relies on an augmented vector quantization (AVQ) algorithm to achieve lossless compression without losing any information. Extensive experiments were conducted, and the results indicate the superiority of the system with respect to previously known techniques.
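A minimal sketch of the frequent-pattern idea this abstract describes: attribute-value combinations that co-occur often enough are replaced by short codes from a codebook, and decompression inverts the codebook, so no information is lost. The table, attribute names, and min_support threshold below are illustrative assumptions, not the paper's actual AVQ algorithm.

    from collections import Counter

    def compress_rows(rows, min_support=2):
        """Replace frequently co-occurring (city, country) pairs with short codes."""
        pairs = Counter((r["city"], r["country"]) for r in rows)
        # A pattern earns a codebook entry only if it repeats often enough to pay off.
        codebook = {p: i for i, (p, n) in enumerate(pairs.items()) if n >= min_support}
        out = []
        for r in rows:
            p = (r["city"], r["country"])
            out.append({"code": codebook[p], "name": r["name"]} if p in codebook else dict(r))
        return codebook, out

    def decompress_rows(codebook, compressed):
        """Lossless inverse: expand codes back into the original attribute values."""
        inverse = {i: p for p, i in codebook.items()}
        rows = []
        for r in compressed:
            if "code" in r:
                city, country = inverse[r["code"]]
                rows.append({"name": r["name"], "city": city, "country": country})
            else:
                rows.append(dict(r))
        return rows

    rows = [
        {"name": "a", "city": "Cairo", "country": "Egypt"},
        {"name": "b", "city": "Cairo", "country": "Egypt"},
        {"name": "c", "city": "Oslo",  "country": "Norway"},
    ]
    codebook, comp = compress_rows(rows)
    assert decompress_rows(codebook, comp) == rows  # round trip is lossless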
Database Engineering …, 1997
This paper addresses the question of how information-theoretically derived compact representations can be applied in practice to improve storage and processing efficiency in DBMS. Compact data representation has the potential for savings in storage, access and processing costs throughout the systems architecture and may alter the balance of usage between disk and solid-state storage. To realise the potential performance benefits, however, novel systems engineering must be adopted to ensure that compression/decompression overheads are limited. This paper describes a basic approach to storage and processing of relations in a highly compressed form. A vertical column-wise representation is adopted in which columns can dynamically vary incrementally in both length and width. To achieve good performance, query processing is carried out directly on the compressed relational representation using a compressed representation of the query, thus avoiding decompression overheads. Measurements of performance of the Hibase prototype implementation are compared with those obtained from conventional DBMS.
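To illustrate querying a compressed representation directly, here is a toy sketch assuming simple per-column dictionary encoding: the query constant is itself translated into a code, so selection compares small integers and never decompresses the column. Hibase's dynamically resizable structures are far more elaborate than this.

    class CompressedColumn:
        """Dictionary-encoded column; rows hold small integer codes, not values."""

        def __init__(self, values):
            self.dictionary = {}   # value -> code
            self.codes = []        # one code per row
            for v in values:
                self.codes.append(self.dictionary.setdefault(v, len(self.dictionary)))

        def select_eq(self, constant):
            """Row ids where column == constant, comparing codes only (no decompression)."""
            code = self.dictionary.get(constant)
            if code is None:       # constant not in dictionary: provably empty result
                return []
            return [i for i, c in enumerate(self.codes) if c == code]

    col = CompressedColumn(["red", "blue", "red", "green", "red"])
    print(col.select_eq("red"))   # [0, 2, 4]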
ACM SIGMOD Record, 2001
Over the last decades, improvements in CPU speed have outpaced improvements in main memory and disk access rates by orders of magnitude, enabling the use of data compression techniques to improve the performance of database systems. Previous work describes the benefits of compression for numerical attributes, where data is stored in compressed format on disk. Despite the abundance of string-valued attributes in relational schemas, there is little work on compression for string attributes in a database context. Moreover, none of the previous work suitably addresses the role of the query optimizer: during query execution, data is either eagerly decompressed when it is read into main memory, or data lazily stays compressed in main memory and is decompressed on demand only. In this paper, we present an effective approach for database compression based on lightweight, attribute-level compression techniques. We propose a Hierarchical Dictionary Encoding strategy that intelligently selects ...
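As a rough illustration of hierarchical dictionary encoding for strings (a simplified reading of the strategy named above, not the paper's actual design): words are dictionary-coded, and each full string is then coded as a tuple of word codes that is itself dictionary-coded, so repeated words are shared across attribute values.

    word_dict, string_dict = {}, {}   # level 1: words; level 2: word-code sequences

    def encode(s: str) -> int:
        word_codes = tuple(word_dict.setdefault(w, len(word_dict)) for w in s.split())
        return string_dict.setdefault(word_codes, len(string_dict))

    def decode(code: int) -> str:
        inv_strings = {c: ws for ws, c in string_dict.items()}
        inv_words = {c: w for w, c in word_dict.items()}
        return " ".join(inv_words[c] for c in inv_strings[code])

    c = encode("New York City")
    assert decode(c) == "New York City"   # round trip is lossless
    assert encode("New Jersey") != c      # shared word "New" reuses its code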
2005
This paper proposes the compression of data in Relational Database Management Systems (RDBMS) using existing text compression algorithms. Although the proposed technique is general, we believe it is particularly advantageous for the compression of medium-size and large dimension tables in data warehouses. In fact, dimensions usually have a high number of text attributes, and a reduction in their size has a big impact on the execution time of queries that join dimensions with fact tables. In general, the high complexity and long execution time of most data warehouse queries make the compression of dimension text attributes (and any text attributes that may exist in the fact table, such as false facts) an effective approach to speeding up query response time. The proposed approach has been evaluated using the well-known TPC-H benchmark, and the results show that speed improvements greater than 40% can be achieved for most of the queries.
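A minimal sketch of the general approach, with zlib as a stand-in assumption for whichever text compressor is actually used: a dimension's text attribute is stored compressed and decompressed only when a query touches it. The column value is invented for illustration.

    import zlib

    def compress_attr(text: str) -> bytes:
        # Store the attribute value compressed; any general text compressor works.
        return zlib.compress(text.encode("utf-8"))

    def decompress_attr(blob: bytes) -> str:
        # Decompress only when the attribute is actually accessed by a query.
        return zlib.decompress(blob).decode("utf-8")

    comment = "expedited shipping requested; handle with care; " * 8
    blob = compress_attr(comment)
    assert decompress_attr(blob) == comment
    print(len(comment), "->", len(blob), "bytes")  # redundant text shrinks well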
2001
While a variety of lossy compression schemes have been developed for certain forms of digital data (e.g., images, audio, video), the area of lossy compression techniques for arbitrary data tables has been left relatively unexplored. Nevertheless, such techniques are clearly motivated by the ever-increasing data collection rates of modern enterprises and the need for effective, guaranteed-quality approximate answers to queries over massive relational data sets. In this paper, we propose Model-Based Semantic Compression (MBSC), a novel data-compression framework that takes advantage of attribute semantics and data-mining models to perform lossy compression of massive data tables. We describe the architecture and some of the key algorithms underlying SPARTAN, a model-based semantic compression system that exploits predictive data correlations and prescribed error tolerances for individual attributes to construct concise and accurate Classification and Regression Tree (CaRT) models for entire columns of a table. More precisely, SPARTAN selects a certain subset of attributes for which no values are explicitly stored in the compressed table; instead, concise CaRTs that predict these values (within the prescribed error bounds) are maintained. To restrict the huge search space and construction cost of possible CaRT predictors, SPARTAN employs sophisticated learning techniques and novel combinatorial optimization algorithms. Our experimentation with several real-life data sets has offered convincing evidence of the effectiveness of SPARTAN's model-based approach: SPARTAN is able to consistently yield substantially better compression ratios than existing semantic or syntactic compression tools (e.g., gzip) while utilizing only small data samples for model inference. Several promising directions for future research and possible applications of MBSC in the context of network management are identified and discussed.
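A minimal sketch of the CaRT idea under stated assumptions: one numeric column is dropped from storage, a small regression tree predicts it from a stored column, and only the rows where the prediction misses the error tolerance are kept explicitly. The synthetic data, scikit-learn tree, and tolerance value are illustrative; SPARTAN's attribute-selection and optimization machinery are not shown.

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.default_rng(0)
    protocol = rng.integers(0, 4, 1000)                   # stored column: 4 protocol ids
    base = np.array([40.0, 80.0, 120.0, 200.0])           # per-protocol mean payload size
    bytes_sent = base[protocol] + rng.normal(0, 3, 1000)  # predicted column, not stored

    tolerance = 10.0                                      # prescribed per-attribute error bound
    X = protocol.reshape(-1, 1)
    cart = DecisionTreeRegressor(max_leaf_nodes=8).fit(X, bytes_sent)
    pred = cart.predict(X)

    # Keep only the values the tree cannot predict within the tolerance.
    outliers = np.flatnonzero(np.abs(pred - bytes_sent) > tolerance)
    stored = {int(i): float(bytes_sent[i]) for i in outliers}
    print(f"explicitly stored: {len(stored)} of {len(bytes_sent)} values")

    # Decompression: predict every value, then patch the stored outliers.
    restored = cart.predict(X)
    for i, v in stored.items():
        restored[i] = v
    assert np.all(np.abs(restored - bytes_sent) <= tolerance)  # error guarantee holds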
IEEE Transactions on Parallel and Distributed Systems, 2016
The Semantic Web comprises enormous volumes of semi-structured data elements. For interoperability, these elements are represented by long strings. Such representations are not efficient for applications that perform computations over large volumes of such information. A common approach to alleviating this problem is the use of compression methods that produce more compact representations of the data. Dictionary encoding is particularly prevalent in Semantic Web database systems for this purpose. However, centralized implementations present performance bottlenecks, giving rise to the need for scalable, efficient distributed encoding schemes. In this paper, we propose an efficient algorithm for fast encoding of large Semantic Web datasets. Specifically, we present a detailed implementation of our approach based on the state-of-the-art asynchronous partitioned global address space (APGAS) parallel programming model. We evaluate performance on a cluster of up to 384 cores and datasets of up to 11 billion triples (1.9 TB). Compared to the state-of-the-art approach, we demonstrate a speed-up of 2.6–7.4× and excellent scalability. These results also illustrate the significant potential of the APGAS model for efficient implementation of dictionary encoding and contribute to the engineering of more efficient, larger-scale Semantic Web applications.
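To make the encoding task concrete, here is a toy sketch of the partitioned id-space idea, assuming terms are routed to partitions by hash and each partition assigns ids from its own disjoint range, so uniqueness needs no coordination. The paper's asynchronous APGAS implementation is far more sophisticated than this single-process stand-in.

    NUM_PARTITIONS = 4
    dictionaries = [dict() for _ in range(NUM_PARTITIONS)]  # one dictionary per worker

    def encode_term(term: str) -> int:
        # Route the term to a partition by hash (stable within one process).
        p = hash(term) % NUM_PARTITIONS
        local = dictionaries[p]
        if term not in local:
            # Interleave the partition-local counter with the partition id so
            # id ranges are disjoint across partitions without coordination.
            local[term] = len(local) * NUM_PARTITIONS + p
        return local[term]

    triple = ("<http://example.org/alice>",
              "<http://xmlns.com/foaf/0.1/knows>",
              "<http://example.org/bob>")
    print(tuple(encode_term(t) for t in triple))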
Lecture Notes in Computer Science, 2013
Linked data has experienced accelerated growth in recent years. With the continuing proliferation of structured data, demand for RDF compression is becoming increasingly important. In this study, we introduce a novel lossless compression technique for RDF datasets, called Rule Based Compression (RB Compression) that compresses datasets by generating a set of new logical rules from the dataset and removing triples that can be inferred from these rules. Unlike other compression techniques, our approach not only takes advantage of syntactic verbosity and data redundancy but also utilizes semantic associations present in the RDF graph. Depending on the nature of the dataset, our system is able to prune more than 50% of the original triples without affecting data integrity.
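A toy sketch of the rule-based idea, under the simplifying assumption that rules take the form (p1, o1) => (p2, o2) over shared subjects: a rule that holds for every matching subject is mined, the triples it implies are deleted, and decompression re-infers them losslessly. The five-triple dataset and brute-force miner are illustrative only.

    triples = {
        ("alice", "type", "Person"), ("alice", "species", "Human"),
        ("bob",   "type", "Person"), ("bob",   "species", "Human"),
        ("rex",   "type", "Dog"),
    }

    def mine_rule(ts):
        """Find one (p1, o1) => (p2, o2) rule that holds for every matching subject."""
        patterns = {(p, o) for _, p, o in ts}
        for head in patterns:
            subjects = {s for s, p, o in ts if (p, o) == head}
            for body in patterns - {head}:
                if all((s, *body) in ts for s in subjects):
                    return head, body
        return None

    head, body = mine_rule(triples)  # e.g. ('type', 'Person') => ('species', 'Human')

    # Compression: drop every triple the rule can re-infer.
    compressed = {t for t in triples
                  if not ((t[1], t[2]) == body and (t[0], *head) in triples)}

    # Decompression: re-apply the rule to restore the dropped triples.
    restored = set(compressed)
    for s, p, o in compressed:
        if (p, o) == head:
            restored.add((s, *body))
    assert restored == triples  # lossless round trip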
International Journal of Scientific & Engineering Research, 2013
Data compression has a paramount effect on data warehouses, reducing data size and improving query processing. Distinct compression techniques are feasible at different levels; each either gives a good compression ratio or is well suited to query processing. This paper focuses on applying lossless and lossy compression techniques to relational databases. The proposed technique works at the attribute level of a data warehouse, applying lossless compression to three types of attributes (string, integer, and float) and lossy compression to image attributes.
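As a rough sketch of the attribute-level dispatch described above, with codec choices that are stand-in assumptions rather than the paper's actual ones: lossless codecs for string, integer, and float columns, and a deliberately lossy reduction for an image column. Inverse (decompression) functions are omitted for brevity.

    import struct, zlib

    def compress_column(kind, values):
        if kind == "string":     # lossless: general text compression
            return zlib.compress("\x00".join(values).encode("utf-8"))
        if kind == "int":        # lossless: delta encoding, packed as 64-bit ints
            deltas = [values[0]] + [b - a for a, b in zip(values, values[1:])]
            return struct.pack(f"{len(deltas)}q", *deltas)
        if kind == "float":      # lossless: packed binary doubles
            return struct.pack(f"{len(values)}d", *values)
        if kind == "image":      # lossy: crude downsampling of raw pixel bytes
            return bytes(values[::2])
        raise ValueError(f"unknown attribute kind: {kind}")

    # Example: a sorted integer key column turns into small, compressible deltas.
    print(len(compress_column("int", [1000, 1001, 1003, 1006])), "bytes")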
Proceedings 2003 VLDB Conference, 2003
The Oracle RDBMS recently introduced an innovative compression technique for reducing the size of relational tables.