2000
Most modern lossless data compression techniques used today are based on dictionaries. If some string of data being compressed matches a portion previously seen, then such string is included in the dictionary and its reference is included every time it occurs. A possible generalization of this scheme is to consider not only strings made of consecutive symbols, but
MICAI 2004: Advances in Artificial Intelligence, 2004
Most modern lossless data compression techniques used today are based on dictionaries. If some string of data being compressed matches a portion previously seen, then that string is included in the dictionary and its reference is included every time it appears. A possible generalization of this scheme is to consider not only strings made of consecutive symbols, but also more general patterns with gaps between their symbols. The main problems with this approach are the complexity of pattern discovery algorithms and the complexity of selecting a good subset of patterns. In this paper we address the last of these problems. We demonstrate that this problem is NP-complete and we provide some preliminary results about heuristics that point toward its solution.
International Journal of Engineering Research and, 2015
Dictionary Based Compression is a useful technique through which we can encode variable-length strings of symbols as single tokens. There are a number of algorithms available for Dictionary Based Compression. Because it uses fewer computing resources, it is a very effective compression technique. The purpose of this paper is to present and analyze a variety of dictionary based algorithms.
In this paper, a new approach to dictionary-based lossless compression is introduced. In our two-pass compression algorithm (SSDC), the most frequently used two-character blocks (digrams) in the source file are found in the first pass, and in the second pass they are inserted into the free slots of the ASCII table that the document does not use. In our multi-pass algorithm (RSSDC), the two-pass algorithm is called recursively a particular number of times. In each iteration, "total free space / total number of iterations" of the free slots in the table is filled. In order to increase the compression ratio, we also extend the ASCII table to 512 characters by increasing the bits per character from 8 to 9.
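The two-pass idea in this abstract can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' SSDC implementation: the function names, the greedy left-to-right replacement, and the rule that a digram must occur at least twice are all choices made here for the example. The recursive (RSSDC) and 9-bit variants are not shown.

```python
from collections import Counter

def digram_compress(data: bytes):
    """Hypothetical two-pass digram substitution; returns (table, encoded)."""
    # Pass 1: count overlapping digrams and find byte values the input never uses.
    counts = Counter(zip(data, data[1:]))
    used = set(data)
    free_codes = [b for b in range(256) if b not in used]
    # Assign the most frequent digrams to the free byte values.
    pair_to_code = {}
    for code, (pair, n) in zip(free_codes, counts.most_common()):
        if n < 2:  # assumption: a digram seen once is not worth a table slot
            break
        pair_to_code[pair] = code
    # Pass 2: greedy left-to-right replacement of digrams by their codes.
    out = bytearray()
    i = 0
    while i < len(data):
        pair = tuple(data[i:i + 2])
        if len(pair) == 2 and pair in pair_to_code:
            out.append(pair_to_code[pair])
            i += 2
        else:
            out.append(data[i])
            i += 1
    table = {code: bytes(pair) for pair, code in pair_to_code.items()}
    return table, bytes(out)

def digram_decompress(table, encoded: bytes) -> bytes:
    # Codes were unused in the original input, so every byte decodes unambiguously.
    return b"".join(table.get(b, bytes([b])) for b in encoded)
```

Because the substitution codes are byte values absent from the input, the decoder needs only the table and the encoded bytes; in this sketch the table would have to be stored alongside the output.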
International Journal of Computer Applications, 2011
Compression is used just about everywhere. Both reducing the compression ratio and retrieving data from large collections are important in today's era. We propose a pre-compression technique that can be applied to text files. The output of our technique can then be passed to standard compression techniques, such as arithmetic coding and BZIP2, which yields a better compression ratio. The algorithm suggested here uses a dynamic dictionary created at run-time and is also suitable for searching for phrases in the compressed file.
BIT, 1985
A new technique for compression of character strings is presented. The technique is based on the use of a dictionary forest which is built simultaneously with the encoding and decoding. Codes representing substrings are addresses in the dictionary forest. Experimental results show that the length of the text can be reduced more than 50% with no a priori knowledge of the nature of the text.
2006
In this paper we introduce dictionary-symbolwise data compression schemes. We describe a method that, under some natural hypotheses, allows one to obtain an optimal parse of any input or of any part of an input. This method can also be used to approximate the optimal parse in the general case and, under some additional hypotheses, it gives rise to on-line data compression algorithms. Therefore it could be used to improve many common compression programs. As a second main contribution, we show how to use DAWGs and CDAWGs in a variant of the LZ77 compression scheme. In particular, we give an on-line linear implementation of our method in the case of dictionary-symbolwise algorithms with unbounded history and any on-line statistical coding.
Software - Practice and Experience, 2005
One of the attractive ways to increase text compression is to replace words with references to a text dictionary given in advance. Although there exist a few works in this area, they do not fully exploit the compression possibilities or consider alternative preprocessing variants for various compressors in the latter phase. In this paper, we discuss several aspects of dictionary-based compression, including compact dictionary representation, and present a PPM/BWCA oriented scheme, word replacing transformation, achieving compression ratios higher by 2-6% than the state-of-the-art StarNT (2003) text preprocessor, while working at a greater speed. We also present an alternative scheme designed for LZ77 compressors, with the advantage over StarNT reaching up to 14% in combination with gzip.
International Journal of Computer Applications, 2010
The advent of the modern electronic world has opened up various fronts in multimedia interaction. Multimedia is used in many fields for education, entertainment, research, and more. This has led to regular storage and retrieval of multimedia content. But due to the limitations of current technology, disk space and transmission bandwidth fall behind the requirements of multimedia content. This imposes a need to compress multimedia content so that it can be stored in less space and transferred easily from one point to another. An online dictionary-based compression technique can be applied to reduce the data packet size. When the repetition rate of the same symbols within the data is high, such compression techniques work very well. During encoding and decoding, building the online dictionary in primary memory ensures a single pass over the data, and the dictionary need not be transmitted over the network. Our proposed Improved Dictionary technique scans the data byte-wise, so that the chances of repetition of individual symbols are higher for text messages. Fixed-length coding transmits fixed-length codes for all dictionary entries. For bigger messages, better size reduction can be achieved through variable-length coding with the L-Z technique, where the transmitted code length for each dictionary entry varies dynamically according to need.
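The point that an online dictionary needs only one pass and never has to be transmitted can be illustrated with classic LZW, used here as a familiar stand-in rather than the Improved Dictionary technique the abstract proposes. The function names are hypothetical, and codes are left as plain integers, so neither the fixed-length nor the variable-length bit packing mentioned in the abstract is modelled.

```python
def lzw_encode(data: bytes) -> list[int]:
    # Both sides start from the same 256 single-byte entries.
    dictionary = {bytes([i]): i for i in range(256)}
    phrase = b""
    codes = []
    for byte in data:  # single byte-wise pass over the input
        candidate = phrase + bytes([byte])
        if candidate in dictionary:
            phrase = candidate
        else:
            codes.append(dictionary[phrase])
            dictionary[candidate] = len(dictionary)  # learn a new phrase
            phrase = bytes([byte])
    if phrase:
        codes.append(dictionary[phrase])
    return codes

def lzw_decode(codes: list[int]) -> bytes:
    if not codes:
        return b""
    # The decoder rebuilds the same dictionary from the code stream alone.
    dictionary = {i: bytes([i]) for i in range(256)}
    previous = dictionary[codes[0]]
    out = bytearray(previous)
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        else:
            # The code refers to the phrase the encoder had only just defined.
            entry = previous + previous[:1]
        out += entry
        dictionary[len(dictionary)] = previous + entry[:1]
        previous = entry
    return bytes(out)
```

On highly repetitive input the number of emitted codes falls well below the number of input bytes, which is the behaviour the abstract attributes to a high repetition rate.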
Information Sciences, 2001
Because of the size of the information involved in emerging applications in multimedia and the Human Genome Project, parallelism offers the only hope of meeting the challenges of storing such databases and searching through them quickly. In this paper, we address dictionary based lossless text compression and give the state-of-the-art in the field of parallelism. Static dictionary compression and sliding window
A data compression scheme that exploits locality of reference, such as occurs when words are used frequently over short intervals and then fall into long periods of disuse, is described. The scheme is based on a simple heuristic for self-organizing sequential search and on variable-length encodings of integers. We prove that it never performs much worse than Huffman coding and can perform substantially better; experiments on real files show that its performance is usually quite close to that of Huffman coding. Our scheme has many implementation advantages: it is simple, allows fast encoding and decoding, and requires only one pass over the data to be compressed (static Huffman coding takes two passes).
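A minimal sketch of the two ingredients named in this abstract follows. It assumes move-to-front as the self-organizing heuristic and Elias gamma as the variable-length integer code; the abstract itself names neither, so both are choices made for this example, as are the function names and the token layout.

```python
def mtf_encode(words):
    """Emit (position, new_word) tokens; position is 1-based in the recency list."""
    recent = []  # most recently used word first
    tokens = []
    for w in words:
        if w in recent:
            position = recent.index(w) + 1
            tokens.append((position, None))
            recent.remove(w)
        else:
            # An unseen word is signalled by a position one past the end of the list.
            tokens.append((len(recent) + 1, w))
        recent.insert(0, w)  # move (or add) the word to the front
    return tokens

def mtf_decode(tokens):
    recent = []
    words = []
    for position, new_word in tokens:
        w = recent.pop(position - 1) if new_word is None else new_word
        recent.insert(0, w)
        words.append(w)
    return words

def elias_gamma(n: int) -> str:
    """Bit string for n >= 1: small positions get short codes."""
    binary = bin(n)[2:]
    return "0" * (len(binary) - 1) + binary
```

Words reused within a short interval stay near the front of the list, so their positions are small and their integer codes are short, which is how locality of reference turns into compression in a single pass.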