2002
In-place reconstruction of delta compressed data allows information on devices with limited storage capability to be updated efficiently over low-bandwidth channels. Delta compression encodes a version of data compactly as a small set of changes from a previous version. Transmitting updates to data as delta versions saves both time and bandwidth. In-place reconstruction rebuilds the new version of the data in the storage or memory space the current version occupies; no additional scratch space is needed. By combining these technologies, we support large-scale, highly mobile applications on inexpensive hardware. We present an experimental study of in-place reconstruction algorithms. We take a data-driven approach to determine important performance features, classifying files distributed on the Internet based on their in-place properties, and exploring the scaling relationship between files and the data structures used by in-place algorithms. We conclude that in-place algorithms are I/O bound and that their performance is most sensitive to the size of inputs and outputs, rather than to asymptotic bounds.
Limited network capacity results in high latency and low bandwidth to web-enabled clients and prevents the timely delivery of software. We present an algorithm for modifying delta compressed files so that the compressed versions may be reconstructed without scratch space. This allows network clients with limited resources to efficiently update software by retrieving delta compressed versions over a network. Delta compression for binary files, compactly encoding a version of data with only the changed bytes from a previous version, may be used to efficiently distribute software over low-bandwidth channels, such as the Internet. Traditional methods for rebuilding these delta files require memory or storage space on the target machine for both the old and new versions of the file being reconstructed. With the advent of network computing and Internet-enabled devices, many of these network-attached target machines have limited additional scratch space. We present an algorithm for modifying a delta compressed version file so that it may rebuild the new file version in the space that the current version occupies. Differential or delta compression [5, 11], compactly encoding a new version of a file using only the changed bytes from a previous version, can be used to reduce the size of the file to be transmitted and consequently the time to perform a software update. Currently, decompressing delta encoded files requires scratch space: additional disk or memory storage used to hold a required second copy of the file. Two copies of the file must be concurrently available, as the delta file contains directives to read data from the old file version while the new file version is being materialized in another region of storage. This presents a problem. Network-attached devices often have limited memory resources and no disks and therefore are not capable of storing two file versions at the same time. Furthermore, adding storage to network-attached devices is not viable, as keeping these devices simple limits their production costs.
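To make the directive structure concrete, here is a minimal sketch of how a delta built from copy and add commands is applied with conventional scratch space (a separate output buffer). The command tuples, the apply_delta helper, and the example strings are illustrative assumptions, not the papers' actual encoding.

```python
# Toy delta format: ("copy", offset, length) reads from the old
# version; ("add", data) supplies literal new bytes.

def apply_delta(old: bytes, delta) -> bytes:
    """Conventional reconstruction into a separate output buffer;
    this separate buffer is exactly the scratch space that in-place
    reconstruction avoids."""
    out = bytearray()
    for cmd in delta:
        if cmd[0] == "copy":
            _, offset, length = cmd
            out += old[offset:offset + length]
        else:  # "add"
            out += cmd[1]
    return bytes(out)

old = b"the quick brown fox"
delta = [("copy", 0, 10), ("add", b"red"), ("copy", 15, 4)]
print(apply_delta(old, delta))  # b'the quick red fox'
```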
For backup storage, increasing compression allows users to protect more data without increasing their costs or storage footprint. Though removing duplicate regions (deduplication) and traditional compression have become widespread, further compression is attainable. We demonstrate how to efficiently add delta compression to deduplicated storage to compress similar (non-duplicate) regions. A challenge when adding delta compression is the large number of data regions to be indexed. We observed that stream-informed locality is effective for delta compression, so an index for delta compression is unnecessary, and we built the first storage system prototype to combine delta compression and deduplication with this technology. Beyond demonstrating extra compression benefits between 1.4-3.5X, we also investigate throughput and data integrity challenges that arise.
MASCOTS, 2013
Data compression and decompression utilities can be critical in increasing communication throughput, reducing communication latencies, achieving energy-efficient communication, and making effective use of available storage. This paper experimentally evaluates several such utilities for multiple compression levels on systems that represent current mobile platforms. We characterize each utility in terms of its compression ratio, compression and decompression throughput, and energy efficiency. We consider different use cases that are typical for modern mobile environments. We find a wide variety of energy costs associated with data compression and decompression and provide practical guidelines for selecting the most energy efficient configurations for each use case.
The data traffic originating on mobile computing devices has been growing exponentially over the last several years. Lossless data compression and decompression can be essential in increasing communication throughput, reducing communication latency, achieving energy-efficient communication, and making effective use of available storage. This paper experimentally evaluates several compression utilities and configurations on a modern smartphone. We characterize each utility in terms of its compression ratio, compression and decompression throughput, and energy efficiency for representative use cases. We find a wide variety of energy costs associated with data compression and decompression and provide practical guidelines for selecting the most energy efficient configurations for each use case. For data transfers over WLAN, the best configurations provide a 2.1-fold and 2.7-fold improvement in energy efficiency for compressed uploads and downloads, respectively, when compared to uncompressed data transfers. For data transfers over a mobile broadband network, the best configurations provide a 2.7-fold and 3-fold improvement in energy efficiency for compressed uploads and downloads, respectively.
Replicating data off-site is critical for disaster recovery reasons, but the current approach of transferring tapes is cumbersome and error-prone. Replicating across a wide area network (WAN) is a promising alternative, but fast network connections are expensive or impractical in many remote locations, so improved compression is needed to make WAN replication truly practical. We present a new technique for replicating backup datasets across a WAN that not only eliminates duplicate regions of files (deduplication) but also compresses similar regions of files with delta compression, which is available as a feature of EMC Data Domain systems. Our main contribution is an architecture that adds stream-informed delta compression to already existing deduplication systems and eliminates the need for new, persistent indexes. Unlike techniques based on knowing a file’s version or that use a memory cache, our approach achieves delta compression across all data replicated to a server at any time in the past. From a detailed analysis of datasets and hundreds of customers using our product, we achieve an additional 2X compression from delta compression beyond deduplication and local compression, which enables customers to replicate data that would otherwise fail to complete within their backup window.
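As a rough illustration of how deduplication and delta compression can coexist, the following toy sketch routes each incoming chunk to one of three outcomes: exact duplicate, delta against a similar base, or new chunk stored in full. The fingerprint, resemblance_key, and delta_encode helpers are invented stand-ins; the real system uses super-features and stream-informed caches rather than the persistent in-memory similarity_index shown here.

```python
import difflib
import hashlib

def fingerprint(chunk: bytes) -> str:
    # SHA-1 fingerprints identify exact duplicates, as in typical
    # deduplication systems.
    return hashlib.sha1(chunk).hexdigest()

def resemblance_key(chunk: bytes) -> int:
    # Hypothetical similarity feature standing in for the
    # "super-features" such systems compute; purely illustrative.
    return max(hash(chunk[i:i + 8])
               for i in range(0, max(len(chunk) - 7, 1), 4))

def delta_encode(base: bytes, chunk: bytes):
    # Trivial byte-level delta: copy/add opcodes against a similar base.
    ops = []
    sm = difflib.SequenceMatcher(None, base, chunk, autojunk=False)
    for tag, i1, i2, j1, j2 in sm.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif j2 > j1:  # replace or insert carries the literal new bytes
            ops.append(("add", chunk[j1:j2]))
    return ops

dedup_index = {}       # fingerprint -> stored chunk
similarity_index = {}  # resemblance key -> fingerprint of a base chunk

def store(chunk: bytes):
    fp = fingerprint(chunk)
    if fp in dedup_index:
        return ("dup", fp)          # exact duplicate: store a reference
    key = resemblance_key(chunk)
    if key in similarity_index:
        base = dedup_index[similarity_index[key]]
        return ("delta", fp, delta_encode(base, chunk))  # store the diff
    dedup_index[fp] = chunk         # first occurrence: store in full
    similarity_index[key] = fp
    return ("new", fp)
```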
IEEE Transactions on Knowledge and Data Engineering, 2003
In-place reconstruction of differenced data allows information on devices with limited storage capacity to be updated efficiently over low-bandwidth channels. Differencing encodes a version of data compactly as a set of changes from a previous version. Transmitting updates to data as a version difference saves both time and bandwidth. In-place reconstruction rebuilds the new version of the data in the storage or memory the current version occupies; no scratch space is needed for a second version. By combining these technologies, we support highly mobile applications on space-constrained hardware. We present an algorithm that modifies a differentially encoded version to be in-place reconstructible. The algorithm trades a small amount of compression to achieve this property. Our treatment includes experimental results that show our implementation to be efficient in space and time and verify that compression losses are small. Also, we give results on the computational complexity of performing this modification while minimizing lost compression.
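The core constraint behind in-place reconstructibility can be sketched as a write-before-read rule: a copy command conflicts if it reads bytes that earlier commands have already overwritten. The sketch below, using the toy copy/add format from the earlier example, demotes conflicting copies to literal adds; this is the compression being traded away. The published algorithm additionally reorders commands to keep more copies and minimize that loss, which this simplification does not attempt.

```python
def make_in_place(old: bytes, delta):
    """Demote any copy whose read range intersects the prefix that
    preceding commands have already written (write-before-read
    conflict) into a literal add."""
    out_pos, safe = 0, []
    for cmd in delta:
        if cmd[0] == "copy":
            _, offset, length = cmd
            if offset < out_pos:  # reads already-overwritten bytes
                safe.append(("add", bytes(old[offset:offset + length])))
            else:
                safe.append(cmd)
            out_pos += length
        else:
            safe.append(cmd)
            out_pos += len(cmd[1])
    return safe

def apply_in_place(buf: bytearray, delta):
    """Rebuild the new version directly in buf; no second copy needed."""
    pos = 0
    for cmd in delta:
        data = bytes(buf[cmd[1]:cmd[1] + cmd[2]]) if cmd[0] == "copy" \
            else cmd[1]
        buf[pos:pos + len(data)] = data
        pos += len(data)
    del buf[pos:]  # drop any leftover bytes of the old version

buf = bytearray(b"abcdefgh")
d = make_in_place(buf, [("copy", 4, 4), ("copy", 0, 4)])
# The second copy reads [0, 4), already overwritten, so it becomes
# ("add", b"abcd").
apply_in_place(buf, d)
print(buf)  # bytearray(b'efghabcd')
```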
Proceedings of the 15th IEEE/ACM Symposium on Embedded Systems for Real-Time Multimedia, 2017
Internet-connected mobile processors used in cellphones, tablets, and internet-of-things (IoT) devices are generating and transmitting data at an ever-increasing rate. These devices are already the most abundant types of processor parts produced and used today and are growing in ubiquity with the rapid proliferation of mobile and IoT technologies. Size and usage characteristics of these data-generating systems dictate that they will continue to be both bandwidth- and energy-constrained. The most popular mobile applications, dominating communication bandwidth utilization for the entire internet, are centered around transmission of image, video, and audio content. For such applications, where perfect data quality is not required, approximate computation has been explored to alleviate system bottlenecks by exploiting implicit noise tolerance to trade off output quality for performance and energy benefits. However, it is often communication, not computation, that dominates performance and energy requirements in mobile systems. This is coupled with the increasing tendency to offload computation to the cloud, making communication efficiency, not computation efficiency, the most critical parameter in mobile systems. Given this increasing need for communication efficiency, data compression provides one effective means of reducing communication costs. In this paper, we explore approximate compression and communication to increase energy efficiency and alleviate bandwidth limitations in communication-centric systems. We focus on application-specific approximate data compression, whereby a transmitted data stream is approximated to improve compression rate and reduce data transmission cost. Whereas conventional lossy compression follows a one-size-fits-all mentality in selecting a compression technique, we show that higher compression rates can be achieved by understanding the characteristics of the input data stream and the application in which it is used. We introduce a suite of data stream approximations that enhance the compression rates of lossless compression algorithms by gracefully and efficiently trading off output quality for increased compression rate. For different classes of images, we explain the interaction between compression rate, output quality, and complexity of approximation and establish comparisons with existing lossy compression algorithms. Our approximate compression techniques increase compression rate and reduce bandwidth utilization by up to 10× with respect to state-of-the-art lossy compression while achieving the same output quality and better end-to-end communication performance.
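The following sketch illustrates the general idea of approximating a data stream before lossless compression. It quantizes 8-bit samples so the lossless coder sees a smaller alphabet and longer runs; the step size, helper name, and synthetic signal are assumptions for illustration, not the paper's actual approximation suite.

```python
import random
import zlib

def approx_compress(samples: bytes, step: int = 8) -> bytes:
    """Quantize 8-bit samples to multiples of `step` before DEFLATE.
    Coarser quantization trades output quality for compression rate,
    standing in for the paper's application-specific approximations."""
    quantized = bytes((b // step) * step for b in samples)
    return zlib.compress(quantized, level=9)

# A smooth signal with small sensor noise: the noise is what blocks
# exact compression, and quantization discards most of it.
random.seed(0)
samples = bytes(128 + (i % 200) // 4 + random.randint(-3, 3)
                for i in range(1 << 16))
print(len(zlib.compress(samples, 9)))  # exact, larger
print(len(approx_compress(samples)))   # approximate, noticeably smaller
```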
2011
The proliferation of pictures and videos on the Internet is imposing heavy demands on mobile data networks. This demand is expected to grow rapidly, and a one-size-fits-all solution is unforeseeable. While researchers are approaching the problem from different directions, we identify a human-centric opportunity to reduce content size. Our intuition is that humans exhibit unequal interest towards different parts of content, and parts that are less important may be traded off for price/performance benefits. For instance, a picture with the Statue of Liberty against a blue sky may be partitioned into two categories: the semantically important statue, and the less important blue sky. When the need to minimize bandwidth/energy is acute, only the picture of the statue may be downloaded, along with a meta tag "background: blue sky". Once downloaded, an arbitrary "blue sky" may be suitably inserted behind the statue, reconstructing an approximation of the original picture. As long as the essence of the picture is retained from the human's perspective, such an approximation may be acceptable. This paper attempts to explore the scope and usefulness of this idea, and to develop a broader research theme that we call context-aware compression.
Proceedings of the 6th International Joint Conference on Pervasive and Embedded Computing and Communication Systems, 2016
The importance of optimizing data transfers between mobile computing devices and the cloud is increasing with an exponential growth of mobile data traffic. Lossless data compression can be essential in increasing communication throughput, reducing communication latency, achieving energy-efficient communication, and making effective use of available storage. In this paper we introduce analytical models for estimating effective throughput and energy efficiency of uncompressed data transfers and compressed data transfers that utilize common compression utilities. The proposed analytical models are experimentally verified using state-of-the-art mobile devices. These models are instrumental in developing a framework for seamless optimization of data file transfers.
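The paper's own analytical models are not reproduced here, but a sketch of the general form such estimates take may be useful: under a simple sequential model (compress, then transmit), effective throughput and energy follow directly from the compression ratio, compressor throughput, and link throughput. All numeric figures and power values below are assumed for illustration.

```python
def effective_throughput(size, comp_ratio, comp_tput, net_tput):
    """Effective throughput (bytes/s) of a compressed upload:
    uncompressed bytes delivered per second of compress-then-transmit
    time. comp_ratio is uncompressed/compressed size."""
    t_total = size / comp_tput + (size / comp_ratio) / net_tput
    return size / t_total

def transfer_energy(size, comp_ratio, comp_tput, net_tput,
                    p_cpu, p_radio):
    """Energy (joules): CPU power while compressing plus radio power
    while transmitting the compressed bytes."""
    return (p_cpu * size / comp_tput
            + p_radio * (size / comp_ratio) / net_tput)

# Compression pays off when effective throughput beats the raw link.
# 10 MB file, 3:1 ratio, 20 MB/s compressor, 1 MB/s uplink (assumed):
print(effective_throughput(10e6, 3.0, 20e6, 1e6))       # ~2.6e6 B/s
print(transfer_energy(10e6, 3.0, 20e6, 1e6, 2.0, 1.5))  # ~6.0 J
```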
Inexpensive storage and more powerful processors have resulted in a proliferation of data that needs to be reliably backed up. Network resource limitations make it increasingly difficult to back up a distributed file system on a nightly or even weekly basis. By using delta compression algorithms, which minimally encode a version of a file using only the bytes that have changed, a backup system can compress the data sent to a server. With the delta backup technique, we can achieve significant savings in network transmission time over previous techniques. Our measurements indicate that file system data may, on average, be compressed to within 10% of its original size with this method and that approximately 45% of all changed files have also been backed up in the previous week. Based on our measurements, we conclude that a small file store on the client that contains copies of previously backed-up files can be used to retain versions in order to generate delta files. To reduce the load on the backup server, we implement a modified version storage architecture, version jumping, that allows us to restore delta encoded file versions with at most two accesses to tertiary storage. This minimizes server workload and network transmission time on file restore.
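A rough sketch of the client-side delta backup idea appears below: the client retains a copy of a reference version and ships only a delta against it, so restoring a version needs at most the reference plus one delta rather than a chain of deltas. The names, the toy delta format, and the policy for when to "jump" to a new reference are simplified assumptions, not the paper's architecture.

```python
import difflib

def delta_encode(base: bytes, new: bytes):
    """Toy copy/add delta of `new` against `base`."""
    ops = []
    matcher = difflib.SequenceMatcher(None, base, new, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))
        elif j2 > j1:  # replace or insert carries literal bytes
            ops.append(("add", new[j1:j2]))
    return ops

reference_store = {}  # client-side store: path -> retained reference copy

def backup(path: str, current: bytes):
    """Version jumping, roughly: deltas are always taken against the
    same retained reference version, so a restore touches tertiary
    storage at most twice (the reference, then one delta)."""
    if path in reference_store:
        return ("delta", path,
                delta_encode(reference_store[path], current))
    reference_store[path] = current   # first backup: whole file,
    return ("full", path, current)    # which becomes the reference
```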