1998, Proceedings of the seventeenth annual ACM symposium on Principles of distributed computing - PODC '98
High latency and low bandwidth to web-enabled clients prevent the timely delivery of software. We present an algorithm for modifying delta compressed files so that the compressed versions may be reconstructed without scratch space. This allows network clients with limited resources to efficiently update software by retrieving delta compressed versions over a network.

Delta compression for binary files, compactly encoding a version of data with only the changed bytes from a previous version, may be used to efficiently distribute software over low-bandwidth channels, such as the Internet. Traditional methods for rebuilding these delta files require memory or storage space on the target machine for both the old and the new version of the file to be reconstructed. With the advent of network computing and Internet-enabled devices, many of these network-attached target machines have limited additional scratch space. We present an algorithm for modifying a delta compressed version file so that it may rebuild the new file version in the space that the current version occupies.

Differential or delta compression [5, 11], compactly encoding a new version of a file using only the changed bytes from a previous version, can be used to reduce the size of the file to be transmitted and consequently the time to perform a software update. Currently, decompressing delta encoded files requires scratch space: additional disk or memory storage used to hold a required second copy of the file. Two file versions must be concurrently available, as the delta file contains directives to read data from the old file version while the new file version is being materialized in another region of storage. This presents a problem. Network-attached devices often have limited memory resources and no disks and therefore are not capable of storing two file versions at the same time. Furthermore, adding storage to network-attached devices is not viable, as keeping these devices simple limits their production costs.
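As a rough illustration of what such a delta file contains, here is a minimal Python sketch of conventional (scratch-space) delta reconstruction. The directive format is assumed for illustration only; the paper's in-place variant, which reorders and rewrites directives so the new version can be built over the old one, is not shown.

```python
# Minimal sketch (assumed directive format, not the paper's encoding): a delta is
# a list of directives that either COPY a byte range from the old version or ADD
# literal bytes.  This is the conventional reconstruction that needs the old and
# new versions side by side, i.e. the scratch space the paper eliminates.

def apply_delta(old: bytes, delta):
    """delta: iterable of ('copy', offset, length) or ('add', literal_bytes)."""
    out = bytearray()
    for directive in delta:
        if directive[0] == 'copy':
            _, offset, length = directive
            out += old[offset:offset + length]      # read from the old version
        elif directive[0] == 'add':
            out += directive[1]                     # literal data carried in the delta
        else:
            raise ValueError(f"unknown directive {directive[0]!r}")
    return bytes(out)

old = b"the quick brown fox jumps over the lazy dog"
delta = [('copy', 0, 10), ('add', b'red'), ('copy', 15, 28)]
new = apply_delta(old, delta)   # b"the quick red fox jumps over the lazy dog"
```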
Inexpensive storage and more powerful processors have resulted in a proliferation of data that needs to be reliably backed up. Network resource limitations make it increasingly difficult to backup a distributed file system on a nightly or even weekly basis. By using delta compression algorithms, which minimally encode a version of a file using only the bytes that have changed, a backup system can compress the data sent to a server. With the delta backup technique, we can achieve significant savings in network transmission time over previous techniques. Our measurements indicate that file system data may, on average, be compressed to within 10% of its original size with this method and that approximately 45% of all changed files have also been backed up in the previous week. Based on our measurements, we conclude that a small file store on the client that contains copies of previously backed up files can be used to retain versions in order to generate delta files. To reduce the load on the backup server, we implement a modified version storage architecture, version jumping, that allows us to restore delta encoded file versions with at most two accesses to tertiary storage. This minimizes server work-load and network transmission time on file restore.
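A minimal sketch of a version-jumping style storage policy, under assumed parameters; the paper's actual client/server split and delta format are not reproduced here. Every k-th version is stored whole as a reference and the versions in between are delta-encoded against it, so restoring any version touches at most two stored objects.

```python
# Illustrative version-jumping policy (K and the storage layout are assumptions).
K = 8  # jump interval: every K-th version is a full reference copy

def store_version(version_number, data, make_delta, storage):
    """make_delta(base, new) -> delta; storage maps version numbers to records."""
    base = (version_number // K) * K
    if version_number == base:
        storage[version_number] = ('full', data)             # reference version
    else:
        # assumes the reference for this interval was stored earlier
        reference = storage[base][1]
        storage[version_number] = ('delta', base, make_delta(reference, data))

def restore_version(version_number, apply_delta, storage):
    record = storage[version_number]
    if record[0] == 'full':
        return record[1]                                      # one access
    _, base, delta = record
    return apply_delta(storage[base][1], delta)               # at most two accesses
```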
2015
Real-time compression for primary storage is quickly becoming widespread as data continues to grow exponentially, but adding compression on the data path consumes scarce CPU and memory resources on the storage system. Our work aims to mitigate this cost by introducing methods to quickly and accurately identify the data that will yield significant space savings when compressed. The first level of filtering that we employ is at the data set level (e.g., volume or file system), where we estimate the overall compressibility of the data at rest. According to the outcome, we may choose to enable or disable compression for the entire data set, or to employ a second level of finer-grained filtering. The second filtering scheme examines data being written to the storage system in an online manner and determines its compressibility. The first-level filtering runs in mere minutes while providing mathematically proven guarantees on its estimates. In addition to aiding in selecting which vo...
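A hedged sketch of the first-level filter's idea, using random block sampling plus zlib as a stand-in estimator; the paper's actual estimator and its proven confidence bounds are not reproduced.

```python
# Rough compressibility estimate for a data set: compress a random sample of
# blocks rather than the whole volume.  Block size, sample count, and the use of
# zlib level 1 are assumptions for illustration.
import os
import random
import zlib

def estimate_compression_ratio(path, block_size=64 * 1024, samples=256):
    size = os.path.getsize(path)
    if size == 0:
        return 1.0
    blocks = max(1, size // block_size)
    picks = random.sample(range(blocks), min(samples, blocks))
    raw = compressed = 0
    with open(path, 'rb') as f:
        for b in picks:
            f.seek(b * block_size)
            data = f.read(block_size)
            raw += len(data)
            compressed += len(zlib.compress(data, 1))   # cheap, fast level
    return compressed / raw   # well below 1.0 means the data set is compressible

# e.g. enable compression for the whole data set only if the estimate is low,
# otherwise fall back to finer-grained, per-write filtering.
```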
The present study introduces a new technique for storing and retrieving data without using any storage appliance/device, any of the common compression/zipping or decompression/unzipping software applications, or any database environment to store or retrieve data. The outcomes showed that the mechanism of the new technique, AMA-TECH, is able to store the entire given data in one single code and retrieve that data, as stored, from that code itself. AMA-TECH was able to manipulate all data types at once (image, text, or mixed data). What makes AMA-TECH different is that the generated code is not an index to a record in a computer database's table or the name of a compressed file. Technically, the code itself becomes the storage medium used to store and retrieve data as stored. Accordingly, data size was not measured in kilobytes, megabytes, and so on; instead, data was measured by how many nanoseconds, seconds, or minutes AMA-TECH needed to store and retrieve data in and from the code itself. To our knowledge so far, this solution has never been introduced before. However, many other issues were raised that we are still working on.
2004
We consider the utility of two key properties of network-embedded storage: programmability and network-awareness. We describe two extensive applications, whose performance and functionalities are significantly enhanced through innovative combination of the two properties. One is an incremental file-transfer system tailor-made for low-bandwidth conditions. The other is a “customizable” distributed file system that can assume very different personalities in different topological and workload environments. The applications show how both properties are necessary to exploit the full potential of network-embedded storage. We also discuss the requirements of a general infrastructure to support easy and effective access to network-embedded storage, and describe a prototype implementation of such an infrastructure.
For backup storage, increasing compression allows users to protect more data without increasing their costs or storage footprint. Though removing duplicate regions (deduplication) and traditional compression have become widespread, further compression is attainable. We demonstrate how to efficiently add delta compression to deduplicated storage to compress similar (non-duplicate) regions. A challenge when adding delta compression is the large number of data regions to be indexed. We observed that stream-informed locality is effective for delta compression, so an index for delta compression is unnecessary, and we built the first storage system prototype to combine delta compression and deduplication with this technology. Beyond demonstrating extra compression benefits between 1.4X and 3.5X, we also investigate throughput and data integrity challenges that arise.
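The stream-informed idea might be sketched as follows. The sketch function, cache size, and pluggable delta encoder are assumptions for illustration, not the Data Domain implementation: similar chunks tend to arrive near their bases in the backup stream, so a bounded cache keyed by a similarity sketch can stand in for a full on-disk index.

```python
# Simplified stream-informed delta compression: non-duplicate chunks get a small
# similarity sketch; a bounded LRU cache of recently seen chunks replaces a
# persistent similarity index.
import hashlib
from collections import OrderedDict

def sketch(chunk: bytes, shingle=32) -> int:
    # one min-hash feature over overlapping shingles (a real system combines
    # several features into super-features)
    return min(
        int.from_bytes(hashlib.blake2b(chunk[i:i + shingle], digest_size=8).digest(), 'big')
        for i in range(0, max(1, len(chunk) - shingle + 1))
    )

class StreamInformedDelta:
    def __init__(self, delta_encode, cache_size=4096):
        self.delta_encode = delta_encode    # pluggable delta encoder
        self.cache = OrderedDict()          # sketch -> base chunk, LRU-bounded
        self.cache_size = cache_size

    def compress_chunk(self, chunk: bytes):
        s = sketch(chunk)
        base = self.cache.get(s)
        self.cache[s] = chunk
        self.cache.move_to_end(s)
        if len(self.cache) > self.cache_size:
            self.cache.popitem(last=False)  # drop the least recently used base
        if base is not None and base != chunk:
            return ('delta', s, self.delta_encode(base, chunk))
        return ('literal', chunk)
```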
2008
To a storage systems researcher, all user bytes are created opaque and equal. Whether they encode a timeless wedding photograph, a recording of a song downloaded from the WWW, or tax returns of yesteryear is not germane to the already complex problem of their storage. But according to Cathy Marshall, this is only one stripe of the storage beast. There is a bigger problem that we will have to confront eventually: The exponentially growing size of our personal digital estate will soon—if it has not done so already—surpass our management abilities.
Operating Systems Review, 1998
This paper describes the Network-Attached Secure Disk (NASD) storage architecture, prototype implementations of NASD drives, array management for our architecture, and three filesystems built on our prototype. NASD provides scalable storage bandwidth without the cost of servers used primarily for transferring data from peripheral networks (e.g. SCSI) to client networks (e.g. ethernet). Increasing dataset sizes, new attachment technologies, the convergence of peripheral and interprocessor switched networks, and the increased availability of on-drive transistors motivate and enable this new architecture. NASD is based on four main principles: direct transfer to clients, secure interfaces via cryptographic support, asynchronous non-critical-path oversight, and variably-sized data objects. Measurements of our prototype system show that these services can be cost-effectively integrated into a next generation disk drive ASIC. End-to-end measurements of our prototype drive and filesystems suggest that NASD can support conventional distributed filesystems without performance degradation. More importantly, we show scalable bandwidth for NASD-specialized filesystems. Using a parallel data mining application, NASD drives deliver a linear scaling of 6.2 MB/s per client-drive pair, tested with up to eight pairs in our lab. Keywords D.4.3 File systems management, D.4.7 Distributed systems, B.4 Input/Output and Data Communications.
Computers and Electronics in Agriculture, 2014
We consider the bandwidth bottleneck problem that arises in ISO 11783 networks of mobile farm equipment when large file transfers are performed. To overcome this problem, a compression protocol called ISOBUSComp is proposed, allowing the implementation of dynamic ("on the fly") data compression services for general ISO 11783 file transfers. As a result, transmitting Electronic Control Units are free to choose any data compression technique they wish, and receiving Electronic Control Units need not be aware of such decisions, but just be able to process a suitable Universal Decompression Virtual Machine. Comprehensive simulation studies show that dynamic data compression services built upon the proposed protocol help to reduce bus utilization of ISO 11783 networks by between 28% and 63%, thus speeding up the time for large file transfers.
International Journal of Recent Technology and Engineering (IJRTE), 2019
In the digital world today, data is growing tremendously, and the onus lies on the network to compute, process, transfer and store this data. There is a directly proportional relationship between the size of the data and the efficiency of a given system. The major challenge that systems face today is the size of data; the goal of these systems is therefore to compress data as much as possible so that storage space and processing time are reduced, making the system more effective. Since data compression (DC) leads to effective use of the available storage space and transfer bandwidth, various methodologies have been developed from several angles. A detailed survey of many existing DC techniques is presented to address present requirements with respect to data quality, coding schemes and applications. In order to analyze how DC techniques and their applications have evolved, a comparative study is performed to identify the contribution of the reviewed techniques in terms of their strengths, fundamental ideas,...
Journal of the Chinese Institute of Engineers, 2012
Bell Labs Technical Journal, 2007
Current cell phones are capable of accessing the Internet, downloading music and games, taking pictures, and recording videos. These activities can easily result in a huge amount of multimedia content requiring tens or hundreds of gigabytes of storage, which is several orders of magnitude more than the storage capacity of today's cell phones. In this paper, we propose a novel remote storage solution to increase the amount of storage accessible to a mobile user. Our solution consists of two components: a client program running on the cell phone and a remote server to manage remote storage. The novelty of our solution results from the client program that makes the accesses to the remote storage completely transparent and seamless to the user. Thus, the user's perception of the amount of storage available on the cell phone is considerably higher than the actual capacity of the phone. In addition, the client program implements sophisticated caching algorithms and wireless optimizations to ensure that the user-perceived performance is not significantly impacted. © 2007 Alcatel-Lucent.
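A toy sketch of the client-side idea: a bounded local cache backed by the remote store, so the user sees one large, seamless storage space. The caching policy (plain LRU with write-through) and the interfaces are assumptions, not Alcatel-Lucent's implementation.

```python
# Transparent remote storage client: serve reads from a bounded local cache and
# fetch misses over the wireless link.  remote_fetch/remote_store are assumed
# callables standing in for the remote server protocol.
from collections import OrderedDict

class TransparentRemoteFS:
    def __init__(self, remote_fetch, remote_store, local_capacity_bytes):
        self.fetch, self.store = remote_fetch, remote_store
        self.capacity = local_capacity_bytes
        self.cache = OrderedDict()   # path -> bytes, most recently used last
        self.used = 0

    def read(self, path: str) -> bytes:
        if path in self.cache:
            self.cache.move_to_end(path)   # cache hit: serve locally
            return self.cache[path]
        data = self.fetch(path)            # cache miss: go over the wireless link
        self._insert(path, data)
        return data

    def write(self, path: str, data: bytes):
        self.store(path, data)             # write-through keeps the server current
        self._insert(path, data)

    def _insert(self, path, data):
        if path in self.cache:
            self.used -= len(self.cache.pop(path))
        self.cache[path] = data
        self.used += len(data)
        while self.used > self.capacity and self.cache:
            _, evicted = self.cache.popitem(last=False)   # evict least recently used
            self.used -= len(evicted)
```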
Due to the booming growth of Information and Communication Technology (ICT), a vast amount of data is produced at a considerably high rate; it drives the traditional methods of storing data to their limits and most of the time simply overwhelms current storage systems. Because of that, throughout the history of ICT the effort to find an efficient and feasible data storage system with substantial capacity to cater to current data storage needs has been relentless. Currently available shared storage devices are mostly file servers and peer-to-peer systems organized in various architectures, but certain areas pose problems in implementing such systems at small scale and also at the enterprise level. The Networked Shared Storage System (NSS) is introduced as a system motivated by that historical desire to achieve the ultimate reliable and secure storage media, and it represents the way ahead in discovering the ultimate solution for this long-lasting problem. NSS is a Local Area Network (LAN) based, secure and reliable distributed storage system. Its primary objective is to use the free local hard disk space available in the workstations connected to a LAN as its storage media. This is achieved by the radical but fail-proof method of breaking down a single file into a set of data chunks and distributing them throughout the LAN. These chunks are then remerged to reproduce the original file at the users' request. Through this method, the largely unutilized free disk space of nodes connected to the LAN is used to create a free disk space pool that serves the storage needs of the users of that same network, rather than incorporating separate data servers. A LAN-based storage system is invariably challenged by the inherent unavailability of the nodes of a LAN, but NSS overcomes this problem via a robust and efficient data replication algorithm that makes replicas of the data chunks when storing them, thus providing a high degree of availability and reliability for the stored data. Peer-to-peer communication is used when distributing the chunked data throughout the network via the embedded FTP servers. This architecture minimizes security issues and protects the privacy of data, which is greatly challenged in a LAN-based environment. NSS is highly scalable and applicable to both a medium-scale LAN and its enterprise-level equivalent, with no additional modification to the architecture and with less cost and effort than most existing solutions (cloud servers and server farms), thus making it the way ahead in achieving the ultimate storage media.
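An illustrative sketch of the chunk/replicate/remerge cycle described above. Chunk size, replica count, node selection, and the transport (the embedded FTP servers) are all assumptions abstracted behind callables, not the NSS implementation.

```python
# Split a file into chunks, place each chunk on several LAN nodes, and reassemble
# from whichever replicas are reachable.
import hashlib

CHUNK_SIZE = 1 << 20   # 1 MiB chunks (assumed)
REPLICAS = 3           # replicas per chunk (assumed)

def split_into_chunks(data: bytes):
    return [data[i:i + CHUNK_SIZE] for i in range(0, len(data), CHUNK_SIZE)]

def place_chunks(chunks, nodes):
    """Return a manifest: for each chunk, its digest and the nodes holding replicas."""
    manifest = []
    for idx, chunk in enumerate(chunks):
        digest = hashlib.sha256(chunk).hexdigest()
        start = idx % len(nodes)
        holders = [nodes[(start + r) % len(nodes)] for r in range(min(REPLICAS, len(nodes)))]
        manifest.append({'index': idx, 'digest': digest, 'nodes': holders})
    return manifest

def reassemble(manifest, fetch_chunk):
    """fetch_chunk(node, digest) -> bytes, or None if that node is offline."""
    parts = []
    for entry in sorted(manifest, key=lambda e: e['index']):
        chunk = None
        for node in entry['nodes']:
            chunk = fetch_chunk(node, entry['digest'])
            if chunk is not None:
                break
        if chunk is None or hashlib.sha256(chunk).hexdigest() != entry['digest']:
            raise IOError(f"chunk {entry['index']} unavailable or corrupted")
        parts.append(chunk)
    return b''.join(parts)
```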
In this paper, we show that coding can be used in storage area networks (SANs) to improve various quality of service metrics under normal SAN operating conditions, without requiring additional storage space. For our analysis, we develop a model which captures modern characteristics such as constrained I/O access bandwidth. Using this model, we consider two important cases: single-resolution (SR) and multi-resolution (MR) systems. For SR systems, we use blocking probability as the quality of service metric and propose the network coded storage (NCS) scheme as a way to reduce blocking probability. The NCS scheme codes across file chunks in time, exploiting file striping and file duplication. Under our assumptions, we illustrate cases where SR NCS provides an order of magnitude savings in blocking probability. For MR systems, we introduce saturation probability as a quality of service metric to manage multiple user types, and we propose the uncoded resolution-aware storage (URS) and coded resolution-aware storage (CRS) schemes as ways to reduce saturation probability. In MR URS, we align our MR layout strategy with traffic requirements. In MR CRS, we code videos across MR layers. Under our assumptions, we illustrate that URS can in some cases provide an order of magnitude gain in saturation probability over classic non-resolution-aware systems. Further, we illustrate that CRS provides additional saturation probability savings over URS.
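To make the intuition behind coded storage concrete, here is a minimal XOR-parity example; it is not the paper's NCS construction. Two stripes plus their parity are placed on three servers, so any two reachable (non-busy) servers can serve the whole object, which is what lowers blocking probability relative to plain striping.

```python
# Minimal coded-storage illustration (toy code, assumed layout).
def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def encode(file_data: bytes):
    half = (len(file_data) + 1) // 2
    a = file_data[:half]
    b = file_data[half:].ljust(half, b'\x00')      # pad so stripes match in length
    return {'server0': a, 'server1': b, 'server2': xor_bytes(a, b)}

def decode(available: dict, original_length: int) -> bytes:
    a, b, p = (available.get(k) for k in ('server0', 'server1', 'server2'))
    if a is None:
        a = xor_bytes(b, p)       # recover stripe A from B and parity
    elif b is None:
        b = xor_bytes(a, p)       # recover stripe B from A and parity
    return (a + b)[:original_length]

data = b"multi-resolution video object"
placed = encode(data)
# any one server may be busy or unreachable; the object is still readable:
subset = {k: v for k, v in placed.items() if k != 'server1'}
assert decode(subset, len(data)) == data
```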
… Research TR-2006 …, 2006
Remote Differential Compression (RDC) protocols can efficiently update files over a limited-bandwidth network when two sites have roughly similar files; no site needs to know the content of another's files a priori. We present a heuristic approach to identify and transfer the file differences that is based on finding similar files, subdividing the files into chunks, and comparing chunk signatures. Our work significantly improves upon previous protocols such as LBFS and RSYNC in three ways. Firstly, we present a novel algorithm to efficiently find the client files that are the most similar to a given server file. Our algorithm requires 96 bits of metadata per file, independent of file size, and thus allows us to keep the metadata in memory and eliminate the need for expensive disk seeks. Secondly, we show that RDC can be applied recursively to signatures to reduce the transfer cost for large files. Thirdly, we describe new ways to subdivide files into chunks that identify file differences more accurately. We have implemented our approach in DFSR, a state-based multimaster file replication service shipping as part of Windows Server 2003 R2. Our experimental results show that similarity detection produces results comparable to LBFS while incurring a much smaller overhead for maintaining the metadata. Recursive signature transfer further increases replication efficiency by up to several orders of magnitude.
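A hedged sketch in the spirit of the similarity detection described above; the exact trait construction used in DFSR is not reproduced. Each file is summarized by a few one-byte "traits" derived from min-hashes of its chunk signatures, so 12 traits keep per-file metadata at 96 bits, and the client files sharing the most traits with a server file are treated as the best candidates for differential transfer.

```python
# Compact per-file similarity traits (illustrative construction).
import hashlib

NUM_TRAITS = 12  # 12 x 8 bits = 96 bits of metadata per file

def chunk_signatures(data: bytes, avg_chunk=4096):
    # placeholder fixed-size chunking; RDC uses content-defined chunk boundaries
    return [hashlib.sha1(data[i:i + avg_chunk]).digest()
            for i in range(0, max(1, len(data)), avg_chunk)]

def traits(data: bytes) -> bytes:
    sigs = chunk_signatures(data)
    out = []
    for salt in range(NUM_TRAITS):
        salted_min = min(hashlib.sha1(bytes([salt]) + s).digest() for s in sigs)
        out.append(salted_min[0])          # keep one byte per trait
    return bytes(out)

def most_similar(server_traits: bytes, client_files: dict):
    """client_files: name -> 12-byte traits; rank by number of matching traits."""
    def score(name):
        return sum(a == b for a, b in zip(server_traits, client_files[name]))
    return max(client_files, key=score)
```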
Replicating data off-site is critical for disaster recovery reasons, but the current approach of transferring tapes is cumbersome and error-prone. Replicating across a wide area network (WAN) is a promising alternative, but fast network connections are expensive or impractical in many remote locations, so improved compression is needed to make WAN replication truly practical. We present a new technique for replicating backup datasets across a WAN that not only eliminates duplicate regions of files (deduplication) but also compresses similar regions of files with delta compression, which is available as a feature of EMC Data Domain systems. Our main contribution is an architecture that adds stream-informed delta compression to already existing deduplication systems and eliminates the need for new, persistent indexes. Unlike techniques based on knowing a file’s version or that use a memory cache, our approach achieves delta compression across all data replicated to a server at any time in the past. From a detailed analysis of datasets and hundreds of customers using our product, we achieve an additional 2X compression from delta compression beyond deduplication and local compression, which enables customers to replicate data that would otherwise fail to complete within their backup window.
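A very small delta encoder sketch, complementing the similarity detection above; it is not the Data Domain encoder. The similar base region is indexed by fixed-size anchors, and the new region is walked once, emitting COPY directives where an anchor match extends and literal INSERTs elsewhere.

```python
# Greedy anchor-based delta encoding of a target region against a similar base
# region (anchor size is an assumed parameter).
ANCHOR = 16  # bytes per anchor

def delta_encode(base: bytes, target: bytes):
    index = {base[i:i + ANCHOR]: i for i in range(0, len(base) - ANCHOR + 1)}
    out, literal, pos = [], bytearray(), 0
    while pos < len(target):
        window = target[pos:pos + ANCHOR]
        base_off = index.get(window)
        if base_off is not None:
            # extend the match forward as far as it goes
            length = ANCHOR
            while (pos + length < len(target) and base_off + length < len(base)
                   and target[pos + length] == base[base_off + length]):
                length += 1
            if literal:
                out.append(('insert', bytes(literal)))
                literal.clear()
            out.append(('copy', base_off, length))
            pos += length
        else:
            literal.append(target[pos])
            pos += 1
    if literal:
        out.append(('insert', bytes(literal)))
    return out
```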
Proc. Int. Web Caching Workshop, 1999
This paper contemplates two important information-provisioning paradigms, caching and replication, and proposes a new approach to network storage in order to meet the increasing demands for large files. The key idea of our system is to have two tailor-made servers, for large and small objects respectively. A modified proxy cache is responsible for storing and delivering small objects. For large objects, we introduce a replication mechanism. By replicating large objects in a dedicated server and relaying them through a proxy cache, storage can be fully utilized. Large objects can be accessed without a priori knowledge of the server location. Our system provides a scalable solution to the continuing growth of traffic volume while it retains transparency for users. It is also designed to ensure compatibility with existing networking standards, applications and system software.
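A thumbnail sketch of the dispatch rule implied above; the size threshold and the fetch interfaces are assumptions, not the paper's protocol.

```python
# Small objects are cached and served by the proxy; large objects are relayed
# from a dedicated replica server so they do not churn the proxy cache.
LARGE_OBJECT_THRESHOLD = 1 << 20   # 1 MiB, assumed cut-off

class SplitProxy:
    def __init__(self, cache_fetch, origin_fetch, replica_fetch):
        self.cache_fetch = cache_fetch        # small-object proxy cache
        self.origin_fetch = origin_fetch      # origin web server
        self.replica_fetch = replica_fetch    # dedicated large-object replica server

    def get(self, url: str, size_hint: int) -> bytes:
        if size_hint >= LARGE_OBJECT_THRESHOLD:
            # relay from the replica server; the client needs no knowledge of its location
            return self.replica_fetch(url)
        cached = self.cache_fetch(url)
        return cached if cached is not None else self.origin_fetch(url)
```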
2004
This paper presents a distributed mobile storage system designed for storage elements connected by a network of non-uniform quality. Flexible data placement is crucial, and it leads to challenges for locating data and keeping it consistent. Our system employs a location- and topology-sensitive multicast-like solution for locating data, lazy peer-to-peer propagation of invalidation information for ensuring consistency, and a distributed snapshot mechanism for supporting sharing. The combination of these mechanisms allows a user to make the most of what a non-uniform network has to offer in terms of gaining fast access to fresh data, without incurring the foreground penalty of keeping distributed elements on a weak network consistent.
IAEME PUBLICATION, 2015
Cloud computing promises to increase the velocity with which applications are deployed. Data compression in cloud computing deals with reducing the storage space and providing privacy for users. Each authorized user is able to get an individual token for their file from the duplicate check based on their privileges. An authorized user is able to use his/her individual private keys to generate queries, and hence attributes are attached along with the file. Attributes are found in the private cloud, and hence control immediately passes to the private cloud, where the duplicate check is performed. Data stored in the public cloud is accessed only by authorized users by providing different encryption privilege keys. Convergent and symmetric encryption techniques produce identical ciphertext, which results in minimal overhead. Proof of reliability assures a verifier, via a proof, that a user's file is available.
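A toy illustration of the convergent-encryption property that the scheme above relies on: the key is derived from the content itself, so identical files encrypt to identical ciphertext and the cloud can run its duplicate check without seeing plaintext. The keystream below is NOT a secure cipher and is not the paper's construction; a real system would use a standard cipher such as AES under the convergent key.

```python
# Convergent encryption property: K = H(M), so equal plaintexts yield equal
# ciphertexts (illustrative keystream only, not cryptographically secure).
import hashlib

def convergent_key(plaintext: bytes) -> bytes:
    return hashlib.sha256(plaintext).digest()          # K = H(M)

def keystream(key: bytes, length: int) -> bytes:
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, 'big')).digest()
        counter += 1
    return bytes(out[:length])

def convergent_encrypt(plaintext: bytes):
    key = convergent_key(plaintext)
    cipher = bytes(p ^ k for p, k in zip(plaintext, keystream(key, len(plaintext))))
    return key, cipher

# two users with the same file produce byte-identical ciphertext,
# so the server-side duplicate check works on encrypted data
k1, c1 = convergent_encrypt(b"quarterly report")
k2, c2 = convergent_encrypt(b"quarterly report")
assert c1 == c2 and k1 == k2
```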
MASCOTS, 2013
Data compression and decompression utilities can be critical in increasing communication throughput, reducing communication latencies, achieving energy-efficient communication, and making effective use of available storage. This paper experimentally evaluates several such utilities for multiple compression levels on systems that represent current mobile platforms. We characterize each utility in terms of its compression ratio, compression and decompression throughput, and energy efficiency. We consider different use cases that are typical for modern mobile environments. We find a wide variety of energy costs associated with data compression and decompression and provide practical guidelines for selecting the most energy efficient configurations for each use case.
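A small harness in the spirit of that characterization, limited to compression ratio and throughput (energy is not measured here; the paper pairs such numbers with power measurements). The chosen utilities, levels, and the sample file path are examples, not the paper's exact configuration.

```python
# Characterize a few standard-library codecs by ratio and compression throughput.
import bz2
import lzma
import time
import zlib

CODECS = {
    'zlib-1': lambda d: zlib.compress(d, 1),
    'zlib-9': lambda d: zlib.compress(d, 9),
    'bz2-9':  lambda d: bz2.compress(d, 9),
    'lzma-6': lambda d: lzma.compress(d, preset=6),
}

def characterize(data: bytes):
    rows = []
    for name, compress in CODECS.items():
        start = time.perf_counter()
        out = compress(data)
        elapsed = time.perf_counter() - start
        rows.append((name, len(data) / len(out), len(data) / elapsed / 1e6))
    return rows   # (codec, compression ratio, throughput in MB/s)

sample = open('/usr/share/dict/words', 'rb').read()   # any representative file
for name, ratio, mbps in characterize(sample):
    print(f"{name:8s}  ratio {ratio:5.2f}  throughput {mbps:7.1f} MB/s")
```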
Cloud computing is a rapidly growing computation paradigm that combines all Information Technology (IT) capabilities. As it has become standard for cloud providers to operate numerous data centers around the world, significant demand exists for inter-data-center data transfers in large volumes, e.g., migration of big data. A challenge arises in how to schedule these bulk data transfers at different urgency levels, in order to fully utilize the available inter-data-center bandwidth. We model a bulk data transfer system based on the OpenFlow framework. The Bulk Data Transfer (BDT) system lives above the transport layer in the network stack, with no special requirements on the lower transport-layer and network-layer protocols, where the standard TCP/IP stack is adopted. It applies the time-expanded graph technique, which was later adopted by Postcard, whose computational overhead prevents its scheduling from running as frequently as ours. We propose a reliable and efficient bulk data transfer service in an inter-data-center network, including optimal routing for distinct chunks over time, which can be temporarily stored at intermediate data centers and forwarded at carefully computed times. Our objective is to optimize the data transfer by using efficient Snappy compression/decompression techniques on the client side. Snappy has a high compression ratio and high input/output performance. Compression is performed in terms of block size, tag, offset, length, index, and encoding. It reduces transfer time, and it is a lossless algorithm. Hence we propose an improvement in data movement by using Snappy together with BDT. Keywords: Snappy data compression/decompression; optimizing data movement; Bulk Data Transfer (BDT)
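A minimal sketch of the client-side step proposed above: compress each block with Snappy before handing it to the bulk-transfer service and decompress on arrival. It assumes the python-snappy package; the block size and the transmit/receive interfaces are placeholders, and the BDT scheduling itself is not modeled.

```python
# Snappy-compressed block transfer on the client side.
import snappy   # python-snappy

BLOCK_SIZE = 4 * 1024 * 1024   # 4 MiB blocks (assumed)

def send_file(path, transmit):
    """transmit(block_index, payload) hands one compressed block to the BDT layer."""
    with open(path, 'rb') as f:
        index = 0
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            transmit(index, snappy.compress(block))   # lossless and fast
            index += 1

def receive_blocks(blocks, out_path):
    """blocks: iterable of (block_index, compressed_payload), any arrival order."""
    with open(out_path, 'wb') as out:
        for _, payload in sorted(blocks):
            out.write(snappy.decompress(payload))
```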