2004
Parallel disks provide a cost-effective way of speeding up I/Os in applications that work with large amounts of data. The main challenge is to achieve as much parallelism as possible, using prefetching to avoid bottlenecks in disk access. Efficient algorithms have been developed for some particular patterns of accessing the disk blocks. In this paper, we consider general request sequences. When the request sequence consists of unique block requests, the problem is called prefetching and is a well-solved problem for arbitrary request sequences. When the reference sequence can have repeated references to the same block, we need to devise an effective caching policy as well. While optimum offline algorithms have recently been designed for the problem, in the online case, no effective algorithm was previously known. Our main contribution is a deterministic online algorithm, threshold-LRU, which achieves an O((MD/L)^{2/3}) competitive ratio, and a randomized online algorithm, threshold-MARK, which achieves an O(√((MD/L) log(MD/L))) competitive ratio for the caching/prefetching problem on the parallel disk model (PDM), where D is the number of disks, M is the size of the fast memory buffer, and M + L is the amount of lookahead available in the request sequence. The best-known lower bound on the competitive ratio is Ω(√(MD/L)) for lookahead L ≥ M in both models. We also show that if the deterministic online algorithm is allowed twice the memory of the offline algorithm, then a tight competitive ratio of Θ(√(MD/L)) can be achieved. This problem generalizes the well-known paging problem on a single disk to the parallel disk model.
We consider the natural extension of the well-known single-disk caching problem to the parallel disk I/O model (PDM) [17]. The main challenge is to achieve as much parallelism as possible and avoid I/O bottlenecks. We are given a fast memory (cache) of size M memory blocks along with a request sequence Σ = (b_1, b_2, ..., b_n), where each block b_i resides on one of D disks. In each parallel I/O step, at most one block from each disk can be fetched. The task is to serve Σ in the minimum number of parallel I/Os. Thus, each I/O is analogous to a page fault. The difference here is that during each page fault, up to D blocks can be brought into memory, as long as all of the new blocks entering the memory reside on different disks. The problem has a long history [18, 12, 13, 26]. Note that this problem is non-trivial even if all requests in Σ are unique. This restricted version is called read-once. Despite the progress in the offline version [13, 15] and the read-once version [12], the general online problem still remained open. Here, we provide comprehensive results with a full general solution for the problem with asymptotically tight competitive ratios. To exploit parallelism, any parallel disk algorithm needs a certain amount of lookahead into future requests. To provide effective caching, an online algorithm must achieve an o(D) competitive ratio. We show a lower bound stating that, for lookahead L ≤ M, any online algorithm must be Ω(D)-competitive. For lookahead L greater than M(1 + 1/ε), where ε is a constant, the tight upper bound of O(√(MD/L)) on the competitive ratio is achieved by our algorithm SKEW. The previous algorithm, tLRU [26], was O((MD/L)^{2/3})-competitive, and this was also shown to be tight [26] for an LRU-based strategy. We achieve the tight ratio using a fairly different strategy than LRU. We also show tight results for randomized algorithms against an oblivious adversary and give an algorithm achieving better bounds in the resource augmentation model.
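As an illustration of the cost model, here is a minimal sketch (assumptions ours: a naive greedy demand-fetch policy with LRU eviction, not the SKEW algorithm of the paper) that counts parallel I/O steps for a request sequence, fetching at most one missing block per disk in each step:

```python
from collections import OrderedDict

def parallel_ios(requests, disk_of, M):
    """Serve `requests` in order; one parallel I/O step can fetch at
    most one missing block from each disk. Returns the step count."""
    cache = OrderedDict()              # block -> None, kept in LRU order
    steps = 0
    i = 0
    while i < len(requests):
        batch_disks = set()            # disks already used by this step
        j = i
        while j < len(requests):
            b = requests[j]
            if b in cache:
                cache.move_to_end(b)   # hit: refresh LRU position
            elif disk_of[b] not in batch_disks:
                batch_disks.add(disk_of[b])    # miss: fetch in this step
                if len(cache) >= M:
                    cache.popitem(last=False)  # evict least recently used
                cache[b] = None
            else:
                break                  # same disk needed twice: new step
            j += 1
        if batch_disks:
            steps += 1                 # the whole batch cost one I/O step
        i = j
    return steps

disk_of = {"a": 0, "b": 1, "c": 0}
print(parallel_ios(["a", "b", "c", "a"], disk_of, M=2))   # 3 steps
```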
Proceedings. Fifth International Conference on High Performance Computing (Cat. No. 98EX238)
We address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. Read-once reference strings, in which each block is accessed exactly once, arise naturally in applications like databases and video retrieval. Using the standard parallel disk model with D disks and a shared I/O buffer of size M, we present a novel algorithm, Red-Black Prefetching (RBP), for parallel I/O scheduling. The number of parallel I/Os performed by RBP is within a factor of max{√(MD/L), D^{1/3}} of the minimum possible, where L is the lookahead available to the algorithm. Algorithm RBP is easy to implement and requires computation time linear in the length of the reference string. Through simulation experiments we validated the benefits of RBP over simple greedy prefetching.
DIMACS Series in Discrete Mathematics and Theoretical Computer Science, 1999
In this work we address the problems of prefetching and I/O scheduling for read-once reference strings in a parallel I/O system. We use the standard parallel disk model with D disks and a shared I/O buffer of size M. We design an on-line algorithm, ASP (Adaptive Segmented Prefetching), with M + L-block lookahead, L ≥ 1, and compare its performance to the best on-line algorithm with the same lookahead. We show that for any reference string the number of I/Os done by ASP is within a factor C, C = min{√L, D^{1/3}}, of the number of I/Os done by the optimal algorithm with the same amount of lookahead.
IEEE Transactions on Computers, 2002
We address the problem of prefetching and caching in a parallel I/O system and present a new algorithm, PC-OPT, for parallel disk scheduling. Traditional buffer management algorithms that minimize the number of block misses are substantially suboptimal in a parallel I/O system where multiple I/Os can proceed simultaneously. We show that in the offline case, where a priori knowledge of all the requests is available, PC-OPT performs the minimum number of I/Os to service the given I/O requests. This is the first parallel I/O scheduling algorithm that is provably offline optimal in the parallel disk model. In the online case, we study the context of global L-block lookahead, which gives the buffer management algorithm a lookahead consisting of L distinct requests. We show that the competitive ratio of PC-OPT, with global L-block lookahead, is Θ(M − L + D) when L ≤ M, and Θ(MD/L) when L > M, where the number of disks is D and the buffer size is M.
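To make the two lookahead regimes concrete, a toy evaluation of the stated bounds (constants inside the Θ are ignored, and the parameter values are purely hypothetical):

```python
# Illustrative evaluation of PC-OPT's two competitive-ratio regimes.
def pc_opt_ratio(M, D, L):
    if L <= M:
        return M - L + D     # Theta(M - L + D): lookahead within buffer
    return M * D / L         # Theta(MD/L): lookahead beyond buffer size

print(pc_opt_ratio(M=1000, D=10, L=500))    # 510: small lookahead hurts
print(pc_opt_ratio(M=1000, D=10, L=5000))   # 2.0: large lookahead helps
```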
Journal of Algorithms, 2000
We provide a competitive analysis framework for online prefetching and buffer management algorithms in parallel I/O systems, using a read-once model of block references. This has widespread applicability to key I/O-bound applications such as external merging and concurrent playback of multiple video streams. Two realistic lookahead models, global lookahead and local lookahead, are defined. Algorithms NOM and GREED, based on these two forms of lookahead, are analyzed for shared-buffer and distributed-buffer configurations, both of which occur frequently in existing systems. An important aspect of our work is that we show how to implement both models of lookahead in practice using the simple techniques of forecasting and flushing. Given a D-disk parallel I/O system and a globally shared I/O buffer that can hold up to M disk blocks, we derive a lower bound of Ω(√D) on the competitive ratio of any deterministic online prefetching algorithm with O(M) lookahead. NOM is shown to match the lower bound using global M-block lookahead. In contrast, using only local lookahead results in an Ω(D) competitive ratio. When the buffer is distributed into …
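A small sketch of how the two lookahead notions differ (the encoding is ours, not the paper's): global lookahead reveals the next L distinct blocks of the reference string, while local lookahead reveals, for each disk, only a few upcoming blocks residing on that disk:

```python
# Global lookahead: the next L distinct blocks of the reference string.
def global_lookahead(reference, pos, L):
    seen, window = set(), []
    for b in reference[pos:]:
        if b not in seen:
            seen.add(b)
            window.append(b)
            if len(window) == L:
                break
    return window

# Local lookahead: per disk, only that disk's next few pending blocks.
def local_lookahead(reference, pos, disk_of, depth):
    per_disk = {}
    for b in reference[pos:]:
        d = disk_of[b]
        per_disk.setdefault(d, [])
        if len(per_disk[d]) < depth:
            per_disk[d].append(b)
    return per_disk
```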
2001
Prefetching and caching are widely used for improving the performance of file systems. Recent studies have shown that it is important to integrate the two. In this model we consider the following problem. Suppose that a program makes a sequence of m accesses to data blocks. The cache can hold k blocks, where k < m. An access to a block in the cache incurs one time unit, and fetching a missing block incurs d time units. A fetch of a new block can be initiated while a previous fetch is in progress, so up to d block fetches can be in progress simultaneously. The locality of references to the cache is captured by the access graph model of [2]. The goal is to find a policy for prefetching and caching which minimizes the overall execution time of a given reference sequence. This problem is called caching with locality and pipelined prefetching (CLPP). Our study is motivated by the pipelined operation of modern memory controllers and program execution on fast processors. For the offline case we show that an algorithm introduced in [4] is optimal. In the online case we give an algorithm which is within a factor of 2 of the optimal in the set of online deterministic algorithms, for any access graph and any k, d ≥ 1. Improved ratios are obtained for several important classes of access graphs, including complete graphs and directed acyclic graphs (DAGs). Finally, we study the CLPP problem under a Markovian access model, on branch trees, which often arise in applications. We give algorithms whose expected performance ratios are within a factor of 2 of the optimal.
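The cost model lends itself to a tiny worked example. The sketch below (ours, not the algorithm of [4]) assumes a read-once sequence and a cache large enough to never evict, and shows how pipelining hides most of the fetch latency d compared to serial fetch-then-consume:

```python
# Toy illustration of pipelined fetching in the CLPP cost model: one
# fetch may be initiated per time unit, each fetch takes d units, and
# consuming a resident block takes 1 unit.
def pipelined_time(num_blocks, d):
    t = 0                           # current time
    for i in range(num_blocks):
        ready = i + d               # fetch of block i issued at time i
        t = max(t + 1, ready + 1)   # consume block once it is resident
    return t

print(pipelined_time(10, d=4))   # 14: pipelining hides most latency
print(10 * (1 + 4))              # 50: fully serial fetch-then-consume
```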
Proceedings of the 2021 ACM-SIAM Symposium on Discrete Algorithms (SODA)
In the parallel paging problem, there are p processors that share a cache of size k. The goal is to partition the cache among the processors over time in order to minimize their average completion time. For this long-standing open problem, we give tight upper and lower bounds of Θ(log p) on the competitive ratio with O(1) resource augmentation. A key idea in both our algorithms and lower bounds is to relate the problem of parallel paging to the seemingly unrelated problem of green paging. In green paging, there is an energy-optimized processor that can temporarily turn off one or more of its cache banks (thereby reducing power consumption), so that the cache size varies between a maximum size k and a minimum size k/p. The goal is to minimize the total energy consumed by the computation, which is proportional to the integral of the cache size over time. We show that any efficient solution to green paging can be converted into an efficient solution to parallel paging, and that any lower bound for green paging can be converted into a lower bound for parallel paging, in both cases in a black-box fashion. We then show that, with O(1) resource augmentation, the optimal competitive ratio for deterministic online green paging is Θ(log p), which, in turn, implies the same bounds for deterministic online parallel paging.
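A minimal encoding of the green-paging objective, assuming a schedule given as (cache size, duration) segments with sizes between k/p and k (the schedule values below are hypothetical):

```python
# Energy of a green-paging cache-size schedule: the integral of cache
# size over time, here a sum of size * duration over constant segments.
def green_energy(schedule):
    return sum(size * duration for size, duration in schedule)

k, p = 64, 8
sched = [(k, 10), (k // p, 30), (k, 5)]   # full, powered-down, full
print(green_energy(sched))                # 64*10 + 8*30 + 64*5 = 1200
```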
All server storage environments depend on disk arrays to satisfy their capacity, reliability, and availability require- ments. In order to manage these storage systems efficiently, it is necessary to understand the behavior of disk arrays and predict their performance. We develop an analytical model that estimates mean performance measures of disk arrays under a synchronous I/O workload. Synchronous I/O requests are generated by jobs that each block while their request is serviced. Upon I/O service completion, a job may use other computer resources before issuing another I/O request. Our disk array model considers the effect of workload sequentiality, read-ahead caching, write-back caching, and other complex optimizations incorporated into most disk arrays. The model is validated against a mid-range disk-array for a variety of synthetic I/O workloads. The model is computationally simple and scales easily as the number of jobs issuing requests increases, making it potentially useful ...
Parallel Computing, 1997
We consider the problem of sorting a file of N records on the D-disk model of parallel I/O in which there are two sources of parallelism. Records are transferred to and from disk concurrently in blocks of B contiguous records. In each I/O operation, up to one block can be transferred to or from each of the D-disks in parallel. We propose a simple, efficient, randomized mergesort algorithm called SRM that uses a forecast-and-flush approach to overcome the inherent difficulties of simple merging on parallel disks. SRM exhibits a limited use of randomization and also has a useful deterministic version. Generalizing the technique of forecasting, our algorithm is able to read in, at any time, the 'right' block from any disk and using the technique of flushing, our algorithm evicts, without any I/O overhead, just the 'right' blocks from memory to make space for new ones to be read in. The disk layout of SRM is such that it enjoys perfect write parallelism, avoiding fundamental inefficiencies of previous mergesort algorithms. By analysis of generalized maximum occupancy problems we are able to derive an analytical upper bound on SRM's expected overhead valid for arbitrary inputs. The upper bound derived on expected I/O performance of SRM indicates that SRM is provably better than disk-striped mergesort (DSM) for realistic parameter values D, M and B. Average-case simulations show further improvement on the analytical upper bound. Unlike previously proposed optimal sorting algorithms, SRM outperforms DSM even when the number D of parallel disks is small.
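The forecasting idea can be sketched compactly (a simplification of ours, not SRM itself): each run's next unread block is tagged with its leading key, and the block with the smallest such key is always fetched next, since the merge will need it soonest:

```python
import heapq

# Forecasting sketch for multiway merging: runs is a list of runs, each
# run a list of blocks, each block a sorted list of keys. Returns the
# order (run index, block index) in which blocks should be fetched.
def forecast_order(runs):
    heap = [(run[0][0], r, 0) for r, run in enumerate(runs) if run]
    heapq.heapify(heap)
    order = []
    while heap:
        _, r, i = heapq.heappop(heap)
        order.append((r, i))        # fetch block i of run r next
        if i + 1 < len(runs[r]):    # forecast key of the run's next block
            heapq.heappush(heap, (runs[r][i + 1][0], r, i + 1))
    return order

runs = [[[1, 5], [9, 12]], [[2, 3], [4, 8]]]
print(forecast_order(runs))   # [(0, 0), (1, 0), (1, 1), (0, 1)]
```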
ACM Transactions on Computer Systems, 1987
A continuum of disk scheduling algorithms, V(R), having endpoints V(0) = SSTF and V(1) = SCAN, is defined. V(R) maintains a current SCAN direction (in or out) and services next the request with the smallest effective distance. The effective distance of a request that lies in the current direction is its physical distance (in cylinders) from the read/write head. The effective distance of a request in the opposite direction is its physical distance plus R × (the total number of cylinders on the disk). By use of simulation methods, it is shown that this definitional continuum also provides a continuum in performance, both with respect to the mean and with respect to the standard deviation of request waiting time. For objective functions that are linear combinations of the two measures, μ_W + k·σ_W, intermediate points of the continuum are seen to provide performance uniformly superior to both SSTF and SCAN. A method of implementing V(R) and the results of its experimental use in a real system are presented.
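The definition of effective distance translates directly into code; a minimal sketch (cylinder counts and pending requests are hypothetical) showing that R = 0 reproduces SSTF and R = 1 reproduces SCAN:

```python
# V(R) effective distance: a request ahead of the head (in the current
# direction) costs its seek distance; one behind it costs that distance
# plus R times the total cylinder count.
def effective_distance(head, direction, request, R, num_cylinders):
    dist = abs(request - head)
    ahead = (request - head) * direction >= 0
    return dist if ahead else dist + R * num_cylinders

def next_request(head, direction, pending, R, num_cylinders):
    return min(pending, key=lambda c:
               effective_distance(head, direction, c, R, num_cylinders))

print(next_request(50, +1, [40, 70], R=0.0, num_cylinders=200))  # 40: SSTF
print(next_request(50, +1, [40, 70], R=1.0, num_cylinders=200))  # 70: SCAN
```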
Lecture Notes in Computer Science, 1998
We address the problem of I/O scheduling of read-once reference strings in a multiple-disk parallel I/O system. We present a novel online algorithm, Red-Black Prefetching (RBP), for parallel I/O scheduling. In order to perform accurate prefetching, RBP uses L-block lookahead. The performance of RBP is analyzed in the standard parallel disk model with D independent disks and a shared I/O buffer of size M. We show that the number of parallel I/Os performed by RBP is within a factor max{√(MD/L), D^{1/3}} of the number of I/Os done by the optimal offline algorithm. This ratio is within a constant factor of the best possible when L = O(M D^{1/3}).
2000
High performance applications involving large data sets require the efficient and flexible use of multiple disks. In an external memory machine with D parallel, independent disks, only one block can be accessed on each disk in one I/O step. This restriction leads to a load balancing problem that is perhaps the main inhibitor for the efficient adaptation of single-disk external memory algorithms to multiple disks. We solve this problem for arbitrary access patterns by randomly mapping blocks of a logical address space to the disks.
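A minimal sketch of such a randomized mapping (the keyed hash is our choice of implementation, not the paper's): each logical block address is hashed to a disk, so any fixed access pattern spreads roughly evenly across the D disks:

```python
import hashlib

# Map a logical block address to a pseudorandom disk via a keyed hash;
# the same block always lands on the same disk.
def disk_of(block_id, D, seed=b"layout-v1"):
    h = hashlib.blake2b(seed + block_id.to_bytes(8, "little"),
                        digest_size=8)
    return int.from_bytes(h.digest(), "little") % D

D = 8
counts = [0] * D
for b in range(10_000):
    counts[disk_of(b, D)] += 1
print(counts)   # roughly 1250 blocks per disk
```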
Information Processing Letters, 2019
The distributed multi-level multi-server paging problem (DMLMSP) defined in this paper extends and generalizes the classical distributed paging problem [3] to a distributed, concurrent, multi-level setting in which multiple servers share caches at multiple levels. The DMLMSP can be used for modeling algorithms for efficient distributed storage systems, in which multiple servers use caches to accelerate access to a centralized storage, maintaining cache coherency across multiple nodes while minimizing access latency and optimizing the cache-hit ratio. The DMLMSP model fits basic principles of Non-Uniform Cache Architectures (NUCAs) and can be used for analyzing multi-level centralized storage. We present an optimal offline algorithm for the DMLMSP model, with the minimum number of page faults and minimum makespan, whose time complexity is polynomial in the length of the servers' page request sequences. The new algorithm generalizes and simplifies the state-of-the-art algorithms [3, 1].
Parallel Computing, 2002
For the design and analysis of algorithms that process huge data sets, a machine model is needed that handles parallel disks. There seems to be a dilemma between simple and flexible use of such a model and accurate modeling of details of the hardware. This paper explains how many aspects of this problem can be resolved. The programming model implements one large logical disk allowing concurrent access to arbitrary sets of variable-size blocks. This model can be implemented efficiently on multiple independent disks even if zones with different speeds, communication bottlenecks, and failed disks are allowed. These results not only provide useful algorithmic tools but also imply a theoretical justification for studying external memory algorithms using simple abstract models. The algorithmic approach is random redundant placement of data and optimal scheduling of accesses. The analysis generalizes a previous analysis for simple abstract external memory models in several ways (higher efficiency, variable block sizes, more detailed disk model).
1996
The I/O performance of applications in multiple-disk systems can be improved by overlapping disk accesses. This requires the use of appropriate prefetching and buffer management algorithms that ensure the most useful blocks are accessed and retained in the buffer. In this paper, we answer several fundamental questions on prefetching and buffer management for distributed-buffer parallel I/O systems. First, we derive and prove the optimality of an algorithm, P-min, that minimizes the number of parallel I/Os. Second, we analyze P-con, an algorithm that always matches its replacement decisions with those of the well-known demand-paged MIN algorithm. We show that P-con can become fully sequential in the worst case. Third, we investigate the behavior of on-line algorithms for multiple-disk prefetching and buffer management. We define and analyze P-lru, a parallel version of the traditional LRU buffer management algorithm. Unexpectedly, we find that the competitive ratio of P-lru …
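For reference, the demand-paged MIN algorithm that P-con's replacement decisions are matched against is the classical furthest-in-future rule; a compact sketch (this is single-disk MIN, not the parallel P-con scheduler itself):

```python
# Belady's MIN (furthest-in-future) replacement: on a miss with a full
# cache, evict the resident block whose next use lies furthest ahead.
def min_faults(seq, k):
    cache, faults = set(), 0
    for i, b in enumerate(seq):
        if b in cache:
            continue
        faults += 1
        if len(cache) >= k:
            def next_use(x):
                try:
                    return seq.index(x, i + 1)
                except ValueError:
                    return float("inf")   # never used again: best victim
            cache.remove(max(cache, key=next_use))
        cache.add(b)
    return faults

print(min_faults([1, 2, 3, 1, 2, 4, 1, 2], k=3))   # 4 faults (evicts 3)
```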
ACM Transactions on Algorithms, 2009
We study an optimization problem that arises in the context of data placement in multimedia storage systems. We are given a collection of M multimedia data objects that need to be assigned to a storage system consisting of N disks d_1, d_2, ..., d_N. We are also given sets U_1, U_2, ..., U_M such that U_i is the set of clients requesting the i-th data object. Each disk d_j is characterized by two parameters, namely, its storage capacity C_j, which indicates the maximum number of data objects that may be assigned to it, and a load capacity L_j, which indicates the maximum number of clients that it can serve. The goal is to find a placement of data objects on disks and an assignment of clients to disks so as to maximize the total number of clients served, subject to the capacity constraints of the storage system. We study this data placement problem for two natural classes of storage systems, namely, homogeneous and uniform ratio. Our first main result is a tight upper and lower bound on the number of items that can always be packed for any input instance to homogeneous as well as uniform-ratio storage systems. We show that an algorithm given in [11] for data placement achieves this bound. Our second main result is a polynomial-time approximation scheme for the data placement problem in homogeneous and uniform-ratio storage systems, answering an open question of [11]. Finally, we also study the problem from an empirical perspective.
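For intuition, a hypothetical greedy heuristic for such an instance (not the algorithm of [11], and with no optimality claim): place the most-requested objects first, each on the disk that can still serve the most of its remaining clients:

```python
# objects: list of client counts |U_i|; disks: list of [C_j, L_j] pairs
# (remaining storage capacity and remaining load capacity). Each object
# is placed on a single disk here, for simplicity.
def greedy_place(objects, disks):
    served = 0
    for clients in sorted(objects, reverse=True):
        best = max(range(len(disks)),
                   key=lambda j: min(clients, disks[j][1])
                   if disks[j][0] > 0 else -1)
        C, Lcap = disks[best]
        if C > 0 and Lcap > 0:
            take = min(clients, Lcap)   # clients this disk can absorb
            disks[best] = [C - 1, Lcap - take]
            served += take
    return served

print(greedy_place([30, 20, 10], [[2, 25], [1, 40]]))  # 55 of 60 served
```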
IEEE Transactions on Computers, 2003
Random redundant data storage strategies have proven to be a good choice for efficient data storage in multimedia servers. These strategies lead to a retrieval problem in which it is decided for each requested data block which disk to use for its retrieval. In this paper, we give a complexity classification of retrieval problems for random redundant storage. Index Terms: random redundant storage, load balancing, video servers, complexity analysis. A multimedia server [13] offers continuous streams of multimedia data to multiple users. In a multimedia server, one can generally distinguish three parts: an array of hard disks to store the data, an internal network, and fast memory used for buffering. The latter is usually implemented in random access memory (RAM). The multimedia data is stored on the hard disks in blocks such that a data stream is realized by periodically reading a block from disk and storing it in the buffer, from which the stream can be consumed in a continuous way. A block generally contains a couple of hundred milliseconds of video data. The total buffer space is split up into a number of buffers, one for each user. A user consumes, possibly at a variable bit rate, from his/her buffer, and the buffer is repeatedly refilled with blocks from the hard disks. A buffer generates a request for a new block as soon as the amount of data in the buffer becomes smaller than a certain threshold. We assume that requests are handled periodically in batches, in a way that the requests that arrive in one period are serviced in the next one [16]. In the server, we need a cost-efficient storage and retrieval strategy that guarantees, either deterministically or probabilistically, that the buffers do not underflow or overflow. Load balancing is very important within a multimedia server, as efficient usage of the available bandwidth of the disk array increases the maximum number of users that can be serviced simultaneously, which results in lower cost per user. Random redundant storage strategies have proven to enable good load balancing performance [1], [3], [15], [23]. In these storage strategies, each data block is stored more than once, on different, randomly chosen disks. This data redundancy gives the freedom to obtain a balanced load with high probability. To exploit this freedom, an algorithm is needed to solve, in each period, a retrieval problem, i.e., we have to decide, for each data block, from which disk(s) to retrieve it in such a way that the load is balanced.
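A minimal sketch of one natural retrieval heuristic under twofold random replication (our illustration, not one of the paper's classified algorithms): send each request to the currently least-loaded of its two replica disks:

```python
import random

# Each block has two replica disks, fixed by seeding a PRNG with the
# block id; each request goes to the less-loaded replica. The makespan
# of the batch is max(load).
def retrieve(requests, D):
    load = [0] * D
    for block in requests:
        rng = random.Random(block)        # replica choice fixed per block
        a, b = rng.sample(range(D), 2)    # two distinct replica disks
        pick = a if load[a] <= load[b] else b
        load[pick] += 1
    return load

print(retrieve(range(1000), D=10))        # loads close to 100 per disk
```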
IEEE Transactions on Computers, 1991
Disk interleaving, or disk striping, distributes a data block across a group of disks and allows parallel transfer of data. Disk interleaving is achieved by dividing a data block into a number of subblocks and placing each subblock on a separate disk. A subblock can be stored on an interleaved disk at a predetermined location (relative to the adjacent subblocks), or it can be stored at any location on the disk. We consider a system where adjacent subblocks are placed independently of each other; we call this an asynchronous disk interleaving system, and analyze its performance implications. Since each of the disks in such a system is treated independently while being accessed as a group, the access delay of a request for a data block in an n-disk system is the maximum of n access delays. Using approximate analysis, we obtain a simple expression for the expected value of such a maximum delay. The analytic approximation is verified by simulation using trace data; the relative error is found to be at most 6%.
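The max-of-n effect can be checked with a quick Monte Carlo estimate; a sketch assuming hypothetical exponential per-disk delays with mean 1 (for which E[max of n] equals the harmonic number H_n):

```python
import random

# Estimate the expected access delay of a block split over n disks,
# i.e. the mean of the maximum of n independent per-subblock delays.
def mean_max_delay(n, trials=100_000, seed=1):
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += max(rng.expovariate(1.0) for _ in range(n))
    return total / trials

for n in (1, 4, 16):
    print(n, round(mean_max_delay(n), 2))   # ~1.00, ~2.08, ~3.38 (H_n)
```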
Computing, 1978
A Performance Model for Preplanned Disk Sorting. The idea of preplanning strings on disks which are merged together is investigated from a performance point of view. Schemes of internal buffer allocation, initial string creation by an internal sort, and string distribution on disks are evaluated. An algorithm is given for the construction of suboptimal merge trees called plannable merge trees. A cost model is presented for accurate preplanning which consists of detailed assumptions on disk allocation for k input disks and r-way merge planning. Timing considerations for sort and merge, including hardware characteristics of movable-head disks, show a significant gain in time compared to widely used sort/merge applications.