1998, Lecture Notes in Computer Science
The evaluation of parallel job schedulers hinges on two things: the use of appropriate metrics, and the use of appropriate workloads on which the scheduler can operate. We argue that the focus should be on on-line open systems, and propose that a standard workload should be used as a benchmark for schedulers. This benchmark will specify distributions of parallelism and runtime, as found by analyzing accounting traces, and also internal structures that create different speedup and synchronization characteristics. As for metrics, we present some problems with slowdown and bounded slowdown that have been proposed recently.
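As a concrete illustration of what a generator for such a standard workload might look like, here is a minimal sketch in Python. The specific distributions (power-of-two parallelism, log-uniform runtimes) are assumptions in the spirit of published trace analyses, not the benchmark's actual specification, and the sketch omits the internal job structures (speedup and synchronization characteristics) the abstract mentions.

```python
import random

def sample_job(max_procs=128):
    """Sample one synthetic job as a (parallelism, runtime) pair.

    Illustrative distributions only: power-of-two parallelism and
    log-uniform runtimes are patterns commonly reported in
    accounting-trace analyses, not this benchmark's specification.
    """
    exp = random.randint(0, max_procs.bit_length() - 1)
    parallelism = 2 ** exp                # power-of-two sizes dominate traces
    runtime = 10 ** random.uniform(0, 4)  # seconds, log-uniform from 1 s to ~3 h
    return parallelism, runtime

workload = [sample_job() for _ in range(1000)]
```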
Lecture Notes in Computer Science, 1997
The scheduling of jobs on parallel supercomputers is becoming the subject of much research. However, there is concern about the divergence of theory and practice. We review theoretical research in this area, and recommendations based on recent results. This is contrasted with a proposal for standard interfaces among the components of a scheduling system, which has grown from requirements in the field.
Proceedings of the ACM on Measurement and Analysis of Computing Systems
To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip rather than single core performance. In turn, modern jobs are often designed to run on any number of cores. However, to effectively leverage these multi-core chips, one must address the question of how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is an obvious tradeoff: allocating more cores to an individual job reduces the job's runtime, but in turn decreases the efficiency of the overall system. We ask how the system should schedule jobs across cores so as to minimize the mean response time over a stream of incoming jobs. To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. EQUI requires jobs t...
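A minimal sketch of the EQUI policy described above, in Python. The time-stepped formulation and the sublinear speedup curve s(k) = k**p are assumptions made for illustration; the paper's analysis is over exponentially distributed job sizes, which this toy simulation does not reproduce.

```python
def equi_step(remaining, total_cores, dt=0.01, p=0.5):
    """Advance all unfinished jobs by one time step under EQUI.

    `remaining` maps job id -> remaining work; each active job gets an
    equal share of the cores and proceeds at speedup s(share) = share**p,
    an illustrative sublinear curve (not the paper's exact model).
    """
    active = [j for j, r in remaining.items() if r > 0]
    if not active:
        return
    share = total_cores / len(active)      # cores divided evenly across jobs
    for j in active:
        remaining[j] -= (share ** p) * dt  # work completed this step
```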
Lecture Notes in Computer Science, 2008
The workshop on job scheduling strategies for parallel processing (JSSPP) studies the myriad aspects of managing resources on parallel and distributed computers. These studies typically focus on large-scale computing environments, where allocation and management of computing resources present numerous challenges. Traditionally, such systems consisted of massively parallel supercomputers, or more recently, large clusters of commodity processor nodes. These systems are characterized by architectures that are largely homogeneous and workloads that are dominated by both computation- and communication-intensive applications. Indeed, the large majority of the articles in the first ten JSSPP workshops dealt with such systems and addressed issues such as queuing systems and supercomputer workloads.
We consider a parallel job scheduling model that incorporates both computation time and communication overhead. For any job J_j with length p_j, if k_j processors are assigned to execute the job, then the actual execution time of the job is t_j = p_j/k_j + (k_j - 1)c, where c is a constant overhead cost associated with each processor except the master processor that initiates the parallel computation. Previously, it was shown that the Shortest Execution Time (SET) algorithm has competitive ratio 4(m - 1)/m for even m ≥ 2 and 4m/(m + 1) for odd m ≥ 3 with respect to makespan. Here we study the Earliest Completion Time (ECT) algorithm, and show that its competitive ratio is 2 and 2.25 on 2 and 3 processors, respectively. We also offer simulation results that show that ECT compares favorably to SET on larger numbers of processors. Finally, we show that any online algorithm for our problem has competitive ratio at least 3/2 for arbitrarily large m.
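The execution-time formula above translates directly into code. The ECT rule below is a sketch: it assigns the job the k earliest-available processors for whichever k minimizes its completion time, ignoring the tie-breaking details of the actual algorithm.

```python
def exec_time(p, k, c):
    """Execution time from the model: t = p/k + (k - 1)*c, where c is the
    per-processor overhead (the master processor incurs none)."""
    return p / k + (k - 1) * c

def ect_assign(p, free_at, c):
    """Earliest Completion Time sketch: given each processor's free time,
    pick the processor count k minimizing the job's completion time."""
    procs = sorted(free_at)                  # earliest-available first
    best = None
    for k in range(1, len(procs) + 1):
        start = procs[k - 1]                 # must wait for the k-th processor
        finish = start + exec_time(p, k, c)
        if best is None or finish < best[0]:
            best = (finish, k)
    return best                              # (completion time, chosen k)
```

For example, with c = 1 and p = 12 on processors that are all free at time 0, the best widths are k = 3 and k = 4 (12/3 + 2 = 6 and 12/4 + 3 = 6), while larger k only adds overhead.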
Journal of Parallel and Distributed Computing
In the rapidly expanding field of parallel processing, job schedulers are the "operating systems" of modern big data architectures and supercomputing systems. Job schedulers allocate computing resources and control the execution of processes on those resources. Historically, job schedulers were the domain of supercomputers and were designed to run massive, long-running computations over days and weeks. More recently, big data workloads have created a need for a new class of computations consisting of many short computations taking seconds or minutes that process enormous quantities of data. For both supercomputers and big data systems, the efficiency of the job scheduler represents a fundamental limit on the efficiency of the system. Detailed measurement and modeling of the performance of schedulers are critical for maximizing the performance of a large-scale computing system. This paper presents a detailed feature analysis of 15 supercomputing and big data schedulers. For big data workloads, the scheduler latency is the most important performance characteristic of the scheduler. A theoretical model of the latency of these schedulers is developed and used to design experiments targeted at measuring scheduler latency. Detailed benchmarking of four of the most popular schedulers (Slurm, Son of Grid Engine, Mesos, and Hadoop YARN) is conducted. The theoretical model is compared with data and demonstrates that scheduler performance can be characterized by two key parameters: the marginal latency of the scheduler, t_s, and a nonlinear exponent, α_s. For all four schedulers, the utilization of the computing system decreases to <10% for computations lasting only a few seconds. Multi-level schedulers (such as LLMapReduce) that transparently aggregate short computations can improve utilization for these short computations to >90% for all four of the schedulers that were tested.
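The two-parameter latency model suggests a simple back-of-the-envelope utilization estimate. The functional form below (scheduler overhead growing as t_s * n**α_s for n jobs) is a plausible reading of the abstract, not a formula quoted from the paper.

```python
def utilization(job_seconds, n_jobs, t_s, alpha_s):
    """Estimated utilization when launching n_jobs jobs of job_seconds each.

    Assumed model: useful work divided by work plus scheduler overhead,
    with overhead t_s * n_jobs**alpha_s (an interpretation of the paper's
    two parameters, not its exact formula).
    """
    work = job_seconds * n_jobs
    overhead = t_s * n_jobs ** alpha_s
    return work / (work + overhead)
```

With few-second jobs and superlinear overhead (α_s > 1), this ratio collapses as n_jobs grows, consistent with the sub-10% utilization reported above.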
2013
Abstract. In this work we analyze the performance of scheduling algorithms with respect to fairness. Existing works frequently consider fairness as a job-related issue. In our work we analyze fairness with respect to the different users of the system, as this is a very important real-life problem. First, we discuss how fair selected popular scheduling algorithms are with respect to different users of the system. Next, we present an extension to the well-known Conservative backfilling algorithm. Instead of “ad hoc” decisions, the schedule is now created subject to evaluation and optimization. Notably, fairness is considered as an important metric, which accompanies standard performance-related metrics such as slowdown or wait time. To achieve that, the inclusion of fairness as an optimization criterion is proposed. The new extension improves the performance and fairness of Conservative backfilling with respect to other classical techniques such as FCFS, EASY backfilling or aggressive b...
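One way to read "fairness as an optimization criterion" is a weighted objective over candidate schedules; the linear combination and weights below are purely illustrative, since the abstract does not give the exact form.

```python
def schedule_objective(slowdowns, wait_times, fairness_penalty,
                       w_sd=1.0, w_wt=1.0, w_fair=1.0):
    """Score a candidate schedule: lower is better.

    Combines the standard metrics named in the abstract (slowdown, wait
    time) with a fairness penalty; the weighted-sum form is an assumption.
    """
    mean = lambda xs: sum(xs) / len(xs)
    return w_sd * mean(slowdowns) + w_wt * mean(wait_times) + w_fair * fairness_penalty
```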
Today, distributed server systems are widely used in many areas because they enhance computing power while being cost-effective and more efficient. Meanwhile, novel scheduling strategies are employed to optimize the task-assignment process. This project closely explored the performance of such strategies through computer simulation. The research involved simulating a novel scheduling policy, Task Assignment by Guessing Size (TAGS), and two earlier task-assignment policies, Random and JSQ. The performance of TAGS is assessed by comparing it with the two preceding policies, using computer simulation to perform the statistical measurements. The findings were, indeed, very interesting, showing that TAGS only attains optimal performance in a heavy-tailed distributed computing environment. The paper concludes by summarizing t...
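For readers unfamiliar with TAGS, the sketch below captures its core mechanism: hosts are ordered by increasing runtime cutoff, and a job that exceeds a host's cutoff is killed and restarted from scratch on the next host. Queueing delays are omitted, and the concrete cutoff values are a deployment choice.

```python
def tags_service(job_size, cutoffs):
    """Total processing time a job consumes under TAGS, restarts included.

    `cutoffs` are per-host runtime limits in increasing order, the last
    being float("inf"); the values themselves are illustrative.
    """
    total = 0.0
    for cutoff in cutoffs:           # e.g. [1.0, 10.0, float("inf")]
        if job_size <= cutoff:
            return total + job_size  # job finishes on this host
        total += cutoff              # work wasted before the restart
    raise ValueError("the last cutoff must be infinite")
```

The wasted restart work is why TAGS pays off mainly under heavy-tailed job sizes: most jobs are short and finish on the first host, while the rare huge jobs are isolated from them.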
2015
Data-intensive batch jobs increasingly compete for resources with customer-facing online workloads in modern data centers. Today, the two classes of workloads run on separate infrastructures using different resource managers that pursue different objectives. Batch processing systems strive for coarse-grained throughput whereas online systems must keep the latency of fine-grained end-user requests low. Better resource management would allow both batch and online workloads to share infrastructure, reducing hardware and eliminating the inefficient and error-prone chore of creating and maintaining copies of data. This paper describes Facebook's Bistro, a scheduler that runs data-intensive batch jobs on live, customer-facing production systems without degrading the end-user experience. Bistro employs a novel hierarchical model of data and computational resources. The model enables Bistro to schedule workloads efficiently and adapt rapidly to changing configurations. At Facebook, Bist...
Proceedings. International Conference on Parallel Processing Workshop, 2002
Although there is wide agreement that backfilling produces significant benefits in the scheduling of parallel jobs, there is no clear consensus on which backfilling strategy is preferable: e.g., should conservative backfilling be used, or the more aggressive EASY backfilling scheme? Should a First-Come First-Served (FCFS) queue-priority policy be used, or some other, such as Shortest Job First (SF) or eXpansion Factor (XF)? In this paper, we use trace-based simulation to address these questions and glean new insights into the characteristics of backfilling strategies for job scheduling. We show that by viewing performance in terms of the slowdowns and turnaround times of jobs within various categories based on their width (processor request size), length (job duration), and accuracy of the user's estimate of run time, some consistent trends may be observed.
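For reference, the admission test that makes EASY backfilling "aggressive" can be stated in a few lines. This is the textbook form of the rule; the attribute names are illustrative.

```python
def can_backfill(job, now, free_procs, shadow_time, extra_procs):
    """EASY backfilling test: a waiting job may jump the queue if it
    cannot delay the reservation held by the first queued job.

    `shadow_time` is when the head job's reservation starts; `extra_procs`
    are processors the head job will not need even then.
    """
    if job.procs > free_procs:
        return False                 # not enough idle processors right now
    fits_in_time = now + job.estimated_runtime <= shadow_time
    fits_on_side = job.procs <= extra_procs
    return fits_in_time or fits_on_side
```

Conservative backfilling applies the analogous test against the reservations of every queued job, not just the first, which is the aggressiveness gap the paper's simulations explore.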
Lecture Notes in Computer Science, 2001
We develop a new metric for job scheduling that includes the effects of memory contention amongst simultaneously-executing jobs that share a given level of memory. Rather than assuming each job or process has a fixed, static memory requirement, we consider a general scenario wherein a process' performance monotonically increases as a function of allocated memory, as defined by a miss-rate versus memory size curve. Given a schedule of jobs in a shared-memory multiprocessor (SMP), and an isolated miss-rate versus memory size curve for each job, we use an analytical memory model to estimate the overall memory miss-rate for the schedule. This, in turn, can be used to estimate overall performance. We develop a heuristic algorithm to find a good schedule of jobs on a SMP that minimizes memory contention, thereby improving memory and overall performance.
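A greedy stand-in for the paper's analytical model makes the idea concrete: hand out memory in small increments to whichever job's miss-rate curve benefits most, then read off the combined miss rate. The increment-based search is a simplification for illustration, not the paper's heuristic.

```python
def schedule_miss_rate(curves, total_mem, step=1):
    """Estimate the combined miss rate of co-scheduled jobs.

    `curves` maps job id -> m(x), the job's isolated miss rate with x
    units of memory (non-increasing in x).  Memory is allocated greedily
    by marginal benefit, a simplification of the paper's model.
    """
    alloc = {j: 0 for j in curves}
    for _ in range(0, total_mem, step):
        # give the next chunk to the job whose miss rate drops the most
        j = max(curves,
                key=lambda i: curves[i](alloc[i]) - curves[i](alloc[i] + step))
        alloc[j] += step
    return sum(curves[j](alloc[j]) for j in curves), alloc
```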
1996
The problem considered in this thesis is how to run a workload of multiple parallel jobs on a single parallel machine. Jobs are assumed to be data-parallel with large degrees of parallelism, and the machine is assumed to have an MIMD architecture. We identify a spectrum of scheduling policies between the two extremes of time-slicing, in which jobs take turns to use the whole machine, and space-slicing, in which jobs get disjoint subsets of processors for their own dedicated use. Each of these scheduling policies is evaluated using a metric suited for interactive execution: the minimum machine power being devoted to any job, averaged over time. The following result is demonstrated. If there is no advance knowledge of job characteristics (such as running time, I/O frequency and communication locality) the best scheduling policy is gang-scheduling with instruction-balance. This conclusion validates some of the current practices in commercial systems. The proof uses the notions of clair...
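The interactivity metric named in this abstract has a direct encoding, sketched below; representing the allocation as a per-time-step mapping from jobs to machine fractions is an assumed data structure, not the thesis's.

```python
def min_power_metric(allocations):
    """The minimum machine power devoted to any job, averaged over time.

    `allocations` is a sequence of dicts, one per time step, mapping each
    job to the fraction of the machine it holds (an assumed encoding).
    """
    return sum(min(alloc.values()) for alloc in allocations) / len(allocations)
```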
2006
Multiprocessor scheduling in a shared multiprogramming environment is often structured as two-level scheduling, where a kernel-level job scheduler allots processors to jobs and a user-level task scheduler schedules the work of a job on the allotted processors. In this context, the number of processors allotted to a particular job may vary during the job's execution, and the task scheduler must adapt to these changes in processor resources. For overall system efficiency, the task scheduler should also provide parallelism feedback to the job scheduler to avoid the situation where a job is allotted processors that it cannot use productively.
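Parallelism feedback of this kind is often implemented as a multiplicative desire adjustment, as in adaptive task schedulers such as A-GREEDY; the sketch below follows that general pattern with illustrative thresholds, and is not claimed to be this paper's specific algorithm.

```python
def parallelism_feedback(used, allotted, desire, delta=2.0, threshold=0.9):
    """Report the next quantum's processor desire to the job scheduler.

    If the job kept its allotment busy, ask for more; if it wasted
    processors, ask for fewer.  The 0.9 efficiency threshold and the
    multiplicative factor delta are illustrative choices.
    """
    efficient = used >= threshold * allotted
    return desire * delta if efficient else max(1.0, desire / delta)
```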
Fairness is an important aspect of queuing systems. Several fairness measures have been proposed for queuing systems in general and parallel job scheduling in particular. Generally, a scheduler is considered unfair if some jobs are discriminated against while others are favored. Some of the metrics used to measure fairness for parallel job schedulers can imply unfairness where there is no discrimination (and vice versa). This makes them inappropriate. In this paper, we show how the existing metrics misrepresent fairness in practice. We then propose a new approach for measuring fairness for parallel job schedulers. Our approach is based on two principles: (i) since jobs have different resource requirements and find different queue/system states, they need not have the same performance for the scheduler to be fair, and (ii) to compare two schedulers for fairness, we compare how the schedulers favor/discriminate individual jobs. We use performance and discrimination trends to validate our approach. We observe that our approach can deduce discrimination more accurately. This is true even in cases where the most discriminated jobs are not the worst performing jobs.
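Principle (ii) suggests a job-by-job comparison rather than an aggregate score. The sketch below is one illustrative reading of it, counting which scheduler favors each individual job; the paper's actual trend-based analysis is richer than this.

```python
def compare_fairness(perf_a, perf_b):
    """Count jobs favored by scheduler A versus scheduler B.

    perf_a and perf_b map job id -> a per-job performance value where
    lower is better (e.g. wait time); the counting rule is illustrative.
    """
    favored_by_a = sum(1 for j in perf_a if perf_a[j] < perf_b[j])
    favored_by_b = sum(1 for j in perf_a if perf_b[j] < perf_a[j])
    return favored_by_a, favored_by_b
```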
Parallel Computing - Fundamentals and Applications - Proceedings of the International Conference ParCo99, 2000
This paper describes the collection and analysis of usage data for a large (hundreds of nodes) distributed memory machine over a period of 31 months during which 178,000 batch jobs were submitted. A number of data items were collected for each job, including the queue wait times, elapsed (wall clock) execution times, the number of nodes used, as well as the actual job CPU, system and wait times in node hours. This data set represents perhaps the most comprehensive such information on the use of a 100 Gflop parallel machine by a large (over 1,200 users in any given month) and diverse set of users. The results of this analysis provide some insights on how much machines are used and on workload profiles in terms of the number of nodes used, average queue wait times, elapsed and CPU times, as well as their distributions. A longitudinal analysis shows how these have changed over time and how scheduling policies affect user behavior. Some of these results confirm earlier studies, while others reveal new information. That knowledge has been used to develop a new scheduler for the system, which has increased system node utilization from around 60% to the 90-95% range when there are sufficient jobs waiting in the queue.
ACM SIGMETRICS Performance Evaluation Review
To keep pace with Moore's law, chip designers have focused on increasing the number of cores per chip. To effectively leverage these multi-core chips, one must decide how many cores to assign to each job. Given that jobs receive sublinear speedups from additional cores, there is a tradeoff: allocating more cores to an individual job reduces the job's runtime, but decreases the efficiency of the overall system. We ask how the system should assign cores to jobs so as to minimize the mean response time over a stream of incoming jobs. To answer this question, we develop an analytical model of jobs running on a multi-core machine. We prove that EQUI, a policy which continuously divides cores evenly across jobs, is optimal when all jobs follow a single speedup curve and have exponentially distributed sizes. We also consider a class of "fixed-width" policies, which choose a single level of parallelization, k, to use for all jobs. We prove that, surprisingly, fixed-width policies which use the optimal fixed level of parallelization, k*, become near-optimal as the number of cores becomes large. In the case where jobs may follow different speedup curves, finding a good scheduling policy is even more challenging. In particular, EQUI is no longer optimal, but a very simple policy, GREEDY*, performs well empirically. For a full journal version of this paper see [2].
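To see why a single well-chosen width can do well, one can scan fixed widths under a crude queueing approximation. The M/M/1-per-partition model and the speedup curve s(k) = k**p below are illustrative assumptions, far simpler than the paper's actual analysis.

```python
def best_fixed_width(n_cores, arrival_rate, mean_size, p=0.5):
    """Scan fixed parallelization levels k for the lowest mean response time.

    Approximation: n_cores // k independent M/M/1 servers, arrivals split
    evenly, service rate k**p / mean_size at width k.  Returns
    (mean response time, k*) or None if no width is stable.
    """
    best = None
    for k in range(1, n_cores + 1):
        servers = n_cores // k
        lam = arrival_rate / servers       # per-partition arrival rate
        mu = (k ** p) / mean_size          # per-partition service rate
        if lam >= mu:
            continue                       # unstable at this width
        resp = 1.0 / (mu - lam)            # M/M/1 mean response time
        if best is None or resp < best[0]:
            best = (resp, k)
    return best
```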
Lecture Notes in Computer Science, 2000
Buffered coscheduling is a new methodology that can substantially increase resource utilization, improve response time, and simplify the development of the run-time support in a parallel machine. In this paper, we provide an in-depth analysis of three important aspects of the proposed methodology: the impact of the communication pattern and type of synchronization, the impact of memory constraints, and the processor utilization. The experimental results show that if jobs use non-blocking or collective-communication patterns, the response time becomes largely insensitive to the job communication pattern. Using a simple job access policy, we also demonstrate the robustness of buffered coscheduling in the presence of memory constraints. Overall, buffered coscheduling generally outperforms backfilling and backfilling gang scheduling with respect to response time, wait time, run-time slowdown, and processor utilization.
2007
Scheduling competing jobs on multiprocessors has always been an important issue for parallel and distributed systems. The challenge is to ensure global, system-wide efficiency while offering a level of fairness to user jobs. Various degrees of success have been achieved over the years. However, few existing schemes address both efficiency and fairness over a wide range of workloads. Moreover, in order to obtain analytical results, most of them require prior information about jobs, which may be difficult to obtain in real ...
1998
This paper analyzes job scheduling for parallel computers by using theoretical and experimental means. Based on existing architectures, we first present a machine and a job model. Then, we propose a simple on-line algorithm employing job preemption without migration and derive theoretical bounds for the performance of the algorithm. The algorithm is experimentally evaluated with trace data from a large computing facility. These experiments show that the algorithm is highly sensitive to parameter selection and that substantial performance improvements over existing (non-preemptive) scheduling methods are possible.
2007
We present buffered coscheduling, a new methodology to multitask parallel jobs in a message-passing environment and to develop parallel programs that can pave the way to the efficient implementation of a distributed operating system. Buffered coscheduling is based on three innovative techniques: communication buffering, strobing, and non-blocking communication. By leveraging these techniques, we can perform effective optimizations based on the global status of the parallel machine rather than on the limited knowledge available locally to each processor. The advantages of buffered coscheduling include higher resource utilization, reduced communication overhead, efficient implementation of flow-control strategies and fault-tolerant protocols, accurate performance modeling, and a simplified yet still expressive parallel programming model. Preliminary experimental results show that buffered coscheduling is very effective in increasing the overall performance in the presence of load imbalance and communication-intensive workloads.