2003, Tdx
…
107 pages
1 file
to their collaboration and support, the completion of this thesis has been possible. And, of course, to my whole family: to my parents, my siblings, my wife, Montse, and the little ones, Sara and Roger, who have understood at every moment the dedication that carrying out this project has demanded of me.
Lecture Notes in Computer Science, 2003
Our research is focussed on keeping both local and parallel jobs together in a time-sharing NOW. In such systems, new scheduling approaches are needed to allocate the underloaded resources among parallel jobs. In this paper, a distributed system, named Cooperating CoScheduling, is presented to assign CPU and memory resources efficiently by applying resource balancing and dynamic coscheduling between parallel jobs. This is shown experimentally in a PVM-Linux NOW.
2004
In this paper, we conduct an in-depth evaluation of a broad spectrum of scheduling alternatives for clusters. These include the widely used batch scheduling, local scheduling, gang scheduling, all prior communication-driven coscheduling algorithms (Dynamic Coscheduling (DCS), Spin Block (SB), Periodic Boost (PB), and Co-ordinated Coscheduling (CC)), and a newly proposed HYBRID coscheduling algorithm, all evaluated on a 16-node, Myrinet-connected Linux cluster. Performance and energy measurements using several NAS, LLNL and ANL benchmarks on the Linux cluster lead to several conclusions. First, although batch scheduling is currently used in most clusters, gang scheduling and all blocking-based coscheduling techniques, such as SB, CC and HYBRID, can provide much better performance even on a dedicated cluster platform. Second, in contrast to some of the prior studies, we observe that blocking-based schemes like SB and HYBRID can provide better performance than spin-based techniques like PB on a Linux platform. Third, the proposed HYBRID scheduling provides the best performance-energy behavior and can be implemented on any cluster with little effort. All these results suggest that blocking-based coscheduling techniques are viable candidates for use in clusters, with significant performance-energy benefits.
Our efforts are directed towards the understanding of the coscheduling mechanism in a NOW system when a parallel job is executed jointly with local workloads, balancing parallel performance against the local interactive response. Explicit and implicit coscheduling techniques in a PVM-Linux NOW (or cluster) have been implemented.
2001
Our research is focussed on keeping both local and parallel jobs together in a non-dedicated cluster or NOW (Network Of Workstations) and efficiently scheduling them by means of coscheduling mechanisms. A real implementation of a predictive coscheduling technique in a Linux cluster is presented in this article and its performance analyzed and compared with other coscheduling algorithms in the literature.
2009
A cluster is a collection of off-the-shelf commodity computers and resources that are integrated through hardware, networks and software so that they behave as a single computer. In parallel applications, some processes need to execute simultaneously. Because some processes communicate with one another, we cannot assume that all processes are independent, and many of them need to be coscheduled with each other. Various types of coscheduling are available. This paper focuses mainly on bandwidth and memory. It argues for efficient utilization of cluster resources under parallel job execution, using the bandwidth-aware coscheduling concept put forth here.
Big Data Management and Processing, 2017
IEEE Transactions on Parallel and Distributed Systems, 2000
Scheduling of processes onto processors of a parallel machine has always been an important and challenging area of research. The issue becomes even more crucial and difficult as we gradually progress to the use of off-the-shelf workstations, operating systems, and high bandwidth networks to build cost-effective clusters for demanding applications. Clusters are gaining acceptance not just in scientific applications that need supercomputing power, but also in domains such as databases, web service and multimedia, which place diverse Quality-of-Service (QoS) demands on the underlying system. Further, these applications have diverse characteristics in terms of their computation, communication and I/O requirements, making conventional parallel scheduling solutions, such as space sharing or coscheduling, unattractive. At the same time, leaving it to the native operating system of each node to make decisions independently can lead to ineffective use of system resources whenever there is communication. Instead, an emerging class of dynamic coscheduling mechanisms that attempt to take remedial actions to guide the system towards coscheduled execution without requiring explicit synchronization offers a lot of promise for cluster scheduling.
Proceedings of the 17th …, 2005
Scientific investigations have to deal with rapidly growing amounts of data from simulations and experiments. During data analysis, scientists typically want to extract subsets of the data and perform computations on them. In order to speed up the analysis, computations are performed on distributed systems such as computer clusters or Grid systems. A well-known difficult problem is to build systems that execute the computations and data movement in a coordinated fashion. In this paper, we describe an architecture for executing co-scheduled tasks of computation and data movement on a computer cluster that takes advantage of two technologies currently being used in distributed Grid systems. The first is Condor, which manages the scheduling and execution of distributed computation, and the second is Storage Resource Managers (SRMs), which manage the space usage and content of storage systems. Co-scheduling is achieved by including the information about the availability of files on the nodes, provided by SRMs, in the advertised information that Condor uses for the purpose of matchmaking. The system is capable of dynamic load balancing by replicating popular files on idle nodes. To confirm the feasibility of our approach, a prototype system was built on a computer cluster. Several experiments based on real work logs were performed. We observed that without replication compute nodes are underutilized and job wait times in the scheduler's queue are longer. This architecture can be used in wide-area Grid systems since the basic components are already used for the Grid.
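The matchmaking idea described in this abstract can be illustrated with a small sketch. It does not use Condor's advertisement mechanism or any SRM interface; the `Node` class, the `match_job` and `replicate_popular` functions, and the `threshold` parameter are all hypothetical names, chosen only to show how advertised file availability could steer job placement and how popular files could be copied to idle nodes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    """Hypothetical node advertisement: name, busy flag, and locally held files."""
    name: str
    busy: bool = False
    files: set = field(default_factory=set)


def match_job(required_file: str, nodes: list) -> Optional[Node]:
    """Return the first idle node that already holds the file, or None."""
    for node in nodes:
        if not node.busy and required_file in node.files:
            return node
    return None


def replicate_popular(request_counts: dict, nodes: list, threshold: int) -> list:
    """Copy files requested at least `threshold` times to idle nodes lacking them.

    Returns the (file name, node name) pairs that were replicated.
    The popularity rule is an assumption made for this sketch.
    """
    copies = []
    for file_name, count in sorted(request_counts.items()):
        if count < threshold:
            continue
        for node in nodes:
            if not node.busy and file_name not in node.files:
                node.files.add(file_name)
                copies.append((file_name, node.name))
    return copies
```

In this toy model, a job whose input file lives only on a busy node finds no match until replication places a copy on an idle node, which mirrors the abstract's observation that without replication compute nodes are underutilized and jobs wait longer.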
… on Computer Systems …, 2010
In this paper, we present an approach to scalable coscheduling in distributed computing for complex sets of interrelated tasks (jobs). Scalability here means that schedules are formed for job models with various levels of task granularity and various data replication policies, and that processor and memory resources can be upgraded. Guaranteeing job execution at the required quality of service makes it necessary to take into account the dynamics of the distributed environment, namely, changes in the number of jobs to be serviced, volumes of computation, possible failures of processor nodes, etc. As a consequence, in the general case, a set of scheduling versions, or a strategy, is required instead of a single version. We propose a scalable model of scheduling based on multicriteria strategies. The choice of the specific schedule depends on the load level of the resource dynamics and is formed as a resource query which is sent to a local batch-job management system.
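The notion of a strategy as a set of schedule versions, one of which is chosen according to the load level, might be sketched as follows. The thresholds, the version names, and the `choose_schedule` function are assumptions made for illustration; the abstract does not specify how load levels map to schedule versions.

```python
from bisect import bisect_right

# Hypothetical strategy: (minimum load level, schedule version name),
# sorted by load level. Both the levels and the names are illustrative.
STRATEGY = [
    (0.0, "version_low_load"),
    (0.5, "version_medium_load"),
    (0.8, "version_high_load"),
]


def choose_schedule(load: float, strategy=STRATEGY) -> str:
    """Pick the schedule version whose load threshold is the highest one not above `load`."""
    if not 0.0 <= load <= 1.0:
        raise ValueError("load must be between 0.0 and 1.0")
    thresholds = [level for level, _ in strategy]
    index = bisect_right(thresholds, load) - 1
    return strategy[index][1]
```

Keeping the versions in a sorted table means that adding a version for a new load range only requires a new entry, not a change to the selection logic.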
Job Scheduling Strategies for …, 2001
Job Scheduling Strategies for Parallel Processing, 2001
ACM SIGMETRICS Performance Evaluation Review, 2002
2011 40th International Conference on Parallel Processing Workshops, 2011
Parallel Algorithms and Applications, 2001
Proceedings 13th International Parallel Processing Symposium and 10th Symposium on Parallel and Distributed Processing. IPPS/SPDP 1999, 1999
Computing Research Repository, 2000
Journal of Systems Architecture, 1998