Task Scheduling Algorithm with Fault Tolerance for Cloud

Abstract

Cloud computing is a paradigm for sharing data and computation over a scalable network of nodes such as end users, computers, data centers, and web services. Task scheduling is one of the best-known combinatorial optimization problems and plays a key role in improving the performance of flexible and reliable systems. Cloud-based application services such as social networking, web hosting, and content delivery process large amounts of data. These applications require substantial network bandwidth because traffic between nodes is heavy. Because network bandwidth is a limited resource, scheduling policies that reduce bandwidth usage are essential in cloud computing. Task scheduling algorithms based on data locality reduce network access, which in turn reduces bandwidth usage and job completion time. The Balance Reduce (BAR) algorithm is a heuristic based on data locality that minimizes the makespan (job completion time) of a job. This paper proposes an improved balance reduce algorithm, an enhancement of BAR that handles machine failure, using a scheme similar to the primary-backup approach. Compared with the existing BAR algorithm, the proposed algorithm reduces job completion time when failures occur.
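The two ideas the abstract combines, data-locality-aware placement and a primary-backup fallback, can be illustrated with a small sketch. This is not the BAR algorithm nor the improved algorithm proposed in the paper; it is a simple greedy stand-in. The names `Task`, `schedule`, and `fail_over`, and the local and remote costs of 1.0 and 3.0, are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    replicas: frozenset       # nodes that hold this task's input data
    local_cost: float = 1.0   # assumed run time when the data is local
    remote_cost: float = 3.0  # assumed run time when data must be fetched

    def cost_on(self, node):
        return self.local_cost if node in self.replicas else self.remote_cost


def schedule(tasks, nodes):
    """Greedy placement: each task goes to the node where it would finish
    earliest; the runner-up node is recorded as its backup.
    Requires at least two nodes."""
    load = {n: 0.0 for n in nodes}
    plan = {}
    for t in tasks:
        # Rank nodes by projected finish time, breaking ties by node name.
        ranked = sorted(nodes, key=lambda n: (load[n] + t.cost_on(n), n))
        primary, backup = ranked[0], ranked[1]
        load[primary] += t.cost_on(primary)
        plan[t.name] = (primary, backup)
    return plan, load


def fail_over(tasks, plan, load, failed):
    """Re-run every task whose primary node failed on its backup node.
    Returns the per-node load of the surviving nodes."""
    new_load = {n: l for n, l in load.items() if n != failed}
    for t in tasks:
        primary, backup = plan[t.name]
        if primary == failed:
            new_load[backup] += t.cost_on(backup)
    return new_load


def makespan(load):
    """Job completion time: the finish time of the busiest node."""
    return max(load.values())
```

With three nodes and four tasks whose data lives on `n1`, `n1`, `n2`, and `n3`, every task is placed on a node holding its data and the makespan is 2.0. If `n1` then fails, its two tasks run remotely on their backup node and the makespan grows to 7.0, which shows why the choice of backup matters for completion time under failure.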