A Novel Load-Balancing Algorithm for Distributed Systems

2013

Abstract

Distributed file systems are key building blocks for cloud computing applications based on the MapReduce programming paradigm. Balancing load among storage nodes is a critical function in clouds: in a load-balanced cloud, resources are well utilized and provisioned, which maximizes the performance of MapReduce-based applications. In such a distributed file system, the load of a node is typically proportional to the number of file chunks the node stores. This paper presents a fully distributed load-rebalancing algorithm to cope with the load imbalance problem. The proposed algorithm is compared against a centralized approach in a production system. It strives to balance the loads of nodes and to reduce the required movement cost as much as possible, while taking advantage of physical network locality and node heterogeneity.
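To make the load model concrete, the following is a minimal sketch of chunk rebalancing, not the algorithm proposed in the paper. It assumes a single coordinator with a global view (closer to the centralized baseline than to the fully distributed method), treats a node's load as its chunk count, counts movement cost as the number of chunks moved, and ignores network locality and node heterogeneity. The function name `rebalance` and the node names are hypothetical.

```python
from typing import Dict, List, Tuple

Move = Tuple[str, str, int]  # (source node, target node, chunks moved)


def rebalance(chunks: Dict[str, int]) -> Tuple[Dict[str, int], List[Move]]:
    """Greedy, centralized sketch (an assumption, not the paper's method).

    `chunks` maps a node name to the number of file chunks it stores.
    Returns the balanced chunk counts and the list of moves performed.
    """
    loads = dict(chunks)
    if not loads:
        return loads, []

    # Ideal load is the average; when it is fractional, `extra` nodes
    # are allowed to hold one chunk more than the others.
    base, extra = divmod(sum(loads.values()), len(loads))

    # Give the larger targets to the heaviest nodes so fewer chunks move.
    order = sorted(loads, key=lambda node: (-loads[node], node))
    target = {node: base + (1 if i < extra else 0) for i, node in enumerate(order)}

    surplus = [[n, loads[n] - target[n]] for n in order if loads[n] > target[n]]
    deficit = [[n, target[n] - loads[n]] for n in order if loads[n] < target[n]]

    moves: List[Move] = []
    i = j = 0
    while i < len(surplus) and j < len(deficit):
        # Move as many chunks as both sides of the pair can accommodate.
        amount = min(surplus[i][1], deficit[j][1])
        src, dst = surplus[i][0], deficit[j][0]
        moves.append((src, dst, amount))
        loads[src] -= amount
        loads[dst] += amount
        surplus[i][1] -= amount
        deficit[j][1] -= amount
        if surplus[i][1] == 0:
            i += 1
        if deficit[j][1] == 0:
            j += 1
    return loads, moves
```

Calling `rebalance({"a": 10, "b": 2, "c": 3})` leaves every node with five chunks after moving five chunks in total. A distributed variant would have to reach a similar result without any node knowing the global average exactly, which is the harder problem the paper addresses.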