2001, Communications of the ACM
The computational science community has long been at the forefront of advanced computing because of its persistent need to solve problems that require resources beyond those provided by the most powerful computers of the day. Examples of such high-end applications range from financial modeling and vehicle simulation to computational genetics and weather forecasting. Over the years, such considerations have led computational scientists to become aggressive and innovative adopters of vector computers, parallel systems, clusters, and other novel computing technologies. Recently, the widespread availability of high-speed networks and a growing awareness of the new problem-solving modalities made possible when these networks are used to couple geographically distributed resources have stimulated interest in so-called Grid computing. The term "the Grid" refers to an emerging persistent infrastructure providing security, resource access, information, and other services that enable controlled and coordinated resource sharing among "virtual organizations" formed dynamically from individuals and organizations sharing a common interest. A variety of ambitious projects are applying Grid computing concepts to such challenging problems as the distributed analysis of experimental physics data, community access to earthquake engineering facilities, and the creation of "science portals": thin clients providing remote access to the collection of information sources and simulation systems supporting a particular scientific discipline. Underpinning both parallel and Grid computing is a common need for coordination and communication mechanisms that allow multiple resources to be applied in a concerted fashion to complex problems. Scientific and engineering applications have, for the most part, addressed this requirement in an ad hoc and low-level fashion, using specialized message-passing libraries within parallel computers and various communication mechanisms among networked computers. While low-level approaches have allowed users to meet performance goals, an unfortunate consequence is that the computational science community has not benefited to any great extent from the significant advances in software engineering that have occurred during the past ten years in industry. In particular, the various benefits of Java, which seems ideal for multiparadigm communication environments, are hardly exploited at all. Java's platform-independent bytecode can be executed securely on many platforms, making Java an attractive basis for portable Grid computing. In addition, Java's performance on sequential codes, which is a strong prerequisite for the development of such "Grande" applications (see sidebar on Java Grande), has increased substantially over the past years. Inspired originally by coffee house jargon, the buzzword "Grande" has become commonplace to distinguish this emerging type of high-end application when written in Java. Furthermore, Java provides a sophisticated graphical user interface framework, as well as a paradigm to
Proceedings of the 2003 ACM symposium on Applied computing - SAC '03, 2003
A prototype Taskspaces framework for grid computing of scientific computing problems that require intertask communication is presented. The Taskspaces framework is characterized by three major design choices: decentralization provided by an underlying tuple space concept, enhanced direct communication between tasks by means of a communication tuple space distributed over the worker hosts, and object orientation and platform independence realized by implementation in Java. Grid administration tasks, for example, resetting worker nodes, are performed by mobile agent objects. We report on large-scale grid computing experiments for iterative linear algebra applications showing that our prototype framework scales well for scientific computing problems that require neighbor-neighbor intertask communication. It is shown in a computational fluid dynamics simulation using a Lattice Boltzmann method that the Taskspaces framework can be used naturally in interactive collaboration mode. The scalable Taskspaces framework runs fully transparently on heterogeneous grids while maintaining a low complexity in terms of installation, maintenance, application programming and grid operation. It thus offers a promising path toward pushing scientific grid computing with intertask communication beyond the experimental research setting.
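To make the tuple-space-style coordination described above more concrete, the sketch below shows a neighbor-exchange pattern against a minimal tuple space. The TupleSpace interface, the in-memory LocalTupleSpace, and the tag naming scheme are illustrative assumptions for this sketch, not the actual Taskspaces API.

```java
// Minimal in-memory sketch of tuple-space coordination between tasks.
// TupleSpace, LocalTupleSpace and the tag scheme are assumptions; the real
// Taskspaces communication space is distributed over the worker hosts.
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

interface TupleSpace {
    void write(String tag, Object value);                   // deposit a tuple
    Object take(String tag) throws InterruptedException;    // blocking removal of a matching tuple
}

class LocalTupleSpace implements TupleSpace {
    private final Map<String, BlockingQueue<Object>> bags = new ConcurrentHashMap<>();

    private BlockingQueue<Object> bag(String tag) {
        return bags.computeIfAbsent(tag, t -> new LinkedBlockingQueue<>());
    }
    public void write(String tag, Object value) { bag(tag).add(value); }
    public Object take(String tag) throws InterruptedException { return bag(tag).take(); }
}

class WorkerTask implements Runnable {
    private final TupleSpace comm;            // communication space shared with neighbors
    private final int rank;
    private final double[] boundary = { 0.0 };

    WorkerTask(TupleSpace comm, int rank) { this.comm = comm; this.rank = rank; }

    public void run() {
        try {
            // Neighbor-neighbor exchange: publish our boundary, take the right neighbor's.
            // (A real code would also handle the last rank, which has no right neighbor.)
            comm.write("boundary-from-" + rank, boundary);
            double[] received = (double[]) comm.take("boundary-from-" + (rank + 1));
            // ... the local iteration step using 'received' would follow here ...
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```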
Parallel Algorithms and Applications, 2004
The paper begins by considering what a Grid Computing Environment might be, why it is demanded, and how the authors' HPspmd programming fits into this picture. We then review our HPJava environment as a contribution towards programming support for High-Performance Grid-Enabled Environments. Future grid computing systems will need to provide suitable programming models; in a proper programming model for grid-enabled environments and applications, high performance on multiprocessor systems is a critical issue. We describe the features of HPJava, including its run-time communication library, compilation strategies, and optimization schemes. Through experiments, we compare HPJava programs against FORTRAN and ordinary Java programs. We aim to demonstrate that HPJava can be used "anywhere": not only for high-performance parallel computing, but also for grid-enabled applications.
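The core of the HPspmd model is block-distributed data owned piecewise by the participating processes. The following plain-Java sketch shows only the underlying block decomposition arithmetic; HPJava itself expresses this with extended syntax (distributed arrays, ranges, and overall loops), which is not reproduced here, and the class and method names are illustrative.

```java
// Plain-Java sketch of the block decomposition behind HPF/HPJava-style
// distributed ranges. This is an illustration, not HPJava syntax.
public class BlockRangeDemo {
    /** Lower bound (inclusive) of process p's block of a range of size n over P processes. */
    static int blockLo(int n, int P, int p) {
        int base = n / P, rem = n % P;
        return p * base + Math.min(p, rem);   // the first 'rem' processes get one extra element
    }
    /** Upper bound (exclusive) of process p's block. */
    static int blockHi(int n, int P, int p) {
        return blockLo(n, P, p + 1);
    }

    public static void main(String[] args) {
        int n = 10, P = 4;
        for (int p = 0; p < P; p++) {
            System.out.printf("process %d owns [%d, %d)%n", p, blockLo(n, P, p), blockHi(n, P, p));
        }
        // Prints blocks [0,3), [3,6), [6,8), [8,10): sizes differ by at most one element.
    }
}
```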
1998
Java may be a natural language for portable parallel programming. We discuss the basis of this claim in general terms, then illustrate the use of Java for message-passing and data-parallel programming through a series of case studies. In the process we introduce some proposals for a Java binding of MPI, and describe the use of a Java class library to implement HPF-style distributed data. Prospects for future Java-based parallel programming environments are discussed.
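The sketch below gives the flavor of the Java MPI bindings mentioned above, written in the style of the mpiJava proposals (package mpi, capitalized method names). Exact signatures varied between binding proposals, so this should be read as an approximation rather than a definitive API reference.

```java
// Minimal point-to-point exchange in the style of an mpiJava-like binding.
// Method names and signatures are an approximation of the proposed bindings.
import mpi.MPI;

public class PingPong {
    public static void main(String[] args) throws Exception {
        MPI.Init(args);
        int rank = MPI.COMM_WORLD.Rank();
        int[] buf = new int[1];

        if (rank == 0) {
            buf[0] = 42;
            MPI.COMM_WORLD.Send(buf, 0, 1, MPI.INT, 1, 99);   // send one int to rank 1, tag 99
        } else if (rank == 1) {
            MPI.COMM_WORLD.Recv(buf, 0, 1, MPI.INT, 0, 99);   // receive from rank 0
            System.out.println("rank 1 received " + buf[0]);
        }
        MPI.Finalize();
    }
}
```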
Future Generation Computer Systems, 2005
This paper presents JoiN, a Java-based software platform to construct massively parallel grids capable of executing large parallel applications. The system is designed to be scalable by allowing computers in the grid to be separated in independent sets (called groups) which are managed independently and collaborate using a logical interconnection topology. JoiN provides advanced fault tolerance capabilities that allow it to withstand failures both in computers executing parallel tasks and in computers managing the groups. The parallel applications executing in the system are formally specified using a rigorously defined application model. JoiN uses a dynamic, flexible scheduling algorithm that adapts to changes in resource availability and replicates parallel tasks for fault tolerance. The platform provides an authentication/access control mechanism based on roles which is embedded in the inner parts of the system. The software architecture is based on the concept of services, which are independent pieces of software that can be combined in several ways, providing the flexibility needed to adapt to particular environments. JoiN has been successfully used to implement and execute several parallel applications, such as DNA sequencing, Monte Carlo simulations and a version of the Traveling Salesman Problem.
2000
The aim of the Albatross project is to study applications and programming environments for computational grids consisting of multiple clusters that are connected by wide-area networks. Parallel processing on such systems is useful but challenging, given the large differences in latency and bandwidth between LANs and WANs. We provide efficient algorithms and programming environments that exploit the hierarchical structure of wide-area clusters to minimize communication over the WANs. In addition, we use highly efficient local-area communication protocols. We illustrate this approach using the Manta high-performance Java system and the MagPIe MPI library, both of which are implemented on a collection of four Myrinet-based clusters connected by wide-area ATM networks. Our sample applications obtain high speedups on this wide-area system.
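The central algorithmic idea, reducing WAN traffic by exploiting the cluster hierarchy, can be sketched as a two-level reduction in the spirit of MagPIe. The Comm interface below is a hypothetical placeholder for the real communication layer (Manta/MagPIe details are not reproduced); the point is the message pattern, one cheap LAN message per worker and one WAN message per cluster.

```java
// Schematic wide-area-aware sum reduction: reduce inside each cluster over the
// fast LAN first, then combine the per-cluster partial sums over the WAN at a
// single global root. Comm is a hypothetical stand-in for the communication layer.
interface Comm {
    void send(int destRank, double value);   // assumed blocking send
    double recv(int srcRank);                // assumed blocking receive
}

public class WideAreaReduce {
    static double reduceSum(Comm comm, int rank, int clusterRoot, int globalRoot,
                            int[] clusterRanks, int[] clusterRoots, double local) {
        if (rank != clusterRoot) {
            comm.send(clusterRoot, local);                        // cheap LAN message to the coordinator
            return local;                                         // non-coordinators keep only their partial value
        }
        double clusterSum = local;
        for (int r : clusterRanks)
            if (r != clusterRoot) clusterSum += comm.recv(r);     // gather within the cluster (LAN)
        if (clusterRoot != globalRoot) {
            comm.send(globalRoot, clusterSum);                    // exactly one WAN message per cluster
            return clusterSum;
        }
        double globalSum = clusterSum;
        for (int root : clusterRoots)
            if (root != globalRoot) globalSum += comm.recv(root); // combine cluster sums over the WAN
        return globalSum;
    }
}
```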
The emergence of high-speed wide-area networks makes grid computing a reality. However, grid applications that need reliable data transfer still have difficulty achieving optimal TCP performance, because the TCP window size must be tuned to the bandwidth and latency of a high-speed wide-area network. This paper presents a Java package called JPARSS (Java Parallel Secure Stream (Socket)) that divides data into partitions sent simultaneously over several parallel Java streams, allowing Java or Web applications to achieve optimal TCP performance in a grid environment without the need to tune the TCP window size. The package enables single sign-on, certificate delegation, and secure or plaintext data transfer using several security components based on X.509 certificates and SSL. Several experiments are presented to show that using parallel Java streams is more effective than tuning the TCP window size. In addition, a simple architecture using Web services to facilitate peer-to-peer and third-party file transfer is presented.
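The parallel-stream idea can be illustrated with plain Java sockets: split the buffer into partitions and push each partition over its own connection in a separate thread. This is a simplified sketch of the technique, not the JPARSS API; the offset/length framing, the port list, and the omission of the X.509/SSL security layer are all simplifying assumptions.

```java
// Simplified illustration of sending a buffer as partitions over several
// parallel sockets. Not the JPARSS API; security and error recovery omitted.
import java.io.DataOutputStream;
import java.net.Socket;

public class ParallelSend {
    public static void send(byte[] data, String host, int[] ports) throws InterruptedException {
        int n = ports.length;
        int chunk = (data.length + n - 1) / n;                 // partition size (last ones may be shorter)
        Thread[] senders = new Thread[n];

        for (int i = 0; i < n; i++) {
            final int off = i * chunk;
            final int len = Math.max(0, Math.min(chunk, data.length - off));
            final int port = ports[i];
            senders[i] = new Thread(() -> {
                try (Socket s = new Socket(host, port);
                     DataOutputStream out = new DataOutputStream(s.getOutputStream())) {
                    out.writeInt(off);                         // tell the receiver where this partition belongs
                    out.writeInt(len);
                    if (len > 0) out.write(data, off, len);    // stream this partition
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
            senders[i].start();
        }
        for (Thread t : senders) t.join();                     // wait until all partitions are delivered
    }
}
```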
Concurrency and Computation: Practice and Experience, 2005
In computational grids, performance-hungry applications need to simultaneously tap the computational power of multiple, dynamically available sites. The crux of designing grid programming environments stems exactly from the dynamic availability of compute cycles: grid programming environments (a) need to be portable to run on as many sites as possible, (b) they need to be flexible to cope with different network protocols and dynamically changing groups of compute nodes, while (c) they need to provide efficient (local) communication that enables high-performance computing in the first place.
Future Generation Computer Systems, 2002
The aim of the Albatross project is to study applications and programming environments for computational Grids. We focus on high-performance applications, running in parallel on multiple clusters or MPPs that are connected by wide-area networks (WANs). We briefly present three Grid programming environments developed in the context of the Albatross project: the MagPIe library for collective communication with MPI, the replicated method invocation (RepMI) mechanism for Java, and the Java-based Satin system for running divide-and-conquer programs on Grid platforms.
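Satin targets divide-and-conquer programs expressed with spawn/sync primitives. As a rough plain-Java analogue of that programming style, the sketch below uses the standard fork/join framework; Satin's actual annotations, load balancing, and wide-area scheduling are not reproduced.

```java
// Plain-Java analogue of a Satin-style divide-and-conquer program, using the
// standard fork/join framework instead of Satin's spawn/sync primitives.
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.RecursiveTask;

public class Fib extends RecursiveTask<Long> {
    private final int n;
    Fib(int n) { this.n = n; }

    @Override
    protected Long compute() {
        if (n < 2) return (long) n;               // base case computed locally
        Fib left = new Fib(n - 1);
        left.fork();                              // "spawn": may run on another worker
        long right = new Fib(n - 2).compute();    // keep one half on this worker
        return left.join() + right;               // "sync": wait for the spawned half
    }

    public static void main(String[] args) {
        long result = new ForkJoinPool().invoke(new Fib(30));
        System.out.println("fib(30) = " + result);
    }
}
```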
2008
The int.eu.grid project aims at providing a production-quality grid computing infrastructure for e-Science supporting parallel and interactive applications. The infrastructure capacity is presently about 750 CPU cores distributed over twelve sites in seven countries. These resources have to be tightly coordinated to match the requirements of parallel computing. Such an infrastructure implies high availability, performance and robustness, resulting in a much larger management effort than in traditional grid environments, which are usually targeted at running sequential non-interactive applications. To achieve these goals the int.eu.grid project offers advanced brokering mechanisms and user-friendly graphical interfaces supporting application steering. The int.eu.grid environment is deployed on top of the gLite middleware, enabling full interoperability with existing gLite-based infrastructures.
Java-based tuplespaces provide a new avenue of exploration for distributed computing with commodity technology. This paper presents early results from our investigation of JavaSpaces for scientific computation. We discuss weaknesses as revealed by our attempts to map existing parallel algorithms to the JavaSpaces model, and use low-level metrics to argue that several classes of problems are not efficiently solvable in JavaSpaces, notably some that are demonstrably efficient with other tuplespace implementations. Nonetheless we argue that by addressing, off-the-shelf, numerous issues with distributed systems, JavaSpaces remains enticing and may merit use for high performance computation if viewed less strictly in the heritage of Linda and more as a platform-neutral code delivery mechanism for SPMD computing. To support this perspective we demonstrate parametric algorithms for which a JavaSpaces solution provides good speedups, relative not only to Java but also sequential C, and outline a dyna...
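The parametric (master/worker) pattern for which the paper reports good speedups can be sketched against the JavaSpaces API itself. The TaskEntry/ResultEntry classes and the stand-in computation below are illustrative; obtaining the JavaSpace reference via Jini lookup, transactions, and lease management are elided for brevity.

```java
// Minimal master/worker sketch against the JavaSpaces API (net.jini.space.JavaSpace).
// Space discovery, transactions and failure handling are omitted; the entry
// classes and the computation are placeholders for a real parametric problem.
import net.jini.core.entry.Entry;
import net.jini.core.lease.Lease;
import net.jini.space.JavaSpace;

// In a real deployment each Entry would be a public class with public object
// fields and a public no-argument constructor, as JavaSpaces matching requires.
class TaskEntry implements Entry {
    public Integer id;
    public Double parameter;
    public TaskEntry() {}
    public TaskEntry(Integer id, Double parameter) { this.id = id; this.parameter = parameter; }
}

class ResultEntry implements Entry {
    public Integer id;
    public Double value;
    public ResultEntry() {}
    public ResultEntry(Integer id, Double value) { this.id = id; this.value = value; }
}

public class ParametricWorker {
    public static void run(JavaSpace space) throws Exception {
        TaskEntry template = new TaskEntry();                     // null fields match any task tuple
        while (true) {
            TaskEntry task = (TaskEntry) space.take(template, null, Long.MAX_VALUE);
            double value = Math.sin(task.parameter);              // stand-in for the real computation
            space.write(new ResultEntry(task.id, value), null, Lease.FOREVER);
        }
    }
}
```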
Future Generation Computer Systems, 2002
As the practice of science moves beyond the single investigator, driven by the complexity of the problems that now dominate science, large collaborative and multi-institutional teams are needed to address them.
2001
\Grande" applications are those with demanding CPU and I/O requirements. They originate in many disciplines, such as astrophysics, materials science, weather prediction nancial modeling, and data mining. Java has many features of interest to developers of such applications. At the same time, there are currently many barriers to the e ective use of Java i n t h i s w ay. The Java Grande Forum is a group of researchers and software developers from industry, academia, and government w i t h a n i n terest in the use of the Java programming language and environment for grande applications. The Forum seeks to increase awareness of issues important to this community, and to work towards their solution. In this article we describe the workings of the Java Grande Forum and the major issues that it has brought to the forefront. We o u tline approaches to the solutions of these problems, and describe e orts to standardize them within the larger Java community. Among the issues addressed are: oating-point performance, multidimensional arrays, complex arithmetic, fast object serializations, and high-speed remote method invocation (RMI). Java programmers do not need to explicitly deallocate blocks of memory. Instead, Java m a i n tains a garbage collector which automatically recovers unused storage.
Lecture Notes in Computer Science, 2005
The use of parallel computing and distributed information services is spreading quite rapidly, as today's difficult problems in science, engineering and industry far exceed the capabilities of the desktop PC and department file server. The availability of commodity parallel computers, ubiquitous networks, maturing Grid middleware, and portal frameworks is fostering the development and deployment of large scale simulation and data analysis solutions in many areas. This topic highlights recent progress in applications of high performance parallel and Grid computing, with an emphasis on successes, advances, and lessons learned in the development and implementation of novel scientific, engineering and industrial applications. Today's large computational solutions often operate in complex information and computation environments where efficient data access and management can be as important as computational methods and performance, so the technical approaches in this topic span high performance parallel computing, Grid computation and data access, and the associated problem-solving environments that compose and manage advanced solutions. This year the 23 papers submitted to this topic area showed a wide range of activity in high performance parallel and distributed computing, with the largest subset relating to genome sequence analysis. Nine papers were accepted as full papers for the conference, organized into three sessions. One session focuses on high performance genome sequence comparison. The second and third sessions present advanced approaches to scalable simulations, including some non-traditional arenas for high performance computing. Overall, they underscore the close relationship between advances in computer science, computational science, and applied mathematics in developing scalable applications for parallel and distributed systems.
Concurrency: Practice and Experience, 1997
Java offers the basic infrastructure needed to integrate computers connected to the Internet into a seamless parallel computational resource: a flexible, easily installed infrastructure for running coarse-grained parallel applications on numerous, anonymous machines. Ease of participation is seen as a key property for such a resource to realize the vision of a multiprocessing environment comprising thousands of computers. We present Javelin, a Java-based infrastructure for global computing. The system is based on Internet software technology that is essentially ubiquitous: Web technology. Its architecture and implementation require participants to have access only to a Java-enabled Web browser. The security constraints implied by this, the resulting architecture, and the current implementation are presented. The Javelin architecture is intended to be a substrate on which various programming models may be implemented. Several such models are presented: a Linda tuple space, an SPMD programming model with barriers, and support for message passing. Experimental results are given in the form of micro-benchmarks and a Mersenne Prime application that runs on a heterogeneous network of several parallel machines, workstations, and PCs.
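The SPMD-with-barriers model mentioned above can be illustrated within a single JVM using the standard CyclicBarrier class, as in the sketch below. Javelin itself coordinates browser-hosted workers across the network, which is not shown here; the thread-based setup is purely an analogy for the programming model.

```java
// Single-JVM illustration of the SPMD-with-barriers programming model,
// using java.util.concurrent.CyclicBarrier. Not Javelin's distributed mechanism.
import java.util.concurrent.CyclicBarrier;

public class SpmdBarrierDemo {
    public static void main(String[] args) {
        final int P = 4;
        final double[] partial = new double[P];
        final CyclicBarrier barrier = new CyclicBarrier(P);

        for (int p = 0; p < P; p++) {
            final int rank = p;
            new Thread(() -> {
                try {
                    partial[rank] = rank * rank;      // phase 1: each worker computes its share
                    barrier.await();                  // all workers synchronise at the barrier
                    if (rank == 0) {                  // phase 2: one worker combines the results
                        double sum = 0;
                        for (double v : partial) sum += v;
                        System.out.println("sum = " + sum);
                    }
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }).start();
        }
    }
}
```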
2003
A computational and data grid developed at the Center for Computational Research in Buffalo, New York, will provide a heterogeneous platform to enable scientific and engineering applications to run in a Buffalo-centric grid-based setting. A proof-of-concept heterogeneous grid has been developed using a critical scientific application in the field of structural biology. The design and functionality of the prototype grid web portal is described, along with plans for a production-level grid system based on Globus. Several projects covering a collaborative expansion of this system are also summarized with respect to the core research being investigated. This expansion involves researchers located across the United States who are interested in analyzing and grid-enabling existing software applications and grid technology. With increased memory and disk storage devices, a number of parallel languages, including P4, Express, and LINDA, began to appear [1,7]. See Table 1. During the early 1990s, desktop workstations began to be incorporated into distributed computing systems. Further, the capabilities of CPUs, memory, and disk storage increased rapidly during this period. Asynchronous Transfer Mode (ATM), used for Wide Area Networks (WAN), allowed networks to efficiently carry services of the future. Network and computer performance increased by 1000 times, and standards such as the Message Passing Interface (MPI), High Performance Fortran (HPF), and the Distributed Computing Environment (DCE) began to emerge [8]. PVM. Parallel Virtual Machine (PVM) is a software package that permits a heterogeneous collection of Unix and/or Windows computers connected together by a network to be used as a single parallel computer. The first version of PVM was written during the summer of 1989 at Oak Ridge National Laboratory. This initial version of PVM was used internally and not released publicly. Based on the internal success of PVM, Version 2 of the code was redesigned and written from scratch during February 1991 at the lab's sister institution, the University of Tennessee, Knoxville. Version 2 of the code was publicly released in March of 1991. This version was intended to clean up and stabilize the system so that external users could reap the benefits of this parallel computing middleware. PVM Version 3 was redesigned from scratch, and a complete rewrite started in September 1992, with the first release of the software in March 1993. While similar in spirit to version 2, version 3 includes features that did not fit the old framework, including fault tolerance, better portability and scalability. Three subsequent versions of PVM were released over the next 9 years. The current version of the system is PVM Version 3.4.4, which was released in September of 2001. Concurrent development of XPVM provided a graphical interface to the PVM console commands and information, along with several animated views to monitor the execution of PVM programs. These views provide information about the interactions among tasks in a PVM program in order to assist in debugging and performance tuning. XPVM Version 1.0 was released in November of 1996 and the latest XPVM version is 1.2.5, released April 1998 [2]. MPI. The specification of the Message Passing Interface (MPI) standard 1.0 [3] was completed in April of 1994.
This was the result of a community effort to try to define both the syntax and semantics of a core message-passing library that would be useful to a wide range of users and implemented on a wide range of Massively Parallel Processor (MPP) platforms. Clarifications were made and released in June 1995. The major goals were portability, high performance, and "common practice"; the standard's features include a process model, point-to-point communication, collective operations, and mechanisms for writing safe libraries. All major computer vendors supported the MPI standard, and work began on MPI-2, which added dynamic process management, one-sided communication, cooperative I/O, C++ bindings, Fortran 90 additions, extended collective operations, and miscellaneous other functionality to the MPI-1 standard [4]. MPI-1.2 and MPI-2 were released at the same time in July of 1997. The main advantage of establishing a message-passing standard is portability. One of the goals of developing MPI is to provide MPP vendors with a clearly defined base set of routines that they can implement efficiently or, in some cases, for which they can improve scalability by providing hardware support. Local Area Multi-computer (LAM) development followed as an MPI programming environment and development system for heterogeneous computers on a network. LAM-MPI 6.1 was released in June 1998, and further development of a graphical user interface continued as XMPI 1.0 was released in January 1999. With LAM-MPI, a dedicated cluster or an existing network computing infrastructure can act as one parallel computer solving one problem and be monitored with a graphical user interface [5]. Development of Grid Computing. Integrating computational resources into parallel and distributed systems has become common practice since the early 1990s. A (computational) grid can be defined as a computing system in which computational resources, including computing, storage, databases, devices, sensors, and tools, are organized into a cohesive distributed system that spans multiple geographic and administrative domains. In this section, we provide a very brief history of grid computing, focusing on the capabilities of several toolkits and software packages that are critical to the Center for Computational Research Grid (CCR-Grid) that is the subject of this paper. In order to provide a framework for the discussion that follows in this section, it is important to know that the CCR-Grid system will leverage many of the communication, authentication, concurrency, security, system monitoring, and error-handling capabilities available in Globus, a critical public-domain grid software package that has become the de facto standard in academic (and many industrial) settings. Globus. The Globus project was established in 1996. It focuses on enabling the application of various grid concepts, predominantly in the domain of computational science and engineering. The Globus project is centered at Argonne National Laboratory,
In this paper we present the Java Package for Distributed Computing (JPDC), a toolkit for implementing and testing distributed algorithms in Java. JPDC's goal is to simplify the development of distributed algorithms by defining a high-level programming interface. The interface is very close to the pseudo-code formalism commonly used to describe algorithms and, at the same time, allows implementation and deployment in a truly distributed setting. Moreover, JPDC also provides a friendly interface that can be used both to visualize the algorithm's behavior and to interact with it. This is especially useful for teaching environments, where complex algorithms can be much better debugged, understood and validated by implementing and running them.
2006
Special purpose high performance computers are expensive and rare, but workstation clusters are cheap and becoming common. Emerging technology offers the opportunity to integrate clusters into a single high performance computer: a computational Grid. The acceptance of computational Grids, however, is seriously hampered by the difficulty of efficiently managing the parallelism in such heterogeneous clusters, whose characteristics are radically different from a conventional high performance computer. To program this complex and dynamic architecture effectively we propose to use a language with high-level constructs, GpH, and to extend its runtime environment, GUM. The first contribution of this thesis is to develop GRID-GUM1, an initial port of GUM to computational Grids. Systematic evaluation shows that GRID-GUM1 delivers acceptable speedups on relatively low-latency and homogeneous computational Grids. However, for high-latency or heterogeneous computational Grids, poor load scheduling limits performance. We next present an adaptive runtime environment, GRID-GUM2, which includes monitoring mechanisms that determine static and dynamic properties of the underlying clusters and an adaptive scheduling mechanism that dynamically modifies parallel execution accordingly. To the best of our knowledge, GRID-GUM2 is one of the first fully implemented virtual shared memory runtime environments on the Grid. Evaluating GRID-GUM2's performance demonstrates that virtual shared memory is feasible on computational Grids and that it can deliver good speedups if combined with an aggressive dynamic load distribution mechanism.
Proceedings 20th IEEE International Parallel & Distributed Processing Symposium, 2006
This paper describes the high-level execution and communication support provided in JGrid, a service-oriented dynamic grid framework. One of its core services, the Compute Service, is the key component in creating dynamic computational grid systems that enable the execution of sequential and parallel interactive grid applications. A fundamental set of program execution modes supported by the service is described, then a programming model and its corresponding application programming interface are presented. The execution support of the service architecture is described in detail, illustrating how remote evaluation and run-time task spawning are provided. The paper also shows in detail how task spawning and dynamic proxies can be used for a service-oriented communication mechanism for coarse-grained parallel grid applications.
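Since the abstract does not give the JGrid API itself, the following sketch only suggests the shape of remote evaluation and run-time task spawning against a compute service. The ComputeService interface, its submit() method, and the integration task are hypothetical placeholders, not JGrid's actual interfaces.

```java
// Hedged sketch of remote evaluation and coarse-grain task spawning against a
// compute service. ComputeService and submit() are hypothetical placeholders.
import java.io.Serializable;
import java.util.concurrent.Callable;
import java.util.concurrent.Future;

interface ComputeService {
    /** Ship a serializable task to the service for remote evaluation. */
    <T> Future<T> submit(Callable<T> task);
}

class IntegrateTask implements Callable<Double>, Serializable {
    private final double a, b;
    IntegrateTask(double a, double b) { this.a = a; this.b = b; }

    @Override
    public Double call() {
        // A spawned task could itself submit sub-tasks at run time; here a simple
        // local trapezoid rule stands in for the remote computation.
        int steps = 1000;
        double h = (b - a) / steps, sum = 0.5 * (f(a) + f(b));
        for (int i = 1; i < steps; i++) sum += f(a + i * h);
        return sum * h;
    }
    private double f(double x) { return Math.exp(-x * x); }
}

class Client {
    static double integrate(ComputeService grid, double a, double b) throws Exception {
        Future<Double> left  = grid.submit(new IntegrateTask(a, (a + b) / 2));  // coarse-grain tasks
        Future<Double> right = grid.submit(new IntegrateTask((a + b) / 2, b));
        return left.get() + right.get();                                        // gather remote results
    }
}
```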