1994, Computer Physics Communications
EGO is a parallel molecular dynamics program running on Transputers. We conducted a performance analysis of the EGO program to determine whether it was using the computational resources of the Transputers effectively. Our first concern was whether communication was overlapped with computation, which would keep the overhead of non-overlapped communication low. With the assistance of performance tools such as UPSHOT, and by instrumenting the EGO program itself, we determined that only 8% of the execution time of the EGO program was spent in non-overlapped communication. Our next concern was that the EGO program ran at 0.25 MFLOPS, while the Transputers have a sustained rating of 1.5 MFLOPS. We measured the MFLOPS ratings of small blocks of OCCAM code and found that they matched the performance of the EGO code.
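EGO itself is written in OCCAM for Transputers, and none of its source appears above. As a minimal sketch of the overlap pattern the analysis measures, here is a generic C/MPI analogue: non-blocking transfers are posted, work that needs no remote data proceeds while they are in flight, and only the final wait counts as non-overlapped communication. All names (compute_interior, halo, send_buf) are hypothetical.

#include <mpi.h>
#include <stdio.h>

#define N_LOCAL 1024
#define N_HALO   128

static double interior[N_LOCAL];   /* atoms needing no remote data    */
static double send_buf[N_HALO];    /* boundary data for the neighbor  */
static double halo[N_HALO];        /* boundary data from the neighbor */

/* Work that depends only on local data; runs while messages are in flight. */
static void compute_interior(void)
{
    for (int i = 1; i < N_LOCAL - 1; i++)
        interior[i] += 0.5 * (interior[i - 1] + interior[i + 1]);
}

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int right = (rank + 1) % size;
    int left  = (rank + size - 1) % size;
    MPI_Request reqs[2];

    /* Post non-blocking transfers, then compute while they progress. */
    MPI_Irecv(halo,     N_HALO, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &reqs[0]);
    MPI_Isend(send_buf, N_HALO, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &reqs[1]);

    compute_interior();                     /* overlapped with communication */

    double t0 = MPI_Wtime();
    MPI_Waitall(2, reqs, MPI_STATUSES_IGNORE);
    double wait = MPI_Wtime() - t0;         /* non-overlapped communication */

    printf("rank %d: non-overlapped wait = %g s\n", rank, wait);
    MPI_Finalize();
    return 0;
}

Timing the final wait, as above, is one rough way to attribute a fraction of the run time to non-overlapped communication, in the spirit of the 8% figure reported.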
WSEAS Transactions on Biology and Biomedicine, 2008
Molecular dynamics (MD) is one of the popular applications in the field of high-performance computing. Since it requires an amount of CPU time that grows roughly with the square of the number of atoms simulated, acceleration of MD is essential for simulating large biomolecules such as proteins. Parallelization of MD has therefore been studied actively for a long time. However, most studies of parallel MD report modified or newly developed algorithms specialized to particular computer architectures, such as vector-parallel supercomputers, and an end user of MD software cannot apply them to popular MD software developed by others. In this study, we evaluated the performance of four kinds of computer architectures: 1) a vector-parallel supercomputer, 2) a multi-processor machine with shared memory, 3) a multi-processor machine with distributed memory, and 4) a PC cluster. Various compiler options for parallelization and optimization were tested. The experimental results revealed that if the MD software is neither parallelized nor vectorized at the source level, an ordinary PC cluster with maximum use of compiler optimization options is the best choice.
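The quadratic cost quoted above comes from the all-pairs non-bonded interaction loop. A toy sketch of such a loop in C (reduced-unit Lennard-Jones, not code from any of the packages tested) makes the N(N-1)/2 pair evaluations explicit:

#include <stdio.h>
#include <stdlib.h>

typedef struct { double x, y, z; } vec3;

/* All-pairs force loop: N(N-1)/2 pair evaluations, hence O(N^2) time. */
void pair_forces(int n, const vec3 *pos, vec3 *f)
{
    for (int i = 0; i < n; i++)
        f[i].x = f[i].y = f[i].z = 0.0;

    for (int i = 0; i < n; i++) {
        for (int j = i + 1; j < n; j++) {
            double dx = pos[i].x - pos[j].x;
            double dy = pos[i].y - pos[j].y;
            double dz = pos[i].z - pos[j].z;
            double r2   = dx*dx + dy*dy + dz*dz;
            double inv2 = 1.0 / r2;
            double inv6 = inv2 * inv2 * inv2;
            /* Lennard-Jones pair force divided by r, in reduced units. */
            double s = 24.0 * inv2 * inv6 * (2.0 * inv6 - 1.0);
            f[i].x += s * dx;  f[j].x -= s * dx;
            f[i].y += s * dy;  f[j].y -= s * dy;
            f[i].z += s * dz;  f[j].z -= s * dz;
        }
    }
}

int main(void)
{
    enum { N = 1000 };
    vec3 *pos = malloc(N * sizeof *pos), *f = malloc(N * sizeof *f);
    for (int i = 0; i < N; i++)          /* arbitrary non-overlapping layout */
        pos[i] = (vec3){ i * 1.1, 0.0, 0.0 };
    pair_forces(N, pos, f);              /* doubling N quadruples this cost */
    printf("f[0].x = %g\n", f[0].x);
    free(pos); free(f);
    return 0;
}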
International Journal of Parallel Programming, 2002
The IBM Blue Gene project has endeavored to develop a cellular architecture computer with millions of concurrent threads of execution. One of the major challenges of this project is demonstrating that applications can successfully exploit this massive amount of parallelism. Starting from the sequential version of a well-known molecular dynamics code, we developed a new application that exploits the multiple levels of parallelism in the Blue Gene cellular architecture.
2001
The IBM Blue Gene project has endeavored to develop a cellular architecture computer with millions of concurrent threads of execution. One of the major challenges of this project is demonstrating that applications can successfully exploit this massive amount of parallelism. Starting from the sequential version of a well-known molecular dynamics code, we developed a new application that exploits the multiple levels of parallelism in the Blue Gene cellular architecture. We perform both analytical and simulation studies of the behavior of this application when executed on a very large number of threads. As a result, we demonstrate that this class of applications can execute efficiently on a large cellular machine.
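The abstract does not reproduce the code, and Blue Gene's actual programming model is not shown here. As a generic illustration of two of the levels of parallelism such an application can exploit (processes across nodes, threads within a node), a hedged MPI+OpenMP sketch:

#include <mpi.h>
#include <omp.h>
#include <stdio.h>

#define N 4096
static double force[N];

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Level 1: distribute atoms across MPI ranks (nodes). */
    int chunk = N / size;
    int lo = rank * chunk;
    int hi = (rank == size - 1) ? N : lo + chunk;

    /* Level 2: split each rank's block across OpenMP threads. */
    #pragma omp parallel for
    for (int i = lo; i < hi; i++)
        force[i] = 1.0 / (i + 1);       /* stand-in for per-atom force work */

    if (rank == 0)
        printf("%d ranks x up to %d threads\n", size, omp_get_max_threads());
    MPI_Finalize();
    return 0;
}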
Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 2005
The effective exploitation of current high performance computing (HPC) platforms in molecular simulation relies on the ability of the present generation of parallel molecular dynamics codes to make effective use of these platforms and their components, including CPUs and memory. In this paper, we investigate the efficiency and scaling of a series of popular molecular dynamics codes on the UK's national HPC resources, an IBM p690C cluster and an SGI Altix 3700.
1993
This paper is concerned with the implementation of the molecular dynamics code CHARMM on massively parallel distributed-memory computer architectures using a data-parallel approach. The implementation is carried out by creating a set of software tools that provide an interface between the parallelization issues and the sequential code. Large practical MD problems are solved on the Intel iPSC/860 hypercube. The overall solution efficiency is compared with that obtained when the implementation uses data replication.
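For the trade-off this comparison turns on, a hedged back-of-envelope estimate (not figures from the paper): with $N$ atoms on $p$ processors, data replication stores the full atom set on every node, while a data-parallel decomposition stores only a share of it,

\[
M_{\text{replication}} = \Theta(N) \ \text{per node}, \qquad
M_{\text{decomposition}} = \Theta\!\left(\tfrac{N}{p}\right) \ \text{per node, plus boundary data},
\]

which is why decomposition is the approach that scales to large practical problems on distributed-memory machines.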
In modern parallel computing there exists a large number of highly different platforms, and it is important to choose the right parallel platform for a computationally intensive code. To decide on the most cost-effective parallel platform, a computer scientist needs a precise characterization of the application's properties, such as memory usage and computation and communication requirements. Computer architects need this information to provide the community with new, more cost-effective platforms. The precise resource usage is of little interest to the users or computational scientists in the applied field, so most traditional parallel codes are ill-equipped to collect data about their resource usage and behavior at run time. Once an application runs, is numerically stable, and shows some speedup, most computational scientists declare victory and move on to the next application. Contrary to that philosophy, our group of computer architects invested a considerable amount of effort and time to instrument Opal, a parallel molecular biology simulation code, for an accurate performance characterization on a given parallel platform. As a result, we can present a simple analytical model for the execution time of that particular application code, along with an in-depth verification of that model through measured execution times. The model and measurements can be used not only for performance tuning but also for a good prediction of the code's performance on alternative platforms, such as newer and cheaper parallel architectures or the next generation of supercomputers to be installed at our site.
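The paper's actual model for Opal is not given in this abstract; as an illustrative assumption, such execution-time models commonly take the form

\[
T(N, p) \;=\; \frac{t_{\text{comp}}(N)}{p} \;+\; t_{\text{comm}}(N, p) \;+\; t_{\text{sync}}(p),
\]

where $N$ is the problem size and $p$ the processor count. Fitting the machine-dependent terms to measured runs on one platform is what allows performance on alternative platforms to be predicted.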
… 2002, Abstracts and …, 2002
The molecular dynamics code CHARMM is a popular research tool for computational biology. An increasing number of researchers are currently looking for affordable and adequate platforms to execute CHARMM or similar codes.
2003
Some of the most challenging applications to parallelize scalably are the ones that present a relatively small amount of computation per iteration. Multiple interacting performance challenges must be identified and solved to attain high parallel efficiency in such cases. We present a case study involving NAMD, a parallel molecular dynamics application, and efforts to scale it to run on 3000 processors with Tera-FLOPS level performance. NAMD is implemented in Charm++, and the performance analysis was carried out using “projections”, the performance visualization/analysis tool associated with Charm++. We will showcase a series of optimizations facilitated by projections. The resultant performance of NAMD led to a Gordon Bell award at SC2002.
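Why a small amount of computation per iteration is hard to scale can be put in one illustrative equation (not from the paper): if each timestep performs work $W$ split over $p$ processors plus a per-step overhead $o(p)$ for communication and load imbalance, then

\[
T_p = \frac{W}{p} + o(p), \qquad
E(p) = \frac{T_1}{p\,T_p} = \frac{W}{W + p\,o(p)},
\]

so when $W$ is small, even a modest $o(p)$ drives the efficiency down at large $p$, and the optimizations must attack $o(p)$ directly.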
Theor Chem Acc, 1993
An account is given of experience gained in implementing computational chemistry application software, including quantum chemistry and macromolecular refinement codes, on distributed memory parallel processors. In quantum chemistry we consider the coarse-grained implementation of Gaussian integral and derivative integral evaluation, the direct-SCF computation of an uncorrelated wavefunction, the 4-index transformation of two-electron integrals and the direct-CI calculation of correlated wavefunctions. In the refinement of macromolecular conformations, we describe domain decomposition techniques used in implementing general purpose molecular mechanics, molecular dynamics and free energy perturbation calculations. Attention is focused on performance figures obtained on the Intel iPSC/2 and iPSC/860 hypercubes, which are compared with those obtained on a Cray Y-MP/464 and Convex C-220 minisupercomputer. From this data we deduce the cost effectiveness of parallel processors in the field of computational chemistry.