2015
In spring 2015, the Leibniz Supercomputing Centre (Leibniz-Rechenzentrum, LRZ) installed its new peta-scale system SuperMUC Phase2. Selected users were invited to a 28-day extreme scale-out block operation during which they were allowed to use the full system for their applications. The following projects participated in the extreme scale-out workshop: BQCD (Quantum Physics), SeisSol (Geophysics, Seismics), GPI-2/GASPI (Toolkit for HPC), Seven-League Hydro (Astrophysics), ILBDC (Lattice Boltzmann CFD), Iphigenie (Molecular Dynamics), FLASH (Astrophysics), GADGET (Cosmological Dynamics), PSC (Plasma Physics), waLBerla (Lattice Boltzmann CFD), Musubi (Lattice Boltzmann CFD), Vertex3D (Stellar Astrophysics), CIAO (Combustion CFD), and LS1-Mardyn (Material Science). The projects were allowed to use the machine exclusively during the 28-day period, which corresponds to a total of 63.4 million core-hours, of which 43.8 million core-hours were used by the applications, resulting in a ut...
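The utilization figure is cut off in this preview, but a back-of-envelope check follows directly from the two stated totals (the published figure may be rounded differently):

```python
# Back-of-envelope check of the utilization implied by the stated totals.
available_core_hours = 63.4e6   # full machine for the 28-day block
used_core_hours = 43.8e6        # consumed by the participating applications

utilization = used_core_hours / available_core_hours
print(f"Implied utilization: {utilization:.1%}")   # roughly 69%
```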
2005
The last decade has witnessed a rapid proliferation of superscalar cache-based microprocessors to build high-end capability and capacity computers, primarily because of their generality, scalability, and cost effectiveness. However, the constant degradation of superscalar sustained performance has become a well-known problem in the scientific computing community. This trend has been widely attributed to the use of superscalar-based commodity components whose architectural designs offer a balance between memory performance, network capability, and execution rate that is poorly matched to the requirements of large-scale numerical computations. The recent development of massively parallel vector systems offers the potential to bridge the performance gap for many important classes of algorithms. In this study we examine four diverse scientific applications with the potential to run at ultrascale, from the areas of plasma physics, material science, astrophysics, and magnetic fusion. We compare the performance of the vector-based Earth Simulator (ES) and Cray X1 with leading superscalar-based platforms: the IBM Power3/4 and the SGI Altix. Results demonstrate that the ES vector system achieves excellent performance on our application suite, the highest of any architecture tested to date.
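The balance argument above can be illustrated with a simple roofline-style bound. The machine numbers in the sketch below are hypothetical and are not taken from the paper; they only show how limited memory bandwidth relative to peak rate throttles sustained performance for bandwidth-bound kernels.

```python
# Minimal roofline-style estimate (illustrative only, not from the paper):
# sustained performance is bounded by min(peak FLOP rate,
# memory bandwidth * arithmetic intensity of the kernel).

def sustained_gflops(peak_gflops, bandwidth_gbs, flops_per_byte):
    """Upper bound on sustained performance for a given kernel intensity."""
    return min(peak_gflops, bandwidth_gbs * flops_per_byte)

# Hypothetical balances: a cache-based superscalar node with modest bandwidth
# versus a vector node with far higher bandwidth, at the same low intensity.
superscalar = sustained_gflops(peak_gflops=6.0, bandwidth_gbs=4.0, flops_per_byte=0.25)
vector      = sustained_gflops(peak_gflops=8.0, bandwidth_gbs=32.0, flops_per_byte=0.25)

print(f"superscalar bound: {superscalar:.1f} GFLOP/s")  # 1.0
print(f"vector bound:      {vector:.1f} GFLOP/s")       # 8.0
```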
2008 IEEE International Symposium on Parallel and Distributed Processing, 2008
International Journal of High Performance Computing Applications, 2013
Advances in modeling and algorithms, combined with growth in computing resources, have enabled simulations of multiphysics multiscale phenomena that can greatly enhance our scientific understanding. However, on currently available HPC resources, maximizing the scientific outcome of simulations requires many trade-offs. In this paper we describe our experiences in running simulations of the explosion phase of Type Ia supernovae on the largest available platforms. The simulations use FLASH, a modular, adaptive mesh, parallel simulation code with a wide user base. The simulations use multiple physics components; e.g., hydrodynamics, gravity, a sub-grid flame model, a three-stage burning model, and a degenerate equation of state. They also use Lagrangian tracer particles, which are then post-processed to determine the nucleosynthetic yields. We describe the simulation planning process, and the algorithmic optimizations and trade-offs that were found to be necessary. Several of the optimizations and trade-offs were made during the course of the simulations as our understanding of the challenges evolved, or when simulations went into previously unexplored physical regimes. We also briefly outline the anticipated challenges of, and our preparations for, the next generation computing platforms.
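As a rough illustration of what passive Lagrangian tracers involve (a generic sketch, not FLASH's AMR-aware particle machinery), the following advects a few tracers through a prescribed velocity field and records their trajectories for later post-processing:

```python
import numpy as np

# Minimal sketch of passive Lagrangian tracer advection (illustrative only).
# Tracers are advanced with a locally evaluated fluid velocity, and their
# trajectories are stored for later post-processing (e.g. nucleosynthesis).

def advect_tracers(positions, velocity_at, dt, n_steps):
    """Advance tracer positions with a simple midpoint (RK2) scheme.

    positions   : (N, 2) array of tracer coordinates
    velocity_at : callable mapping an (N, 2) array of positions to velocities
    """
    history = [positions.copy()]
    for _ in range(n_steps):
        v1 = velocity_at(positions)
        midpoint = positions + 0.5 * dt * v1
        v2 = velocity_at(midpoint)
        positions = positions + dt * v2
        history.append(positions.copy())
    return np.array(history)

# Toy velocity field: solid-body rotation about the origin.
def rotation(p):
    return np.stack([-p[:, 1], p[:, 0]], axis=1)

tracers = np.array([[1.0, 0.0], [0.5, 0.5]])
trajectory = advect_tracers(tracers, rotation, dt=0.01, n_steps=100)
print(trajectory.shape)  # (101, 2, 2): steps x tracers x coordinates
```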
SC16: International Conference for High Performance Computing, Networking, Storage and Analysis, 2016
2005
Columbia is a 10,240-processor supercluster consisting of 20 Altix nodes with 512 processors each, and currently ranked as one of the fastest computers in the world. In this paper, we present the performance characteristics of Columbia obtained on up to four computing nodes interconnected via the InfiniBand and/or NUMAlink4 communication fabrics. We evaluate floating-point performance, memory bandwidth, message passing communication speeds, and compilers using a subset of the HPC Challenge benchmarks, and some of the NAS Parallel Benchmarks including the multi-zone versions. We present detailed performance results for three scientific applications of interest to NASA, one from molecular dynamics, and two from computational fluid dynamics. Our results show that both the NUMAlink4 and InfiniBand interconnects hold promise for multi-node application scaling to at least 2048 processors.
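For a flavor of what the memory-bandwidth portion of such an evaluation measures, here is a minimal STREAM-triad-style estimate; it is only a rough stand-in for the HPC Challenge kernels actually used:

```python
import time
import numpy as np

# STREAM-triad-style bandwidth estimate: a[i] = b[i] + scalar * c[i].
# NumPy creates a temporary for scalar * c, so this somewhat underestimates
# the hardware's sustainable bandwidth; it is only a rough indicator.
n = 50_000_000
b = np.random.rand(n)
c = np.random.rand(n)
scalar = 3.0

start = time.perf_counter()
a = b + scalar * c
elapsed = time.perf_counter() - start

bytes_moved = 3 * n * 8   # nominal traffic: read b, read c, write a (8-byte doubles)
print(f"Estimated triad bandwidth: {bytes_moved / elapsed / 1e9:.1f} GB/s")
```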
2020 IEEE/ACM 2nd Annual Workshop on Extreme-scale Experiment-in-the-Loop Computing (XLOOP), 2020
As data sets from DOE user science facilities grow in both size and complexity there is an urgent need for new capabilities to transfer, analyze and manage the data underlying scientific discoveries. LBNL's Superfacility project brings together experimental and observational research instruments with computational and network facilities at the National Energy Research Scientific Computing Center (NERSC) and the Energy Sciences Network (ESnet) with the goal of enabling user science. Here, we report on recent innovations in the Superfacility project, including advanced data management, API-based automation, real-time interactive user interfaces, and supported infrastructure for "edge" services.
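A minimal sketch of the kind of API-based automation described above is given below; the endpoint URL, payload fields, and response keys are placeholders rather than the actual Superfacility API schema.

```python
import requests

# Placeholder endpoint and schema for illustration only; the real facility API
# and its authentication flow differ.
API_BASE = "https://api.example-facility.gov/v1"
TOKEN = "..."   # obtained out of band (e.g. an OAuth client credential)

def submit_job(batch_script_path, machine="cori"):
    """Submit a batch script to the facility and return its job identifier."""
    with open(batch_script_path) as f:
        script = f.read()
    resp = requests.post(
        f"{API_BASE}/compute/jobs/{machine}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        data={"job": script},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["jobid"]   # response key is a placeholder

def job_status(jobid, machine="cori"):
    """Poll the facility for the state of a previously submitted job."""
    resp = requests.get(
        f"{API_BASE}/compute/jobs/{machine}/{jobid}",
        headers={"Authorization": f"Bearer {TOKEN}"},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()
```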
Mathematics for Industry, 2015
We propose an open-source infrastructure, named "ppOpen-HPC", for the development and execution of optimized and reliable simulation codes on post-petascale (pp) parallel computers with heterogeneous computing nodes consisting of multicore CPUs and accelerators. ppOpen-HPC consists of various types of libraries, which cover various procedures for scientific computations. Source code developed on a PC with a single processor is linked with these libraries, and the generated parallel code is optimized for post-petascale systems. The capability for automatic tuning is a critical technology for further development on new architectures and for maintenance of the framework.
Advances in Parallel Computing, 2004
We present the apeNEXT project, which is currently developing a massively parallel computer with multi-TFlops performance. Like previous APE machines, the new supercomputer is completely custom designed and is specifically optimized for simulating the theory of strong interactions, quantum chromodynamics (QCD). We assess the performance for key application kernels and make a comparison with other machines used for this kind of simulation. Finally, we give an outlook on future developments.
2010
Astrophysical Journal Supplement Series, 2019
We describe the Outer Rim cosmological simulation, one of the largest high-resolution N-body simulations performed to date, aimed at promoting science to be carried out with large-scale structure surveys. The simulation covers a volume of (4.225 Gpc)^3 and evolves more than one trillion particles. It was executed on Mira, a Blue Gene/Q system at the Argonne Leadership Computing Facility. We discuss some of the computational challenges posed by a system like Mira, a many-core supercomputer, and how the simulation code, HACC, has been designed to overcome these challenges. We have carried out a large range of analyses on the simulation data, and we report on the results as well as the data products that have been generated. The full data set generated by the simulation totals more than 5 PB, making data curation and data handling a large challenge in and of itself. The simulation results have been used to generate synthetic catalogs for large-scale structure surveys, including DESI and eBOSS, as well as CMB experiments. A detailed catalog for the LSST DESC data challenges has been created as well. We publicly release some of the Outer Rim halo catalogs, downsampled particle information, and lightcone data.
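Two back-of-envelope numbers follow directly from the figures quoted above (rounded; the published simulation parameters are the authoritative source):

```python
# Rough estimates from the quoted figures only.
box_side_gpc = 4.225          # comoving box side length
n_particles = 1.0e12          # "more than one trillion particles"
total_bytes = 5e15            # "more than 5 PB" of generated data

# Mean inter-particle spacing, assuming a notional uniform particle grid.
spacing_mpc = box_side_gpc * 1000 / n_particles ** (1 / 3)
print(f"Mean inter-particle spacing: ~{spacing_mpc:.2f} Mpc")          # ~0.42 Mpc

# Average data volume per particle across all generated products.
print(f"Data per particle: ~{total_bytes / n_particles / 1e3:.1f} kB")  # ~5 kB
```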
Benchmarking plays a central role in the evaluation of High Performance Computing architectures. Several benchmarks have been designed that allow users to stress various components of supercomputers. In order for the figures they provide to be useful, benchmarks need to be representative of the most common real-world scenarios. In this work, we introduce BSMBench, a benchmarking suite derived from Monte Carlo code used in computational particle physics. The advantage of this suite (which can be freely downloaded from http://www.bsmbench.org) over others is the capacity to vary the relative importance of computation and communication. This enables the tests to simulate various practical situations. To showcase BSMBench, we perform a wide range of tests on various architectures, from desktop computers to state-of-the-art supercomputers, and discuss the corresponding results. Possible future directions of development of the benchmark are also outlined.
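The following toy benchmark (not BSMBench itself, and assuming mpi4py plus an MPI launcher) illustrates the idea of a tunable compute-to-communication balance: raising compute_reps stresses the floating-point units, while shrinking the local matrices shifts the cost toward the global reductions.

```python
import time
import numpy as np
from mpi4py import MPI   # assumes an MPI environment, e.g. `mpirun -n 4 python bench.py`

def run(n_iters=50, local_size=256, compute_reps=4):
    """Toy kernel alternating dense matrix products with global reductions."""
    comm = MPI.COMM_WORLD
    a = np.random.rand(local_size, local_size)
    b = np.random.rand(local_size, local_size)
    buf = np.empty((local_size, local_size))

    t_compute = t_comm = 0.0
    for _ in range(n_iters):
        t0 = time.perf_counter()
        for _ in range(compute_reps):          # computation phase
            c = a @ b
        t1 = time.perf_counter()
        comm.Allreduce(c, buf, op=MPI.SUM)     # communication phase
        t2 = time.perf_counter()
        t_compute += t1 - t0
        t_comm += t2 - t1

    if comm.rank == 0:
        total = t_compute + t_comm
        print(f"compute {t_compute / total:.0%}, communication {t_comm / total:.0%}")

if __name__ == "__main__":
    run()
```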
International Journal of Modern Physics A, 2005
The DØ experiment at Fermilab's Tevatron will record several petabytes of data over the next five years in pursuing the goals of understanding nature and searching for the origin of mass. Computing resources required to analyze these data far exceed the capabilities of any one institution. Moreover, the widely scattered geographical distribution of DØ collaborators poses further serious difficulties for optimal use of human and computing resources. These difficulties will be exacerbated in future high energy physics experiments, like the LHC. The computing grid has long been recognized as a solution to these problems. This technology is being made a more immediate reality to end users in DØ by developing a grid in the DØ Southern Analysis Region (DØSAR), DØSAR-Grid, using all available resources within it and a home-grown local task manager, McFarm. We will present the architecture in which the DØSAR-Grid is implemented, the use of technology and the functionality of the grid, and the experience from operating the grid in simulation, reprocessing, and data analysis for a currently running HEP experiment.
2011
Multiscale, multirate scientific and engineering applications in the SciDAC portfolio possess resolution requirements that are practically inexhaustible and demand execution on the highest-capability computers available, which will soon reach the petascale. While the variety of applications is enormous, their needs for mathematical software infrastructure are surprisingly coincident; moreover, the chief bottleneck is often the solver. At their current scalability limits, many applications spend the vast majority of their operations in solvers, due to solver algorithmic complexity that is superlinear in the problem size, whereas other phases scale linearly. Furthermore, the solver may be the phase of the simulation with the poorest parallel scalability, due to intrinsic global dependencies. This project brings together the providers of some of the world's most widely distributed, freely available, scalable solver software and focuses them on relieving this bottleneck for many specific applications within SciDAC, which are representative of many others outside. Solver software directly supported under TOPS includes hypre, PETSc, SUNDIALS, SuperLU, TAO, and Trilinos. Transparent access is also provided to other solver software through the TOPS interface. The primary goals of TOPS are the development, testing, and dissemination of solver software, especially for systems governed by PDEs. Upon discretization, these systems possess mathematical structure that must be exploited for optimal scalability; therefore, application-targeted algorithmic research is included. TOPS software development includes attention to high performance as well as interoperability among the solver components. Support for integration of TOPS solvers into SciDAC applications is also directly supported by this proposal. The role of the UCSD PI in this overall CET is one of direct interaction between the TOPS software partners and various DOE application scientists, specifically toward magnetohydrodynamics (MHD) simulations with the Center for Extended Magnetohydrodynamic Modeling (CEMM) SciDAC and the Applied Partial Differential Equations Center (APDEC) SciDAC, and toward core-collapse supernova simulations with the previous Terascale Supernova Initiative (TSI) SciDAC and in continued work on INCITE projects headed by Doug Swesty, SUNY Stony Brook. In addition to these DOE application scientists, the UCSD PI works to bring leading-edge DOE solver technology to application scientists in cosmology and large-scale galactic structure formation. Unfortunately, the funding for this grant ended after only two years of its five-year duration, in August 2008, due to difficulties at DOE in transferring the grant to the PI's new faculty position at Southern Methodist University. Therefore, this report only describes two years' worth of effort.
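To make the scaling argument concrete, the sketch below (hypothetical constants, with O(N^1.5) chosen as a representative superlinear solver complexity, not figures from the TOPS report) shows how a superlinear solver phase comes to dominate linear-scaling phases as the problem size grows:

```python
# Hypothetical constants; the solver is taken as O(N^1.5) and all other
# phases as O(N), matching the qualitative argument in the abstract.
def solver_share(n, c_solver=1.0, c_other=50.0):
    solver = c_solver * n ** 1.5
    other = c_other * n
    return solver / (solver + other)

for n in (1e4, 1e6, 1e8):
    print(f"N = {n:.0e}: solver share of total work = {solver_share(n):.1%}")
# N = 1e+04: solver share of total work = 66.7%
# N = 1e+06: solver share of total work = 95.2%
# N = 1e+08: solver share of total work = 99.5%
```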
2018 IEEE 14th International Conference on e-Science (e-Science), 2018
With the growing computational complexity of science and the complexity of new and emerging hardware, it is time to re-evaluate the traditional monolithic design of computational codes. One new paradigm is constructing larger scientific computational experiments from the coupling of multiple individual scientific applications, each targeting their own physics, characteristic lengths, and/or scales. We present a framework constructed by leveraging capabilities such as in-memory communications, workflow scheduling on HPC resources, and continuous performance monitoring. This code coupling capability is demonstrated by a fusion science scenario, where differences between the plasma at the edges and at the core of a device have different physical descriptions. This infrastructure not only enables the coupling of the physics components, but it also connects in situ or online analysis, compression, and visualization that accelerate the time between a run and the analysis of the science content. Results from runs on Titan and Cori are presented as a demonstration.
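A minimal in-process sketch of the coupling pattern (illustrative only; the framework described above couples separate executables through in-memory staging services and workflow scheduling on HPC resources) might look like this:

```python
# Two toy "physics" components that exchange interface data every step,
# standing in for a core model and an edge model coupled in memory.

class CoreModel:
    def __init__(self):
        self.state = 1.0
    def step(self, edge_boundary):
        # toy update that relaxes toward the value supplied by the edge model
        self.state += 0.1 * (edge_boundary - self.state)
        return self.state

class EdgeModel:
    def __init__(self):
        self.state = 0.0
    def step(self, core_boundary):
        self.state += 0.2 * (core_boundary - self.state)
        return self.state

core, edge = CoreModel(), EdgeModel()
core_boundary, edge_boundary = core.state, edge.state
for _ in range(20):
    core_boundary = core.step(edge_boundary)   # core advances using edge data
    edge_boundary = edge.step(core_boundary)   # edge advances using core data

print(f"interface values after coupling: core={core.state:.3f}, edge={edge.state:.3f}")
```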
2015
MASSIVE is the Australian specialised High Performance Computing facility for imaging and visualisation. The project is a collaboration between Monash University (lead), the Australian Synchrotron (AS), and CSIRO, and it underpins a range of advanced instruments, including AS beamlines. This paper reports on the outcomes of the MASSIVE project since 2012, in particular focusing on instrument integration and interactive access for analysis of synchrotron data. MASSIVE has developed a unique capability that supports an increasing number of researchers, including an instrument integration program to help facilities move data to an HPC environment and provide in-experiment data processing. This capability is best demonstrated at the AS Imaging and Medical Beamline (IMBL), where fast CT reconstruction and visualisation are essential to performing effective experiments. A workflow has been developed to integrate beamline allocations with HPC allocations, providing visitors with access to a de...
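For a flavor of the reconstruction step, here is a minimal filtered back-projection sketch using scikit-image on a synthetic phantom; the IMBL pipeline operates on real beamline data at far larger scale and throughput:

```python
import numpy as np
from skimage.data import shepp_logan_phantom
from skimage.transform import radon, iradon, rescale

# Minimal filtered back-projection demonstration on a toy phantom
# (illustrative only; not the beamline's production reconstruction code).
image = rescale(shepp_logan_phantom(), 0.5)                 # small test image
theta = np.linspace(0.0, 180.0, max(image.shape), endpoint=False)

sinogram = radon(image, theta=theta)                        # simulate projections
reconstruction = iradon(sinogram, theta=theta)              # filtered back-projection

rms_error = np.sqrt(np.mean((reconstruction - image) ** 2))
print(f"RMS reconstruction error: {rms_error:.3f}")
```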
Procedia Computer Science, 2015
The LSCP workshop focuses on symbolic and numerical methods and simulations, algorithms and tools (software and hardware) for developing and running large-scale computations in physical sciences. Special attention goes to parallelism, scalability and high numerical precision. System architectures are also of interest as long as they are supporting physics related calculations, such as: massively parallel systems, GPUs, many-integrated-cores, distributed (cluster, grid/cloud) computing, and hybrid systems. Topics are chosen from areas including: theoretical physics (high energy physics, nuclear physics, astrophysics, cosmology, quantum physics, accelerator physics), plasma physics, condensed matter physics, chemical physics, molecular dynamics, bio-physical system modeling, material science/engineering, nanotechnology, fluid dynamics, complex and turbulent systems, and climate modeling.
International Journal of Modern Physics A, 2005
In this review, the computing challenges facing the current and next generation of high energy physics experiments will be discussed. High energy physics computing represents an interesting infrastructure challenge as the use of large-scale commodity computing clusters has increased. The causes and ramifications of these infrastructure challenges will be outlined. Increasing requirements, limited physical infrastructure at computing facilities, and limited budgets have driven many experiments to deploy distributed computing solutions to meet the growing computing needs for analysis, reconstruction, and simulation. The current generation of experiments has developed and integrated a number of solutions to facilitate distributed computing. The current work of the running experiments gives insight into the challenges that will be faced by the next generation of experiments and the infrastructure that will be needed.
EPJ Web of Conferences
The field of fusion energy is about to enter the ITER era: for the first time we will have access to a device capable of producing 500 MW of fusion power, with plasmas lasting more than 300 seconds and with core temperatures in excess of 100-200 million K. Engineering simulation for fusion sits in an awkward position: a mixture of commercial and licensed tools is used, often with email-driven transfer of data. In order to address the engineering simulation challenges of the future, the community must approach simulation as a much more tightly coupled ecosystem, with a set of tools that can scale to take advantage of current petascale and upcoming exascale systems to address the design challenges of the ITER era.
Microprocessors and Microsystems