Large numerical weather prediction (NWP) codes such as the Weather Research and Forecast (WRF) model and the NOAA Nonhydrostatic Multiscale Model (NMM-B) port easily to Intel's Many Integrated Core (MIC) architecture. But for NWP to significantly realize MIC's one- to two-TFLOP/s peak computational power, we must expose and exploit thread and fine-grained (vector) parallelism while overcoming memory system bottlenecks that starve floating-point performance. We report on our work to improve the Rapid Radiative Transfer Model (RRTMG), responsible for 10-20 percent of total NMM-B run time. We isolated a standalone RRTMG benchmark code and workload from NMM-B and then analyzed performance using hardware performance counters and scaling studies. We restructured the code to improve vectorization, thread parallelism, and locality, and to reduce thread contention. The restructured code ran three times faster than the original on MIC and, also importantly, 1.3x faster than the original on the host Xeon Sandy Bridge.
The interaction between the atmospheric boundary layer and wind turbines has become an important issue, as wind energy is the fastest growing renewable energy resource worldwide and increasingly large wind farms are in development. The development of a new wind farm parameterization for the mesoscale numerical weather prediction model WRF provides a tool to improve understanding of the interaction between wind farms and the boundary layer. Wind turbines are represented as a sink of momentum and source of turbulence (turbulent kinetic energy) at model levels containing turbine blades. The parameterization can represent a wide range of turbines based on hub height, blade diameter, nominal power and cut-in/cut-out speeds. Results are presented for a series of idealized experiments which investigate the impact of large wind farms on the boundary layer. For an idealized offshore wind farm covering 10x10 km, the wind speed deficit was found to extend throughout the depth of the neutral bounda...
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis - SC '09, 2009
... ucar.edu Manish Vachharajani University of Colorado at Boulder Boulder, CO [email protected] Adrian Sandu Virginia Polytechnic Institute and State University Blacksburg, VA [email protected] ABSTRACT This work ...
The interaction between the atmospheric boundary layer and wind turbines has become an important issue, as wind energy is the fastest growing renewable energy resource worldwide and increasingly large wind farms are in development. The development of a new wind farm parameterization for the mesoscale numerical weather prediction model WRF provides a tool to improve understanding of the interaction between wind farms and the boundary layer. Wind turbines are represented as a sink of momentum and source of turbulence (turbulent kinetic energy) at model levels containing turbine blades. The parameterization can represent a wide range of turbines based on hub height, blade diameter, nominal power and cut-in/cut-out speeds. Results are presented for a series of idealized experiments which investigate the impact of large wind farms on the boundary layer. For an idealized offshore wind farm covering 10x10 km, the wind speed deficit was found to extend throughout the depth of the neutral boundary layer. Downstream, the wake was found to decay with an e-folding length scale of 60 km. However, the turbulent kinetic energy generated within the farm was found to decay much more quickly downstream due to high dissipation within the farm. Above the farm to the top of the boundary layer, the turbulent kinetic energy was increased due to vertical transport and shear production caused by the momentum deficit within the farm. The turbulent kinetic energy was also increased near the surface below the turbines, causing an increase in the wind.
A new wind farm parameterization has been developed for the mesoscale numerical weather prediction model, the Weather Research and Forecasting model (WRF). The effects of wind turbines are represented by imposing a momentum sink on the mean flow, transferring kinetic energy into electricity and turbulent kinetic energy (TKE). The parameterization improves upon previous models, basing the atmospheric drag of turbines on the thrust coefficient of a modern commercial turbine. In addition, the source of TKE varies with wind speed, reflecting the amount of energy extracted from the atmosphere by the turbines that does not produce electrical energy.
IEEE Transactions on Parallel and Distributed Systems. Automatic Generation of Multi-Core Accelerated Chemical Kinetics for Simulation and Prediction. John C. Linford, John Michalakes, Manish Vachharajani, and Adrian Sandu ...
We describe a parallel implementation of the nonhydrostatic version of the Penn State/NCAR Mesoscale Model, MM5, that includes nesting capabilities. This version of the model can run on many different massively parallel computers (including a cluster of workstations). The model has been implemented and run on the IBM SP and Intel multiprocessors using a columnwise decomposition that supports irregularly shaped allocations of the problem to processors. This strategy will facilitate dynamic load balancing for improved parallel efficiency and promotes a modular design that simplifies the nesting problem. All data communication for finite differencing, inter-domain exchange of data, and I/O is encapsulated within a parallel library, RSL. Hence, there are no sends or receives in the parallel model itself. The library is generalizable to other, similar finite difference approximation codes. The code is validated by comparing the rate of growth in error between the sequential and parallel models with the error growth rate when the sequential model input is perturbed to simulate floating-point rounding error. Series of runs on increasing numbers of parallel processors demonstrate that the parallel implementation is efficient and scalable to large numbers of processors.
Automatic Generation of Multicore Chemical Kernels. John C. Linford, John Michalakes, Manish Vachharajani, and Adrian Sandu. Abstract: This work presents the Kinetics PreProcessor: Accelerated (KPPA), a general ...
50th AIAA Aerospace Sciences Meeting including the New Horizons Forum and Aerospace Exposition, 2012
Large-eddy simulations of atmospheric boundary layers under various stability and surface roughness conditions are performed to investigate the turbulence impact on wind turbines. In particular, the aeroelastic responses of the turbines are studied to characterize the fatigue loading of the turbulence present in the boundary layer and in the wake of the turbines. Two utility-scale 5-MW turbines that are separated by seven rotor diameters are placed in a 3 km by 3 km by 1 km domain. They are subjected to atmospheric turbulent boundary layer flow, and data are collected on the structural response of the turbine components. The surface roughness was found to increase the fatigue loads while the atmospheric instability had a small influence. Furthermore, the downstream turbines yielded higher fatigue loads, indicating that the turbulent wakes generated from the upstream turbines have significant impact.
Developments in Teracomputing - Proceedings of the Ninth ECMWF Workshop on the Use of High Performance Computing in Meteorology, 2001
... This paper reports on progress since our first ECMWF workshop paper on the WRF ... an important milestone has been reached in the effort to develop an advanced mesoscale forecast ... and testing and verification over a range of applications including research and operational ...
Use of High Performance Computing in Meteorology - Proceedings of the Eleventh ECMWF Workshop, 2005
The first non-beta release of the Weather Research and Forecast (WRF) modeling system in May 2004 represented a key milestone in the effort to design and implement a fully functioning, next-generation modeling system for the atmospheric research and operational NWP user communities. With efficiency, portability, maintainability, and extensibility as bedrock requirements, the WRF software framework has allowed incremental and reasonably rapid development while maintaining overall consistency and adherence to the architecture and its interfaces. The WRF 2.0 release supports the full range of functionality envisioned for the model, including efficient scalable performance on a range of high-performance computing platforms, multiple dynamic cores and physics options, low-overhead two-way interactive nesting, moving nests, model coupling, and interoperability with other common model infrastructure efforts such as ESMF.
requirement of other NREL efforts that will be components of WESE such as the Gearbox Reliability Collaborative project at NREL).
Performance and portability are important but conflicting concerns in the development of the Weather Research and Forecast model, a next-generation community mesoscale model for numerical weather prediction and atmospheric research. Efficiency depends on the ability to realize significant percentages of peak processor performance, yet without hyper-engineering the codes for a particular brand or even type of CPU. This involves first developing a quantitative understanding of how aspects of the software, in particular data and looping structures, affect performance and then engineering the code to enable flexible tuning of these aspects across a variety of different platforms in a single source code. This paper describes work to characterize the performance effects of WRF model loop and data structures on representative micro- and vector-processors.
Papers by J. Michalakes