GPU Acceleration of NWP: Benchmark Kernels Web Page


John Michalakes, National Center for Atmospheric Research

Manish Vachharajani, University of Colorado at Boulder



Increased computing power for weather, climate, and atmospheric science has provided direct benefits for defense, agriculture, the economy, the environment, and public welfare and convenience. Today, very large clusters with many thousands of processors are allowing scientists to move forward with simulations of unprecedented size. But time-critical applications such as real-time forecasting or climate prediction need strong scaling: faster nodes and processors, not more of them. Moreover, the need for good cost-performance has never been greater, both in terms of performance per watt and per dollar. For these reasons, the new generations of multi- and many-core processors being mass produced for commercial IT and "graphical computing" (video games) are being scrutinized for their ability to exploit the abundant fine-grain parallelism in atmospheric models.

We are working to identify key computational kernels within the dynamics and physics of a large community NWP model, the Weather Research and Forecasting (WRF) model. The goals are to (1) characterize and model performance of the kernels in terms of computational intensity, data parallelism, memory bandwidth pressure, memory footprint, etc.; (2) enumerate and classify effective strategies for coding and optimizing for these new processors; (3) assess difficulties and opportunities for tool or higher-level language support; and (4) establish a continuing set of kernel benchmarks that can be used to measure and compare effectiveness of current and future designs of multi- and many-core processors for weather and climate applications.
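As a rough illustration of what characterizing a kernel by computational intensity and memory bandwidth pressure can look like, the sketch below computes floating-point operations per byte of memory traffic and a simple upper bound on throughput. It is not taken from the benchmarks on this page: the function names, the operation and byte counts, and the peak and bandwidth figures are all hypothetical values chosen for illustration.

```python
def computational_intensity(flops, bytes_moved):
    """Floating-point operations performed per byte of memory traffic."""
    if bytes_moved <= 0:
        raise ValueError("bytes_moved must be positive")
    return flops / bytes_moved


def attainable_gflops(intensity, peak_gflops, bandwidth_gb_per_s):
    """Upper bound on throughput: the lower of the compute peak and
    the rate at which memory bandwidth can feed the arithmetic units."""
    return min(peak_gflops, intensity * bandwidth_gb_per_s)


# Hypothetical kernel: 2.0e9 floating-point operations, 4.0e9 bytes moved.
intensity = computational_intensity(2.0e9, 4.0e9)

# Hypothetical processor: 500 GFLOP/s peak, 100 GB/s memory bandwidth.
bound = attainable_gflops(intensity, peak_gflops=500.0, bandwidth_gb_per_s=100.0)

print(f"intensity = {intensity} flop/byte, bound = {bound} GFLOP/s")
```

With these made-up numbers the bound falls well below the compute peak, so such a kernel would be limited by memory bandwidth rather than by arithmetic throughput.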

To foster community interaction and effort, we invite and encourage contributions for inclusion here: results, implementations (including Cell, other GPUs, and multi-core), optimizations, new benchmark kernels, and links to pages presenting similar work. Please contact the authors at

Benchmark Kernels

The following kernels have been identified and set up as standalone benchmarks. Click on each title for additional information and status.

WRF Single Moment 5 Cloud Microphysics

WRF Fifth Order Positive Definite Tracer Advection

WRF-Chem KPP-generated Chemical-kinetics Solver

RRTM Longwave Radiation Physics

SWRAD Shortwave Radiation Physics

Additional References


Last updated, February 25, 2009