WRF Software Testing
The testing conducted on the WRF code to ensure bit-for-bit behavior across differing processor counts runs through hundreds of short forecasts in about an hour. These forecasts are very short (about 10 time steps). The purpose is to activate as many physics options as possible. If single-processor and multiple-processor results differ, there is a strong likelihood of improperly initialized variables, missing communications, or race conditions. Tracking down the root cause of such a problem is extremely time consuming, but physics options that exhibit clean, bit-wise reproducible results are more likely to be robust. This testing is handled entirely with a newly developed mechanism designed to run on small desktops and on batch systems.
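As a rough illustration of why a bit-for-bit check is stricter than a tolerance check, the following Python sketch compares floating-point values by their raw IEEE-754 bit patterns. The function names are hypothetical, and this toy example is not how the test framework itself compares WRF output files; it only shows that two results can agree to within rounding error and still fail a bit-for-bit comparison.

```python
import struct


def float_bits(x: float) -> int:
    """Return the 64-bit IEEE-754 pattern of a Python float as an integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def bit_for_bit_equal(a, b) -> bool:
    """True only if both sequences have equal length and identical bit patterns."""
    a, b = list(a), list(b)
    return len(a) == len(b) and all(
        float_bits(x) == float_bits(y) for x, y in zip(a, b)
    )


# The same three values summed in two different orders, as could happen if a
# sum were split differently across processors (illustrative values only).
sum_order_a = (0.1 + 0.2) + 0.3
sum_order_b = 0.1 + (0.2 + 0.3)

print(bit_for_bit_equal([sum_order_a], [sum_order_b]))  # False
print(abs(sum_order_a - sum_order_b) < 1e-12)           # True
```

The two sums differ only in the last bit, so a tolerance-based comparison accepts them while the bit-for-bit comparison rejects them.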
===========================
WRF Test Framework
===========================
1. Overview
The WRF Testing Framework is designed to build, test, and analyze test results for one or more versions of the WRF model. With the advent of NCAR's flagship mainframe, Yellowstone, in early 2013, the testing framework for WRF was rewritten to accommodate Yellowstone's additional flexibility. For example, a large number of Fortran compilers are now available, instead of the single Fortran compiler that was available on NCAR's "bluefire" machine. Yellowstone's greater speed and greater job throughput have also allowed the number of tests performed for WRF to expand. Despite all of the additional variations available on Yellowstone, a complete WRF test run (compiling WRF, running tests, and analyzing results) often takes less than one hour. WRF is now tested regularly for the following compilers, parallel processing configurations, compile-time variations, and run-time variations of the WRF software:
COMPILERS:
GNU Fortran version 4.7.2
PGI Fortran version 12.5
Intel Fortran version 12.1.5
PARALLEL BUILD CONFIGURATIONS:
Serial (single-processor) build
OpenMP (multithreaded, shared memory) build
MPI (multiprocessor, distributed memory) build
WRF COMPILE-TIME VARIATIONS:
ARW
NMM
NMM Nested
CHEM
CHEM with KPP
Idealized Supercell
Idealized Baroclinic Wave
ARW em_real RUN-TIME VARIATIONS:
Adaptive Time Stepping
Digital Filtering
FDDA
Grib 1 WRF Output
Binary WRF Output
Nesting
Quilting
Global Domain
2. Physics Options Applied in WRF Tests
The following tables and associated keys summarize the combinations of physics options that are tested for WRF. In principle, the choice of one physics option should not influence the choice of another (i.e., the choice of a microphysics scheme should be independent of the cumulus scheme choice). In practice, certain options are developed and tested with only a small subset of other physics option combinations. The following tables are therefore useful as a guide to combinations of WRF physics options that are known to provide bit-for-bit results between serial and parallel versions of WRF. Each row in a table represents a specific test, and each column a specific physics option. All of the following tests are exercised using all three compilers on Yellowstone. Each of the physics combinations listed in the tables can be considered a "safe" combination that will provide successful short-term forecasts with bit-for-bit results when comparing single-processor output against multi-processor output.
TABLE 1:
WRF ARW Tests, Providing Successful 30-Minute Forecasts,
and Bit-for-Bit Results, on Serial vs. MPI Runs
NL      PBL  CU   MP   LW   SW   SFC  LAND URB  SHCU TOPO
global  1    1    3    1    1    1    1    0    0    0
01      1    1    1    1    1    1    1    0    0    1
02      1    2    4    3    3    1    4    0    0    0
02GR    1    2    4    3    3    1    4    0    0    0
03      4    3    3    4    4    4    1    0    0    0
03DF    4    3    3    4    4    4    1    0    0    0
03FD    4    3    3    4    4    4    1    0    0    0
05      7    5    5    5    5    7    7    0    0    0
05AD    7    5    5    5    5    7    7    0    0    0
05FD    7    5    5    5    5    7    7    0    0    0
06      8    6    6    4    4    2    1    0    0    0
06BN    8    6    6    4    4    2    1    0    0    0
07      8    14   7    7    7    1    2    2    0    0
07NE    8    14   7    7    7    1    2    2    0    0
08      9    7    8    5    5    2    3    0    0    0
09      6    1    9    3    3    5    3    0    0    0
09QT    6    1    9    3    3    5    3    0    0    0
10      4    2    10   1    2    4    7    0    0    0
12      8    3    16   4    4    1    2    3    0    0
12GR    8    3    16   4    4    1    2    3    0    0
13      9    7    13   1    1    2    3    0    2    0
14      4    6    3    3    3    4    3    0    0    0
15      5    14   2    5    5    1    7    0    0    0
15AD    5    14   2    5    5    1    7    0    0    0
16      10   14   4    5    5    10   7    0    0    0
16BN    10   14   4    5    5    10   7    0    0    0
16DF    10   14   4    5    5    10   7    0    0    0
17      2    2    4    3    3    2    2    0    0    0
17AD    2    2    4    3    3    2    2    0    0    0
19      1    1    4    1    2    1    5    0    0    0
20      12   1    4    1    2    1    2    0    0    0
20NE    12   1    4    1    2    1    2    0    0    0
25      1    1    1    1    1    11   1    0    0    1
26      2    1    1    1    1    3    1    0    0    0
29      9    3    4    1    2    1    5    0    2    0
29QT    9    3    4    1    2    1    5    0    2    0
30      2    93   4    1    1    2    1    0    2    0
31      7    2    14   3    3    7    1    0    0    0
31AD    7    2    14   3    3    7    1    0    0    0
32      9    7    11   3    3    1    5    0    2    0
33      9    7    11   4    4    1    5    0    2    0
34      9    7    11   4    4    1    2    0    2    0
35      9    7    11   3    3    1    2    0    2    0
37      9    7    11   4    4    2    2    0    2    0
38      5    14   2    5    5    2    7    0    0    0
38AD    5    14   2    5    5    2    7    0    0    0
39      5    14   2    5    5    5    7    0    0    0
39AD    5    14   2    5    5    5    7    0    0    0
40      7    2    14   3    3    7    1    0    0    0
41      2    2    4    3    3    2    2    0    0    0
42      4    2    10   1    2    4    7    0    0    0
KEY 1: Column Labels (Tables 1-4)
----------------------------------------
NL => Test Namelist Identifier
PBL => Planetary Boundary Layer Scheme
CU => Cumulus Scheme
MP => Microphysics Scheme
LW => Longwave Radiation Scheme
SW => Shortwave Radiation Scheme
SFC => Surface Physics Scheme
LAND => Land Surface Scheme
URB => Urban Physics Scheme
SHCU => Shallow Cumulus Scheme
TOPO => Topography-Following Wind Scheme
----------------------------------------
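To make the column labels concrete, the following Python sketch turns one table row into a namelist-style fragment. The mapping from column labels to WRF namelist variable names (for example, SFC to sf_sfclay_physics and LAND to sf_surface_physics) is an assumption made here for illustration and is not stated in this document, as is the placement of every variable in a single &physics group. The function name is hypothetical.

```python
# Assumed correspondence between table columns and WRF namelist variables.
COLUMN_TO_NAMELIST = {
    "PBL": "bl_pbl_physics",
    "CU": "cu_physics",
    "MP": "mp_physics",
    "LW": "ra_lw_physics",
    "SW": "ra_sw_physics",
    "SFC": "sf_sfclay_physics",
    "LAND": "sf_surface_physics",
    "URB": "sf_urban_physics",
    "SHCU": "shcu_physics",
    "TOPO": "topo_wind",
}
COLUMNS = list(COLUMN_TO_NAMELIST)  # same order as the table header


def row_to_physics_block(row: str) -> str:
    """Render a table row ('NL' followed by ten values) as a namelist fragment."""
    fields = row.split()
    nl_id, values = fields[0], fields[1:]
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} values, got {len(values)}")
    lines = [f"! test namelist {nl_id}", "&physics"]
    for column, value in zip(COLUMNS, values):
        lines.append(f" {COLUMN_TO_NAMELIST[column]:<20}= {int(value)},")
    lines.append("/")
    return "\n".join(lines)


# Row 07 of Table 1.
print(row_to_physics_block("07 8 14 7 7 7 1 2 2 0 0"))
```

A real namelist.input contains many more settings (and per-domain values for nested runs); this fragment covers only the ten options tabulated here.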
KEY 2: Test Namelist Codes (Tables 1-4)
----------------------------------------
AD => Adaptive Time Stepping
BN => Binary WRF Output
DF => Digital Filtering
FD => FDDA
GR => Grib 1 WRF Output
NE => Basic Nesting
QT => Quilting
----------------------------------------
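The identifiers in the NL column combine a base namelist number with an optional two-letter code from Key 2. A minimal Python sketch of how such an identifier could be split is shown below; the function name and the handling of "global" as a special case without a suffix are assumptions made for illustration.

```python
import re

# Two-letter codes and their meanings, copied from Key 2.
SUFFIX_MEANING = {
    "AD": "Adaptive Time Stepping",
    "BN": "Binary WRF Output",
    "DF": "Digital Filtering",
    "FD": "FDDA",
    "GR": "Grib 1 WRF Output",
    "NE": "Basic Nesting",
    "QT": "Quilting",
}

_NL_PATTERN = re.compile(r"^(\d+)([A-Z]{2})?$")


def parse_nl(identifier: str):
    """Split an identifier such as '03DF' into (base, variation or None)."""
    if identifier == "global":
        return ("global", None)
    match = _NL_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized namelist identifier: {identifier!r}")
    base, suffix = match.group(1), match.group(2)
    if suffix is None:
        return (base, None)
    if suffix not in SUFFIX_MEANING:
        raise ValueError(f"unknown code {suffix!r} in {identifier!r}")
    return (base, SUFFIX_MEANING[suffix])


print(parse_nl("03DF"))    # ('03', 'Digital Filtering')
print(parse_nl("29QT"))    # ('29', 'Quilting')
print(parse_nl("global"))  # ('global', None)
```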
TABLE 2:
WRF ARW Tests, Providing Successful 30-Minute Forecasts,
and Bit-for-Bit Results, on Serial vs. OpenMP Runs
NL      PBL  CU   MP   LW   SW   SFC  LAND URB  SHCU TOPO
global  1    1    3    1    1    1    1    0    0    0
03      4    3    3    4    4    4    1    0    0    0
03DF    4    3    3    4    4    4    1    0    0    0
03FD    4    3    3    4    4    4    1    0    0    0
06      8    6    6    4    4    2    1    0    0    0
06BN    8    6    6    4    4    2    1    0    0    0
07      8    14   7    7    7    1    2    2    0    0
07NE    8    14   7    7    7    1    2    2    0    0
08      9    7    8    5    5    2    3    0    0    0
10      4    2    10   1    2    4    7    0    0    0
14      4    6    3    3    3    4    3    0    0    0
16      10   14   4    5    5    10   7    0    0    0
16BN    10   14   4    5    5    10   7    0    0    0
16DF    10   14   4    5    5    10   7    0    0    0
17      2    2    4    3    3    2    2    0    0    0
17AD    2    2    4    3    3    2    2    0    0    0
20      12   1    4    1    2    1    2    0    0    0
20NE    12   1    4    1    2    1    2    0    0    0
31      7    2    14   3    3    7    1    0    0    0
31AD    7    2    14   3    3    7    1    0    0    0
38      5    14   2    5    5    2    7    0    0    0
40      7    2    14   3    3    7    1    0    0    0
41      2    2    4    3    3    2    2    0    0    0
42      4    2    10   1    2    4    7    0    0    0
TABLE 3:
WRF Idealized Supercell Tests, Providing Successful 30-Minute Forecasts,
and Bit-for-Bit Results, on Serial vs. Non-Serial Runs (OpenMP and MPI)
NL      PBL  CU   MP   LW   SW   SFC  LAND URB  SHCU TOPO
01      0    0    1    0    0    0    0    0    0    0
01NE    0    0    1    0    0    0    0    0    0    0
02      0    0    1    0    0    1    0    0    0    0
02NE    0    0    1    0    0    1    0    0    0    0
03      0    0    1    0    0    1    0    0    0    0
03NE    0    0    1    0    0    1    0    0    0    0
04      0    0    2    0    0    1    0    0    0    0
04NE    0    0    2    0    0    1    0    0    0    0
05      0    0    2    0    0    1    0    0    0    0
05NE    0    0    2    0    0    1    0    0    0    0
06      0    0    18   0    0    1    0    0    0    0
06NE    0    0    18   0    0    1    0    0    0    0
07      0    0    17   0    0    0    0    0    0    0
08      0    0    18   0    0    1    0    0    0    0
09      0    0    19   0    0    1    0    0    0    0
10      0    0    21   0    0    1    0    0    0    0
TABLE 4:
WRF Idealized B-Wave Tests, Providing Successful 30-Minute Forecasts,
and Bit-for-Bit Results, on Serial vs. Non-Serial Runs (OpenMP and MPI)
NL      PBL  CU   MP   LW   SW   SFC  LAND URB  SHCU TOPO
1       0    0    1    0    0    0    0    0    0    0
1NE     0    0    1    0    0    0    0    0    0    0
2       0    0    1    0    0    0    0    0    0    0
2NE     0    0    1    0    0    0    0    0    0    0
3       0    0    2    0    0    0    0    0    0    0
3NE     0    0    2    0    0    0    0    0    0    0
4       0    0    2    0    0    0    0    0    0    0
4NE     0    0    2    0    0    0    0    0    0    0
5       0    0    0    0    0    0    0    0    0    0
5NE     0    0    0    0    0    0    0    0    0    0
Currently, the list of MPI parallel tests able to pass bit-for-bit comparisons is much larger than the list of OpenMP parallel tests. While a few MPI tests work for some compilers and not others (such as those using NSSL microphysics), a large number of OpenMP tests behave differently depending on the compiler. Here is a brief list of the tested OpenMP options for ARW em_real that give bit-for-bit results for specific compilers:
PGI 12.5
03 03DF 03FD 06 06BN 07 07NE 08 10 14 16 16BN 16DF 17 17AD 20 20NE 31 31AD 37 38 40 41 42 global
GNU 4.7.2
03 03DF 03FD 06 06BN 07 07NE 08 10 14 15 15AD 16 16BN 16DF 17 17AD 20 20NE 31 31AD 34 38 38AD 39 39AD 40 41 42 global
Intel 12.1.5
01 02 02GR 03 03DF 03FD 06 06BN 07 07NE 08 10 12 12GR 13 14 15 15AD 16 16BN 16DF 17 17AD 19 20 20NE 25 26 29 29QT 30 31 31AD 33 34 35 38 38AD 39 39AD 40 41 42 global
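The per-compiler lists above can be combined to find the OpenMP tests that pass with every compiler. The following Python sketch takes the intersection of the three lists; the variable names are illustrative only. The resulting 24 identifiers coincide with the rows of Table 2.

```python
# Passing OpenMP test identifiers per compiler, copied from the lists above.
OPENMP_PASSING = {
    "PGI 12.5": (
        "03 03DF 03FD 06 06BN 07 07NE 08 10 14 16 16BN 16DF 17 17AD "
        "20 20NE 31 31AD 37 38 40 41 42 global"
    ),
    "GNU 4.7.2": (
        "03 03DF 03FD 06 06BN 07 07NE 08 10 14 15 15AD 16 16BN 16DF 17 17AD "
        "20 20NE 31 31AD 34 38 38AD 39 39AD 40 41 42 global"
    ),
    "Intel 12.1.5": (
        "01 02 02GR 03 03DF 03FD 06 06BN 07 07NE 08 10 12 12GR 13 14 15 15AD "
        "16 16BN 16DF 17 17AD 19 20 20NE 25 26 29 29QT 30 31 31AD 33 34 35 "
        "38 38AD 39 39AD 40 41 42 global"
    ),
}

passing_sets = {name: set(ids.split()) for name, ids in OPENMP_PASSING.items()}

# Tests that give bit-for-bit OpenMP results with all three compilers.
common = set.intersection(*passing_sets.values())
print(len(common), sorted(common))

# Tests that pass with a given compiler but not with all three.
for name, ids in passing_sets.items():
    print(name, sorted(ids - common))
```

For example, test 37 appears only in the PGI list, so it drops out of the common set.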