Model Options¶
Beyond the basic process of running a global simulation with standard output files, outlined in Running MPAS-Atmosphere, the MPAS-Atmosphere model provides several major options. The sections that follow describe each of these options as a variation on the basic global simulation workflow.
Periodic SST and Sea-ice Updates¶
The stand-alone MPAS-Atmosphere model is not coupled to fully prognostic ocean or sea-ice models, and accordingly, the model SST and sea-ice fraction fields generally do not change over the course of a simulation. For simulations shorter than a few days, invariant SST and sea-ice fraction fields are generally not problematic. However, for longer model simulations, it is typically recommended to periodically update the SST and sea-ice fields from an external file.
The surface data to be used for periodic SST and sea-ice updates could originate from any number of sources, though the most straightforward way to obtain a dataset in a usable format is to process GRIB data (e.g., GFS GRIB data) using the ungrib program from the WRF model's pre-processing system (WPS). See the detailed instructions for building the WPS and running the WPS, including the process of generating intermediate data files from GFS data.
The following steps summarize the process of generating an SST and sea-ice update file, surface.nc, using the init_atmosphere_model program:
1. Include the surface data intermediate files in the working directory
2. Include a static.nc file in the working directory (see Static Fields)
3. If running in parallel, include a graph.info.part.* file in the working directory (see Graph Partitioning with METIS)
4. Edit the namelist.init_atmosphere configuration file (described below)
5. Edit the streams.init_atmosphere I/O configuration file (described below)
6. Run init_atmosphere_model to create surface.nc
&nhyd_model
    config_init_case = 8                          ! must be 8, the surface field initialization case
    config_start_time = '2010-10-23_00:00:00'     ! time to begin processing surface data
    config_stop_time = '2010-10-30_00:00:00'      ! time to end processing surface data

&data_sources
    config_sfc_prefix = 'SST'                     ! the prefix of the intermediate data files containing SST and sea-ice
    config_fg_interval = 86400                    ! interval (in seconds) between intermediate files to use for SST and sea-ice

&preproc_stages
    config_static_interp = false                  ! only the input_sst and frac_seaice stages should be enabled

&decomposition
    config_block_decomp_file_prefix = 'graph.info.part.'   ! if running in parallel, this must match the grid decomposition file prefix
After editing the namelist.init_atmosphere file, the names of the static file and of the surface update file to be created must be set in the XML I/O configuration file, streams.init_atmosphere. Specifically:

- in the “input” stream definition, the filename_template attribute must be set to the name of the static file;
- in the “surface” stream definition, the filename_template attribute must be set to the name of the surface update file to be created; and
- in the “surface” stream definition, output_interval must be set to the interval at which the surface intermediate files are provided.
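As a sketch, the relevant stream definitions in streams.init_atmosphere might look like the following (the file names and the daily output interval are illustrative; they must match the actual static file name and the value of config_fg_interval):

```xml
<immutable_stream name="input"
                  type="input"
                  filename_template="static.nc"
                  input_interval="initial_only" />

<immutable_stream name="surface"
                  type="output"
                  filename_template="surface.nc"
                  output_interval="24:00:00" />
```

Here, an output_interval of 24:00:00 corresponds to the 86400-second config_fg_interval shown above.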
Regional Simulation¶
Beginning with MPAS v7.0, the model provides the capability to run simulations over regional domains on the surface of the sphere. Setting up and running a limited-area simulation requires, as a starting point, a limited-area SCVT mesh. Given a limited-area mesh, the key differences from a global simulation are that regional simulations require:

- blending the MPAS terrain field with the “first-guess” terrain data along the boundaries of the limited-area domain;
- generating a set of files containing lateral boundary conditions (LBCs); and
- applying the LBCs during the model integration.
Terrain Blending¶
Terrain blending takes place when generating the limited-area initial conditions, which are prepared as in Vertical Grid Generation and Initial Field Interpolation, except that the config_blend_bdy_terrain option should be set to true in the namelist.init_atmosphere file. This instructs the init_atmosphere_model program to perform averaging of the model terrain field from the static.nc file with the terrain field from the atmospheric initial conditions dataset along the lateral boundaries of the mesh.
LBC File Generation¶
LBC file generation requires running the init_atmosphere_model program one additional time, with namelist options set as described below.
&nhyd_model
    config_init_case = 9                          ! must be 9, the LBC processing case
    config_start_time = '2010-10-23_00:00:00'     ! time to begin processing LBC data
    config_stop_time = '2010-10-30_00:00:00'      ! time to end processing LBC data

&dimensions
    config_nfglevels = 38                         ! number of vertical levels in the intermediate files

&data_sources
    config_met_prefix = 'GFS'                     ! the prefix of the intermediate data files to be used for LBCs
    config_fg_interval = 10800                    ! interval (in seconds) between intermediate files

&decomposition
    config_block_decomp_file_prefix = 'graph.info.part.'   ! if running in parallel, this must match the grid decomposition file prefix
When processing LBCs:

- the output_interval for the “lbc” stream in the streams.init_atmosphere file must match the value of config_fg_interval in the namelist.init_atmosphere file; and
- the file read by the “input” stream must contain vertical grid information; typically, the model initial-conditions file can be used as the source for the “input” stream.
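For example, with config_fg_interval = 10800 (3 hours), the “lbc” stream definition in streams.init_atmosphere might look like the following sketch (the filename template is illustrative; $Y, $M, $D, etc. are MPAS filename-template variables):

```xml
<immutable_stream name="lbc"
                  type="output"
                  filename_template="lbc.$Y-$M-$D_$h.$m.$s.nc"
                  output_interval="3:00:00" />
```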
Following this step, a set of netCDF files containing LBCs for the model integration is produced.
Application of LBCs During Model Integration¶
To apply LBCs during the model integration:

- set config_apply_lbcs to true in the model’s namelist.atmosphere file; and
- set the input_interval for the “lbc_in” stream in the streams.atmosphere file to match the interval at which the LBC netCDF files were produced.
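Assuming LBC files were produced at 3-hour intervals, the corresponding settings might look like the following sketch (the namelist record name and the filename template should be checked against the default files for your model version):

```fortran
&limited_area
    config_apply_lbcs = true
/
```

```xml
<immutable_stream name="lbc_in"
                  type="input"
                  filename_template="lbc.$Y-$M-$D_$h.$m.$s.nc"
                  input_interval="3:00:00" />
```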
Separate Stream for Invariant Fields¶
By default, the MPAS-Atmosphere model reads time-invariant fields (e.g., latCell, lonCell, areaCell, zgrid, zz, etc.) from the “input” and “restart” streams (for cold-start and restart runs, respectively), and it writes time-invariant fields to the “restart” stream. In the case of large ensembles, the time-invariant fields replicated in the restart files for all ensemble members can account for a substantial amount of storage. Since these time-invariant fields do not change in time or across ensemble members, only one copy of these fields needs to be stored.
MPAS-Atmosphere v8.1.0 introduces a capability to omit time-invariant fields from model restart files. When the model restarts, a new “invariant” stream may be used to read time-invariant fields from a separate file, and many ensemble members can share this file.
To make use of the “invariant” stream, several changes to the standard MPAS-Atmosphere workflow are needed.
Preparing an Invariant File¶
Through the use of the init_atmosphere_model program, a file containing all required time-invariant fields must be prepared. Since the model initial-conditions file (typically init.nc) contains the time-invariant fields, the initial-conditions file from any ensemble member may be used.
If a file containing only time-invariant fields is desired, the output of the init_atmosphere_model program will suffice when just the following pre-processing stages are enabled:
config_static_interp = true
config_native_gwd_static = true
config_vertical_grid = true
Note
These pre-processing stages do not need to be run all at once. It is possible, for example, to first produce a static.nc file using the first two of these pre-processing stages, and to then produce an invariant file (e.g., invariant.nc) by running the vertical grid generation stage using the static.nc file as input.
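A sketch of the two-step approach described in the note, showing only the &preproc_stages fragments (file names are illustrative, and the remaining namelist options follow the standard initialization workflow):

```fortran
! Step 1: produce static.nc (horizontal static fields and GWD static fields)
&preproc_stages
    config_static_interp = true
    config_native_gwd_static = true
    config_vertical_grid = false
/

! Step 2: read static.nc via the "input" stream and run only the vertical
! grid generation stage, writing an invariant file (e.g., invariant.nc)
&preproc_stages
    config_static_interp = false
    config_native_gwd_static = false
    config_vertical_grid = true
/
```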
Activating the Invariant Stream¶
When running the model itself (atmosphere_model), the new “invariant” stream may be activated by defining it as an immutable input stream in the streams.atmosphere file as follows:
<immutable_stream name="invariant"
type="input"
filename_template="invariant.nc"
input_interval="initial_only" />
In the definition of the “invariant” stream, filename_template should be set to the actual name of the invariant file.
When the “invariant” stream exists in streams.atmosphere, the model omits all time-invariant fields from any restart files that are written. When the model restarts, all time-invariant fields are read from the “invariant” stream rather than from the “restart” stream.
Large Eddy Simulation (LES)¶
MPAS-Atmosphere Version 8.4 contains the first release of a Large-Eddy Simulation (LES) capability. Similar to WRF, it contains two subgrid turbulence models — a diagnostic Turbulent Kinetic Energy (TKE) formulation based on a 3D Smagorinsky formulation, and a 1.5-order prognostic TKE formulation. These formulations generally follow the implementation in WRF (see the WRF Technical Note Version 4, sections 4.2.3 and 4.2.4, for a description of the formulations). The MPAS-A implementation differs from WRF in the following ways:
horizontal derivatives are taken along coordinate surfaces, and thus, the MPAS LES formulation does not take into account sloping coordinate surfaces (i.e., terrain effects); and
the vertical diffusion operators are integrated explicitly.
Additionally, beginning with MPAS-A v8.4.0, the application of the LES mixing, the ‘2d_smagorinsky’ mixing, and the background 4th-order filter can be enabled for scalar variables (qv, etc.) by setting config_mix_scalars to true (the default is false). Previously, the ‘2d_smagorinsky’ filtering and the background 4th-order filtering were not applied to scalar variables in the time integration because the monotonic transport scheme provides sufficient filtering.
Note that the default filtering configuration for MPAS-A v8.4 is unchanged from previous versions — config_horiz_mixing = ‘2d_smagorinsky’, and a 4th-order horizontal background filter is active for the dry dynamics variables \(u\), \(w\), and \(\theta_{m}\).
Preparing idealized LES initial conditions¶
Beginning with MPAS-Atmosphere v8.4.0, the init_atmosphere core provides a new idealized LES initialization case. This case uses a 48-m hexagonal grid on a square, flat plane with periodic boundary conditions, and 100 vertical levels with a 3-km top and uniform vertical spacing (∆z = 30 m). In the namelist.init_atmosphere file, the following namelist options must be set:
config_init_case = 10
config_ztop = 3000.0
config_nvertlevels = 100
The initial state is specified in the source file src/core_init_atmosphere/mpas_init_atm_cases.F, where the subroutine init_atm_case_les calls atm_get_sounding for \(\theta\), \(q_{v}\) , \(u\), and \(v\). Other details of the case are as follows:
SAS (Southeast Atmosphere Study) case:

- \(\theta\): 296.6 \(K\) up to 352.5 \(m\), increasing to 298.1 \(K\) at 442.5 \(m\), then increasing at a rate of 3 \(K/km\) to the model top
- \(q_{v}\): 11.8 \(g/kg\) up to 352.5 \(m\), decreasing to 7.8 \(g/kg\) at 442.5 \(m\), then decreasing at a rate of −4 \(g/kg/km\) to zero at 2.4 \(km\)
- \(u = 2\ m/s\), \(v = 0\ m/s\) at all levels (with u_init and v_init assumed geostrophic)
- Coriolis parameter set to 7.2921e-05 \(s^{-1}\)
- \(\theta\) perturbations of ±0.5 \(K\) within the PBL (below 397 \(m\))
- TKE scalar initialized to 0.4 \(m^2/s^2\) at the surface, decreasing rapidly to zero at 255 \(m\) (as specified for the SAS case)
Model configuration for LES¶
The LES capability is controlled by the namelist configuration variable config_les_model. Choosing either ‘3d_smagorinsky’ or ‘prognostic_1.5_order’ disables ‘2d_smagorinsky’ even if config_horiz_mixing = ‘2d_smagorinsky’ is set in the namelist. The 4th-order horizontal background filter is still enabled when using either of the LES schemes. The 4th-order filter can be controlled by setting config_visc4_2dsmag (default value 0.05). Setting this value to zero disables the 4th-order filter.
Other namelist parameters associated with the LES models include:
config_les_surface

- ‘none’ (default): no surface effect
- ‘specified’: use fixed values from the config_surface_* options below
- ‘varying’: use the surface heat flux, moisture flux, and friction velocity from the physics

config_surface_heat_flux = real (\(K\ m\ s^{-1}\)), 0.0 (default)
    \(w^\prime \theta^\prime\) at the surface

config_surface_moisture_flux = real (\(kg/kg\ m\ s^{-1}\)), 0.0 (default)
    \(w^\prime q^\prime\) at the surface

config_surface_drag_coefficient = real (unitless), 0.0 (default)
    \(C_d\) defined from the lowest-level \(V\) such that the surface stress is given as \(\rho \times C_d \times V^2\)
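As an illustrative namelist fragment (the non-default values below are hypothetical, and the enclosing namelist record should be taken from the model's default namelist.atmosphere):

```fortran
config_les_model = 'prognostic_1.5_order'   ! or '3d_smagorinsky'
config_visc4_2dsmag = 0.05                  ! set to 0.0 to disable the 4th-order filter
config_les_surface = 'specified'
config_surface_heat_flux = 0.1              ! K m/s; hypothetical value
config_surface_moisture_flux = 0.0
config_surface_drag_coefficient = 0.001     ! hypothetical value
```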
Model runs on GPUs¶
Beginning with MPAS v8.3.0, a GPU-enabled version of the MPAS-Atmosphere dynamical core has been available. The GPU-acceleration is enabled with OpenACC directives, and has been tested on several Nvidia GPU architectures (e.g., V100, A100, H100) with the NVHPC compiler. MPAS v8.4.0 continues the v8 GPU porting efforts, introducing optimizations to the data movements between host (CPU) and device (GPU) memory. New to v8.4.0, much of the host-to-device data movements are performed once per time step, before and after the dynamical core execution, rather than at each subroutine call within the dynamical core. Additionally, GPU-aware halo exchanges have been implemented for the dynamical core.
Presently, GPU execution is only available for the dynamical core, and not for the physics suites. Future efforts will focus on porting physics suites to GPUs as well.
To enable GPU-acceleration, the model must be built with OpenACC support by using the nvhpc toolchain and adding OPENACC=true to the build command:
> make nvhpc CORE=atmosphere OPENACC=true
The standard MPAS requirements from Chapter 3 still apply: MPI, netCDF, and parallel-netCDF are required, and all libraries should be built with a compatible compiler/MPI stack.
Runtime configuration on GPU nodes¶
Typical MPAS-Atmosphere model runs on GPU nodes use one MPI task per GPU. The mapping of MPI tasks to individual GPU devices is controlled by the environment variable CUDA_VISIBLE_DEVICES, which should be set to a comma-separated list of GPU device IDs corresponding to each MPI rank.
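For example, to make four GPUs visible to a job running four MPI ranks (the device IDs are illustrative, and many MPI launchers and site wrappers set this variable automatically):

```shell
# Expose GPUs 0-3 to the job; each MPI rank is then mapped to one device
export CUDA_VISIBLE_DEVICES=0,1,2,3
# then launch the model as usual, e.g.:
# mpiexec -n 4 ./atmosphere_model
```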
When running MPAS-Atmosphere on GPUs, the log files will also indicate the devices (GPUs) visible to the run, and the device driver in use:
OpenACC configuration:
Number of visible devices: 1
Device # for this MPI task: 0
Device vendor: NVIDIA
Device name: NVIDIA A100-SXM4-40GB
Device driver version: 13000
An additional consideration is binding MPI ranks to specific CPU cores for optimal performance. On systems with NUMA architectures, it is generally recommended to bind each MPI rank to CPU cores in a separate NUMA domain, since accessing memory across NUMA domains can degrade performance.
Using GPU-aware halo exchanges¶
MPAS v8.4.0 also includes support for GPU-aware halo exchanges, which can be enabled by setting config_gpu_aware_mpi to true in the model namelist. When this option is enabled, most halo exchanges occurring within the dynamical core execution will be performed directly from the device (GPU) memory, without needing to stage data through host (CPU) memory. Enabling GPU-aware halo exchanges typically improves performance by reducing data transfers between the host and device. This option requires that the MPI implementation used to build MPAS supports GPU-aware or GPU-direct communications.
Verifying and debugging GPU runs¶
In general, the results obtained from GPU model runs may not be bitwise identical to those obtained from CPU runs, largely due to differences in the order of operations, the use of fused multiply-add (FMA) operations, and differences in the implementation of intrinsic mathematical functions (e.g., sin, cos, etc.) on CPUs and GPUs. However, under certain conditions, it is possible to verify that NVHPC GPU runs produce bitwise-identical results to NVHPC CPU runs through the use of compile-time options. The build flag -Mnofma switches off the use of FMA operations, and the option -gpu=math_uniform ensures that the CPU algorithms for intrinsic functions are also used on the GPU. As these options can degrade performance, they are not recommended for production runs, but they can be useful for verifying that GPU and CPU runs produce identical results.
The NVHPC compiler provides run-time options for generating debugging information from OpenACC model runs, which can be useful for diagnosing issues with GPU runs. For example, setting the environment variable NVCOMPILER_ACC_NOTIFY=3 prior to launching the model generates detailed information about OpenACC kernel launches and host-device data movements.
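For example (a short test run is recommended, since the diagnostic output can be voluminous):

```shell
# Emit OpenACC kernel-launch and data-movement diagnostics to stderr
export NVCOMPILER_ACC_NOTIFY=3
# then launch the model as usual, e.g.:
# mpiexec -n 4 ./atmosphere_model
```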
Tips for running on GPU nodes¶
In many managed cluster environments, there may be convenience wrappers for launching GPU-enabled jobs that automatically set the CUDA_VISIBLE_DEVICES variable. For example, on the NCAR Derecho and Casper systems, the set_gpu_rank utility automatically maps MPI ranks to GPU devices. On Derecho, the MPAS-Atmosphere model can be launched on 4 GPUs with 1 MPI rank per GPU using the following command in a batch job script:
> mpiexec -n 4 -ppn 4 set_gpu_rank ./atmosphere_model
To additionally bind each MPI rank to a different NUMA domain on Derecho, the following command can be used in a batch job script:
> mpiexec -n 4 -ppn 4 --cpu-bind verbose,list:0:16:32:48 set_gpu_rank ./atmosphere_model