------------------------------------------------------------
NOTE: There is new material at the end of this file: 991217
------------------------------------------------------------
 
-------------------------------------------------------------

Prototype WRF (Weather Research and Forecast) model.  8 March 1999

First try, constructed by

W. Skamarock  (skamaroc@ncar.ucar.edu, (303) 497-8893)
J. Michalakes (michalak@ncar.ucar.edu, (303) 497-8199)
J. Dudhia     (dudhia@ncar.ucar.edu,   (303) 497-8950)
D. Gill       (gill@ucar.edu,          (303) 497-8162)
J. Klemp      (klemp@ucar.edu,         (303) 497-8902)


-> Outline

This document is organized into the following sections:

     -> Outline
     -> Preface -
     -> WRF Model Features needing consideration -
     -> Compiling and running:
     -> WRF model specifics -
     --> Nomenclature:
     --> Software architecture:
     --> WRF call tree:
     --> Data Registry:
     ---> Variables that are designated Part 3d, 2d, or 1d:
     ---> Variables that are designated Part 3d_chem or 3d_moist:
     ---> Variables that are designated Part '-':
     ---> Variables that have non '-' entries in the Decoup field:
     ---> Variables that have a non '-' entry for the Tendency field.
     --> Namelist organization:
     --> Parallelism
     -> Notes:

-> Preface -

In this directory structure sits the first prototype WRF model, written
by Skamarock, Michalakes, Dudhia, and Gill using a solution technique
developed by Klemp, Skamarock and Dudhia.

The WRF modeling system, outlined in the WRF model working group
reports, comprises preprocessors for data assimilation, model
initialization, etc., a model that integrates the equations of motion on
some grid or system of grids, and post-processors for examining the
model forecasts.  This prototype implements only the
integration engine (which we call the "model").  It is not a
full-fledged model.  Presently, its capabilities and limitations are:

  1) The solver within the model integrates the dry fully compressible
      equations of motion.

  2) Shared memory and distributed memory parallel processing
      capabilities, making use of domain-decomposition methods,
      have been implemented in this prototype.

  3) The model has the capability of integrating multiple nested
      grids, but the routines which will link the grids
      (essentially interpolation routines that pass data of various
      forms between the grids) have not yet been constructed.
      The data structure that stores the grid fields and the algorithms
      that control the integration sequence exist and are functioning
      in this model version.

  4) The model contains no physics, and uses periodic lateral boundary
      conditions and rigid free-slip surfaces at the bottom boundary
      and upper rigid lid.

  5) This model uses low-order numerics (2nd order centered time
      and space differencing) and a split-explicit time-integration
      scheme.

  6) The model does carry around (and advect/integrate) an arbitrary
      number of scalars representing moisture and an arbitrary
      number of scalars representing chemical species, the former for
      use in a moist model and the latter for use in air quality and
      atmospheric chemistry studies.

  7) With the exception of namelist input and some rudimentary output,
      the model is essentially I/O-less.  Initial conditions are
      idealized cases that are computed in the model itself; output
      consists of field dumps, either ascii dumps or gmeta (NCAR
      graphics) for the purposes of initial development and debugging.

This prototype is not meant to be used for forecasting or science at
this stage.  The prototype is being released for the purpose of allowing
all interested parties to examine the programming constructs and coding
style we have employed.  We welcome feedback from anyone who wishes to
comment on the prototype.  We would like to critically evaluate the
prototype at this early stage in model construction, with the goal of
incorporating improvements and removing problematic constructs, before
significant code is developed for more sophisticated models.

The data structure and the algorithms for managing multiple grids can be
considered to be fairly complete in most aspects (outside of the
routines that communicate data between the grids).  While many parts of
the discretization will definitely not be those employed in the WRF
model, the time-integration scheme represents a class of schemes which
may become the basis of the WRF model.  Hence the solver in the
prototype, in particular the top section where the time-integration of a
grid is controlled, should also be considered fairly complete and
representative of what might actually be used.

A major part of the new WRF code, documentation (both in-code and
offline) is not complete.  We would welcome comments on where more
in-line documentation is needed, and you are presently looking at almost
all of the offline documentation that exists.  Please understand that we
did not want to spend lots of time documenting code that would likely
be discarded later, and we wanted to get this initial prototype design
out to everyone as soon as we could.

--------------------------------

-> WRF Model Features needing consideration -

As noted above, this isn't meant to be the WRF model with respect to
the numerical discretization we've employed.  There are, however, many
aspects of the model architecture on which we are very interested in
receiving critiques.

  1) The use of Fortran 90/95.
  2) The use of cpp for some model configuration choices.
  3) The use of PERL for automatic generation of critical parts
      of the data structure using user-specified variable lists.
  4) The parallel processing implementation for shared and/or
      distributed memory parallelism.

We welcome comments on any other model features as well.  In particular,
we need to consider/decide on the index ordering within the model, and
I/O methods/formalism and file structures for the model gridded data.


---------------------------------

-> Compiling and running:

(((( this section was updated June 26, 2000 ))))

Compiling WRF

WRF is supported on a number of different platforms in single
processor, shared-memory parallel, distributed-memory parallel, and
hybrid parallel modes. Much of this flexibility is managed at compile
time. WRF is compiled under the control of the UNIX make utility, with
help from csh scripts and a perl script. The command to build WRF is
./compile in the top level WRF directory. Compile is a csh script that
first checks to see if the code has been configured. It looks for the
existence of a configure.wrf file, which contains architecture- and
build-specific compile and link time settings.  If (and only if) there
is not a configure.wrf file in the top level directory, the compile
script invokes the configure script. Configure, which can be invoked by
itself as ./configure, is a shell script whose main task is to invoke
the perl script in arch/Config.pl. However, the configure script first
checks to find perl on the system. If it can't find a perl executable
of version 5 or later, it will fail with an error message. If Perl 5+
does exist on the system but is not found automatically, set the shell
environment variable PERL to the absolute path of the perl command, and
the configure script will use that.

The arch/Config.pl perl script generates the configure.wrf file that is
included in the Makefile when WRF is made. It will present a list of
available build options to the terminal. The choices are stored in the
file arch/configure.defaults. The user chooses one of these, and the
configure.wrf file is generated. The user may edit the configure.wrf
file to fine tune the settings, but the file will be overwritten each
time the configure script is run. As long as there is a configure.wrf
file in the top level directory, the compile script will not invoke the
configure script, so changes to configure.wrf can be reused through
multiple invocations of the compile script. However, to make a change
permanent, it must be made to the arch/configure.defaults file.

When a configure.wrf file is found, compile invokes the make utility to
build the WRF model in the src directory. If the settings in
configure.wrf call for it, packages and libraries in the external
directory will also be built the first time WRF is built or after a
clean -a command. The first thing make does is to invoke the Registry by
calling perl to run the script tools/use_registry on the Registry file,
located in the Registry directory.  Then the WRF source files are
compiled and linked in the src directory, including files from the inc
directory and, if specified, linking to packages in the external
directory. The resulting executable is wrf.exe in the run directory,
along with a file, namelist.default, which is generated by the Registry
from the rconfig table information in the file Registry/Registry (for
more information on the Registry see Section 9).  Currently (June 2000)
the compile script also builds a program, ideal.exe, which is used to
generate input for the wrf.exe program. Namelist.default should be
copied to the file the WRF model reads:  namelist.input.
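The Registry idea -- generating code and defaults from a single table of field descriptions -- can be sketched very simply. The fragment below is a toy illustration only: the entry layout and the group name namelist_01 are invented here, not the real Registry/Registry syntax.

```python
# Toy sketch of the Registry mechanism described above: a table of
# rconfig-like entries (type, name, default) is turned into the text of
# a Fortran namelist file such as namelist.default.  The entry format
# and group name are invented for illustration.

entries = [
    ("integer", "time_step_max", "100"),
    ("integer", "io_form",       "1"),
    ("real",    "dx",            "2000."),
]

def write_namelist_default(entries, group="namelist_01"):
    """Render the entries as a Fortran namelist group."""
    lines = [" &%s" % group]
    lines += [" %s = %s" % (name, default) for _, name, default in entries]
    lines.append(" /")
    return "\n".join(lines)

text = write_namelist_default(entries)
```

The real Registry perl script emits Fortran include files as well as the namelist defaults, but the table-driven generation step has this shape.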

A clean script is also provided.  The command ./clean will clean up
the .o files and other temporaries in the src directory. The
command ./clean -a will also delete the configure.wrf file and will
also clean up libraries and other packages built in the external
directory.

Running WRF

Some of the details for running WRF are architecture dependent. In
general, however, for single-processor or shared memory parallel runs,
change directories to the run directory and type ./wrf.exe in the run
directory.  Some shared memory parallel platforms require the
environment variable OMP_NUM_THREADS to be set to the number of shared
memory threads to be used. In addition, for shared memory also set the
environment variable WRF_NUM_TILES  to $OMP_NUM_THREADS (this is a
temporary requirement). For distributed memory runs, it is usually
necessary to use some form of the mpirun command; for example:

mpirun -np 4 wrf.exe

however, additional options may be necessary for your particular
system.  Some systems may have different commands entirely for running
parallel jobs (e.g. dmpirun on Compaq systems or poe on IBM SP
systems), and there may also be batch scheduling issues to consider.
Consult your system documentation or administration staff.

(Temporary instructions) To run WRF it is necessary to first generate a
set of initial conditions which will be read in from the file wrfinput.
Set the io_form option in the namelist.input file to '1' and set other
options such as grid dimensions, dx, dt, etc. and then type ./ideal.exe
in the run directory. This will generate the file wrfinput. As long as
the basic grid specifications are not changed in the namelist.input
file, you can continue to reuse the wrfinput file for multiple runs of
the wrf.exe code. If basic options change, it will be necessary to
rerun ideal.exe.

(Temporary instructions) Edit the namelist.input file to set such
run-specific items as number of timesteps, output frequency, etc. and
then run the wrf.exe code using the procedure for your system (see
above). WRF will input the namelist.input file and also the wrfinput
file of initial conditions. As it runs, it will write its output to the
file wrfoutput. Output is in MM5v3 format. This is temporary;
eventually, the namelist io_form  option will allow specification of
other formats. For single processor and shared-memory parallel runs,
standard output will come to the terminal. For distributed-memory
parallel runs (assuming RSL is used as the communication layer),
standard output will be written to files named rsl.out.xxxx where xxxx
is the 4-digit processor number. Standard error will be written to
rsl.error.xxxx. Currently (June 2000) only RSL communication is
implemented in WRF but this is temporary. RSL is provided with the WRF
code in the external/RSL directory.


---------------------------------

-> WRF model specifics -

--> Nomenclature:

   Solver   Code called to integrate a domain forward through one time step

   Patch    A subset of a model domain allocated to a single address space.
            This is the entire domain unless distributed-memory parallelism
            is employed.

   Tile     A subset of a patch executed by one thread.  This domain
            subset is bounded spatially and temporally by the amount of
            computation that may occur without concern for coherency
            before horizontal dependencies arise.  No assumptions may
            be made inside the tile code about the order in which tiles
            will execute.

   Process  An operating system construct: one address space,
            containing one or more threads of execution.  A patch is
            computed by a process.

   Thread   A series of instructions through a code (plus a small
            amount of local data -- stack and thread-local heap
            storage; only the former is used in WRF).  Multiple threads
            can exist through a single block of code -- say, a
            parallel DO loop, and they can all access shared data
            within the process.   In WRF, a thread is what executes a
            tile.  There may be more threads than tiles or more tiles
            than threads.  In the former situation, some threads will
            be idle; in the latter, each thread will execute several
            tiles in sequence.

   State    Data that persists over the life of a domain (across
            multiple calls to the solver).  May be communicated between
            patches and shared between threads.

   I1       Data that persists over the life of a single call to the
            solver but which need not persist across calls to the
            solver.  Tendency data would be an example.  May be
            communicated between patches and shared between threads.

   I2       Data that persists only for the life of a single call to a
            package within the solver.  Strictly "local" data.  May not
            be communicated nor shared between threads.

   Logical domain dimensions

            This is the logical size of the model domain, exclusive of
            extra points for any physical boundary region (e.g.
            periodic).  In a model subroutine, these come in through
            the arguments: ids, ide (west/east start and end of
            domain); jds, jde (south/north start and end of domain);
            and kds, kde (vertical start and end of domain).  In the
            case of staggering, the logical domain dimensions are large
            enough to hold the largest dimension of any field in that
            dimension.   In the subroutine, these may be used for
            boundary tests or other instances where the position of an
            index relative to the global domain is needed.  These
            should not be used in loop ranges except within max/min
            intrinsics for boundary testing, nor should they be used to
            dimension arguments or memory in the subroutine.  Starts
            and ends of dimensions are expressed in global
            coordinates.  In the current implementation, i runs west to
            east with ascending indices; j runs south to north with
            ascending indices; and k runs from bottom to top with
            ascending indices.

   Patch domain dimensions

            These are the dimensions of the logical patch -- the set of
            points that are assigned to be computed locally on this
            process, not including halo or boundary regions.  Patch
            dimensions are thus distinct from memory dimensions (below)
            in that memory dimensions must be greater than or equal to
            patch dimensions.  Patch dimensions are available at the
            solver level but are generally not passed to model
            subroutines in this version of the prototype.  The
            dimensions ips, ipe; jps, jpe; and kps, kpe may be
            passed into routines at the author's discretion.  They may
            be used for testing for patch boundaries; they should not
            be used as loop ranges.  Starts and ends of dimensions are
            expressed in global coordinates.

   Memory domain dimensions

            This is the memory size of the model domain inclusive of
            all additional memory in each dimension (for halos,
            boundaries, paddings for cache alignment, etc.).   In a
            model subroutine, these come in through the arguments: ims,
            ime (west/east start and end of domain); jms, jme
            (south/north start and end of domain); and kms, kme
            (vertical start and end of domain).  Starts and ends of
            dimensions are expressed in global coordinates.  In the
            subroutine, these are used for dimensioning argument
            arrays, local arrays, etc.  Use of these for any other
            purpose is strongly discouraged, since the subroutine may
            make no assumptions about the actual size of the memory
            other than that it is sufficiently large.

   Tile domain dimensions

            This is the computational extent of the model domain within
            a model subroutine.  It includes only the cells that are to
            be computed in that call to the subroutine.  Though this is
            to be avoided, the subroutine may, at the author's
            discretion, choose to compute outside this range, but the
            dimensions passed into the routine are those for which the
            routine is responsible for computing.  These come into the
            subroutine through the arguments its, ite (west/east start
            and end of domain); jts, jte (south/north start and end of
            domain); and kts, kte (vertical start and end of domain).
            Starts and ends of dimensions are expressed in global
            coordinates.

   Halo

            Extra memory allocated solely for the purpose of
            interprocessor edge data exchanges between patches if
            multiple patches exist.  The memory domain dimensions
            must be larger than the patch dimensions by at least
            the amount necessary to allow for halos on one or
            both sides of the patch in that dimension.

   Physical Boundary Region

            Points that extend the logical domain boundaries by a
            sufficient amount to permit certain boundary conditions:
            symmetry, periodic.  In the case of patches that have at
            least one side on the edge of a logical domain, the memory
            domain dimensions must be larger than the patch dimensions
            by at least the amount necessary to allow for a physical
            boundary region.
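The conventions above prescribe distinct roles for each dimension set inside a model-layer routine: memory dimensions size the arrays, tile dimensions bound the loops, and logical domain dimensions appear only inside max/min intrinsics for boundary clipping. A minimal sketch of that discipline, reduced to the i dimension (Python stand-in, illustrative only -- the routine and values here are not WRF code):

```python
# Illustrative sketch (not WRF code) of the dimension discipline described
# above, reduced to the i dimension.  Storage is sized by the memory dims
# (ims:ime), loops run over the tile dims (its:ite), and the domain dims
# (ids:ide) appear only inside max/min for boundary clipping.

def tile_update(field, ids, ide, ims, ime, its, ite):
    """Add 1.0 to the tile's interior points of a field stored over ims:ime."""
    # Clip the tile range so the first and last logical-domain points are
    # skipped -- the only sanctioned use of ids/ide in a model routine.
    i_start = max(its, ids + 1)
    i_end   = min(ite, ide - 1)
    for i in range(i_start, i_end + 1):
        field[i - ims] += 1.0   # shift: storage index 0 corresponds to i = ims

# One patch covering the whole domain, one tile covering the whole patch:
ids, ide = 1, 8          # logical domain
ims, ime = -1, 10        # memory dims pad the domain by 2 on each side
f = [0.0] * (ime - ims + 1)
tile_update(f, ids, ide, ims, ime, its=ids, ite=ide)
```

After this call, only interior points i = 2..7 are touched; the clipped domain edges and the halo/boundary storage stay unchanged.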

--> Software architecture:

The model calling structure is organized both horizontally and
vertically.  As described elsewhere [1], the horizontal organization is
into layers: the driver layer, the model layer, and a mediation layer.
The horizontal stratification of the model architecture provides
separation of concerns between the driver layer -- encapsulating
hardware architecture, shared- and distributed-memory parallelism, input
(namelist, initial data, etc.), and output concerns -- and the model
layer, which consists of strictly computational routines of the code,
written to be callable for a single "tile".  Between the driver and the
model layers sits the mediation layer, the "glue" layer which knows
something about both the other layers.  Obviously we try to keep this
layer as thin as possible.

The vertical organization of the call tree is into sections by broad
function:  basic initialization, configuration, initialization, time
integration, and model output.  In schematic, the overall organization
of WRF is as follows:

              basic
              init.    config.    init.    integ.    i/o

            +=======+==========+=========+========+========+
            |       |          |         |        |        |
 Driver     |       |          |         |        |        |
            |       |          |         |        |        |
            |       |          |         |        |        |
            +=======+==========+=========+========+========+
 Mediation  |       |          |         |        |        |
            +=======+==========+=========+========+========+
            |       |          |         |        |        |
            |       |          |         |        |        |
 Model      |       |          |         |        |        |
            |   .   |    .     |    .    |   .    |    .   |
                .        .          .        .         .
                .        .          .        .         .

What this diagram conveys is that each section of the model will enlist
driver layer, model layer, and mediation layer components to perform
its function.  (A fair amount of effort has been invested in making
this WRF prototype code *actually adhere* to this conceptual
organization, though there are sure to be occasional divergences
in this early version.)  One key motivation for this
organization is reuse:  the model layer may be interfaced to other
architecture- or application-specific driver layers, including the use
of computational frameworks.  Likewise, the driver framework should be
reusable with other model layers.  Limiting information about
parallelism, for example, to the driver and mediation layers allows
the investment in model-layer Fortran routines to be preserved and
exploited efficiently on a range of diverse computer architectures.

Although some effort has been made to provide flexibility in the choice
of array index order in the driver layer, this has been essentially
fixed to k-innermost, i, then j-outermost in the model layer.  The
reason for this is that there are no constructs or mechanisms in the
Fortran language itself that allow for reordering these without
recoding.  A suitably engineered source translation engine could handle
this easily as a pre-processing step prior to compilation.
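The source-translation idea mentioned above can be sketched in a few lines. The fragment below is a toy illustration only -- it rewrites the index order of a DIMENSION declaration, whereas a real translator would also have to permute every subscript and loop nest that touches the array:

```python
import re

# Toy sketch of the pre-processing idea mentioned above: a source
# translator that permutes the index order of Fortran dimension
# declarations.  Only the declaration is handled here, to show the flavor.

def reorder_dims(decl, order=(1, 2, 0)):
    """Rewrite 'DIMENSION ( a,b,c )' by permuting the three extents."""
    def permute(match):
        dims = [d.strip() for d in match.group(1).split(',')]
        return 'DIMENSION ( %s )' % ','.join(dims[i] for i in order)
    return re.sub(r'DIMENSION\s*\(\s*([^)]*)\)', permute, decl)

line = 'REAL , DIMENSION ( kms:kme,ims:ime,jms:jme ) :: u'
reordered = reorder_dims(line)   # (k,i,j) storage rewritten as (i,j,k)
```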

--> WRF call tree:

wrf() [wrf.F ; Driver ]
 |
 +-- init_modules()    [ init_modules.F ; Driver/basic-init ]
 |
 |      Each module <module_name> that is "used" by wrf provides a
 |      routine in its CONTAINS section called init_<module_name>.
 |      This routine calls all of these.  If one adds a new module to
 |      WRF, such a routine should be provided in the module (may be a
 |      stub) and a call statement inserted in this routine.
 |
 +-- initial_config()  [ module_configure.F ; Driver/config ]
 |   |
 |   +-- read_namelist_data()   [ module_configure.F ; Mediation/config ]
 |
 |      Takes as input a fortran unit number and an empty instance of 
 |      a model_config_rec_type and returns the structure populated with
 |      model configuration information.
 |                                
 |        [ Registry interfaces (see Sec 4 in src/use_registry): ]
 |        [                                                      ]
 |        [ #include <state_namelist_defines.inc>                ]
 |        [ #include <state_namelist_statements.inc>             ]
 |        [ #include <state_namelist_reads.inc>                  ]
 |        [ #include <state_namelist_assigns.inc>                ]
 |                                
 +-- alloc_and_configure_domain  [ module_domain.F ; Driver/config ]
 |   |                            
 |   |  Given a domain id, a model_config_rec_type structure, a pointer to
 |   |  a grid and parent domain type (if applicable), configure and allocate
 |   |  space for this new domain and hook it into the nest hierarchy.
 |   |                            
 |   |    [ Registry interfaces (see Sec 4 in src/use_registry): ]
 |   |    [                                                      ]
 |   |    [ # include <config_namelist_01.inc>                   ]
 |   |    [ # include <config_namelist_02.inc>                   ]
 |   |    [ # include <config_namelist_03.inc>                   ]
 |   |    [ # include <config_namelist_04.inc>                   ]
 |   |                            
 |   +-- patch_domain  [ module_domain.F or elsewhere ; Driver/config ]
 |   |                            
 |   |  Given a domain id, a parent id (if avail), the logical domain
 |   |  dimensions, and the width of the physical boundary region around
 |   |  the logical domain, calculate a decomposition and return the
 |   |  memory domain dimensions and patch domain dimensions necessary
 |   |  on each processor.
 |   |                            
 |   |  The version of this routine in module_domain.F is for the
 |   |  trivial case where there is only one patch and so, there is
 |   |  effectively no decomposition.  It merely sets the patch dimensions
 |   |  to the same size as the logical domain dimensions and then adjusts
 |   |  the memory dimensions to be large enough to contain any physical
 |   |  boundaries that may have been specified via the namelist.
 |   |                            
 |   |  As distributed memory capabilities are added to WRF, they go
 |   |  into source files named module_dm.F and this module should supply
 |   |  a routine to replace the patch_domain routine defined here.  For an
 |   |  example, see the existing module_dm.F.
 |   |                            
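The trivial single-patch behavior of patch_domain described above can be sketched in a few lines (Python stand-in, illustrative only -- the real routine is Fortran in module_domain.F):

```python
# Illustrative sketch (not the module_domain.F code) of the trivial
# single-patch patch_domain described above: patch dims coincide with the
# logical domain dims, and memory dims pad the patch just enough to hold
# a physical boundary region of width bdy on each side.

def patch_domain_trivial(ids, ide, jds, jde, bdy):
    ips, ipe, jps, jpe = ids, ide, jds, jde   # one patch == whole domain
    ims, ime = ips - bdy, ipe + bdy           # pad memory for boundaries
    jms, jme = jps - bdy, jpe + bdy
    return (ips, ipe, jps, jpe), (ims, ime, jms, jme)

patch, mem = patch_domain_trivial(ids=1, ide=40, jds=1, jde=30, bdy=2)
```

A distributed-memory replacement (as in module_dm.F) would instead split the logical domain into one patch per processor before computing the memory dimensions.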
 |   +-- alloc_space_field()   [ module_domain.F ; Model/config ]
 |   |                            
 |   |  Given a pointer to a new domain data structure of TYPE(domain)
 |   |  and the memory domain dimensions, allocate 2-, 3-, and 4-dimensional
 |   |  data structures.  When grid is returned, it's fully allocated.
 |   |                            
 |   +-- define_comms()  [ module_dm.F ; Driver/config <DM_PARALLEL ONLY>]
 |                                
 |      DM_PARALLEL only.  This is a call to a routine in the module_dm.F
 |      library that can be used to set up communication on domain state
 |      data fields, now that they are allocated.
 |                                
 +-- init_domain     [ module_initialize.F ; Mediation/init ]
 |   |                            
 |   |  Given a grid data structure, initialize the fields therein.  This
 |   |  will eventually involve I/O but the WRF prototype initializes from
 |   |  idealized cases that are computed here and in subroutines.
 |   |                            
 |   |  DM_PARALLEL only.  Calls to routines in module_dm.F may occur
 |   |  here to update halos and periodic boundary conditions when
 |   |  domains are decomposed over multiple patches.  The routines
 |   |  called now are wrf_dm_halo() and wrf_dm_boundary(), defined in
 |   |  module_dm.F.  Here and other places requiring communication,
 |   |  comments are included showing the fields that need to be
 |   |  communicated and the stencils on which data need to be
 |   |  exchanged.   These are informational only.
 |   |                            
 |   +-- init_coords()     [ module_initialize.F ; Model/init ]
 |   +-- init_base_state() [ module_initialize.F ; Model/init ]
 |   +-- init_state()      [ module_initialize.F ; Model/init ]
 |   +-- set_physical_bc() [ module_bc.F ; Model/init,integrate ]
 |                                
 |      This is a routine that updates boundary conditions (periodic,
 |      symmetric) if the boundaries can be updated without
 |      communication.  Otherwise the wrf_dm_boundary() routine is
 |      assumed to have updated the boundary region.
 |                                
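For the communication-free case, a periodic update of the kind set_physical_bc performs amounts to copying interior points into the physical boundary region. A 1-d sketch (Python stand-in, illustrative only -- not the module_bc.F code):

```python
# Illustrative 1-d sketch (not the module_bc.F code) of a periodic
# boundary update that needs no communication: boundary-region points
# outside the logical domain ids:ide are filled by wrapping indices
# around the domain.  Storage covers the memory range ims:ime.

def periodic_bc_1d(field, ids, ide, ims, ime):
    n = ide - ids + 1                    # logical domain length
    for i in range(ims, ids):            # west boundary region
        field[i - ims] = field[i + n - ims]
    for i in range(ide + 1, ime + 1):    # east boundary region
        field[i - ims] = field[i - n - ims]

f = [9.0, 1.0, 2.0, 3.0, 4.0, 9.0]       # i = 0..5; interior is i = 1..4
periodic_bc_1d(f, ids=1, ide=4, ims=0, ime=5)
```

When a domain is decomposed over multiple patches, the same fill would instead require the wrf_dm_boundary() exchange described above.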
 +-- integrate       [ module_integrate.F ; Driver/integrate ]  << RECURSIVE >>
     |                            
     |  This routine takes a grid, a starting and an ending time value and
     |  updates grid and all of its children from the starting time to the
     |  ending time.  It handles nests (as children of grid) and also
     |  overlapping domains on the same nest level (as siblings of grid).
     |                            
     +-- solve_interface    [ solve_interface.F ; Mediation/integrate ]
     .   |                        
     .   |  This call takes a single argument, grid, which contains all
     .   |  the state data for a single domain, and when
     .   |  solve_interface returns, the grid has been updated one time
     .   |  step.  From this point down in the call tree, the scope of
     .   |  the computation is a single domain.  Solve-interface's main
     .   |  purpose is to be the dereferencing agent for grid, so that
     .   |  above this routine in the call  tree, the domain is known
     .   |  only as a single name.  Below this routine, in solve, all
     .   |  data associated with the grid is passed through argument
     .   |  lists.   The reason for this is partly performance: it
     .   |  limits the cost of dereferencing the grid data structure
     .   |  to once per time step.  Secondarily, this allows the solver
     .   |  to be written to pass data exclusively through argument
     .   |  lists, so that other drivers can be interchanged, whether
     .   |  or not the new driver represents the domain as a single F90
     .   |  structure (computational frameworks, for example, may
     .   |  instead represent the grid as multiple distributed
     .   |  arrays).  At the level below solve interface, it is not a
     .   |  concern how the driver is storing the state data for a
     .   |  domain.
     .   |                        
     .   |  The dereferencing and the tedious and error prone listing
     .   |  of the arguments and their fields in the grid data structure
     .   |  is automated using the Registry mechanism.  Thus, if the
     .   |  set of state fields in the model changes, it is not necessary
     .   |  to reconstruct the dereferencing or actual argument lists here
     .   |  or below in solve, where the dummy arguments are also listed
     .   |  using a Registry-generated file.
     .   |                        
     .   |   [ Registry interfaces (see Sec 2 in src/use_registry): ]
     .   |   [                                                      ]
     .   |   [ #include "solve_actual_args.inc"                     ]
     .   |                        
     .   |  This routine also increments the time counter for the
     .   |  grid.
     .   |                        
     .   |  This routine includes an interface block definition of the
     .   |  solver that it calls, called solve.int.
     .   |                        
     .   +-- solve          [ solve.F ; Mediation/integrate ]
     .                            
     .      The top level flow of control over a single time step is
     .      here.
     .                            
     .      This routine is the principle routine of the mediation
     .      layer, containing some information that is of concern to
     .      the driver layer and some information that is of
     .      concern to the model.  Everything below this layer of the
     .      code is considered Model.
     .                            
     .      There are no communication or threading calls or directives
     .      below the solve subroutine.
     .                            
     .      All state data is passed to solve through its argument list.
     .                            
     .       [ Registry interfaces (see Sec 2 in src/use_registry): ]
     .       [                                                      ]
     .       [ #include "solve_dummy_args.inc"                      ]
     .       [ #include "solve_dummy_arg_defines.inc"               ]
     .                            
     .      I1 data (tendencies, decoupled versions of prognostic
     .      variables) are defined in solve and allocated on the
     .      program stack.  This avoids the need to have more than a
     .      single version of these occupying memory at a time, even
     .      when multiple domains are being simulated.
     .                            
      .      Virtually without exception, no computation is done in
      .      solve; rather, it determines the tiling of the domain with
      .      a call to get_tiles(), and then every section of the
      .      underlying model is called within loops over tiles, and
      .      these loops may be multi-threaded using PARALLEL DO
      .      directives.  Put another way, anything that is called from
      .      solve is called for a tile (the tile may be the entire
      .      patch, or it may be subdivided into smaller units that can
      .      be assigned to multiple threads).  The routines below solve
      .      are written to be callable for a single tile and by
      .      definition do not involve horizontal data dependency.
      .      Whenever the computation reaches a point where horizontal
      .      data dependency becomes an issue, control reverts to the
      .      solve level -- any communication that needs to occur is
      .      performed using calls to wrf_dm_halo and/or wrf_dm_boundary
      .      and then the next possibly multi-threaded loop over tiles
      .      is initiated.
     .      
     .      The standard interface (formatting and style is not
     .      addressed here) to a Model subroutine called from this
     .      layer and below is:
     .                            
     .        SUBROUTINE sub  ( arg1, arg2,  ... argn,     &
     .                ids , ide , jds , jde , kds , kde ,  & ! domain dims
     .                ims , ime , jms , jme , kms , kme ,  & ! memory dims
     .   optional-> [ ips , ipe , jps , jpe , kps , kpe ,  & ! patch dims ]
     .                its , ite , jts , jte , kts , kte )    ! tile dims
     .        IMPLICIT NONE
     .        ! 3D State arrays
     .        REAL , DIMENSION ( kms:kme,ims:ime,jms:jme ) :: arg, arg, ..
     .        ! 2D State arrays
     .        REAL , DIMENSION ( ims:ime,jms:jme ) :: arg, arg, ..
     .         ...
     .        DO j = jts , jte        ! Loop ranges use tile dimensions
     .          DO i = its , ite
     .            DO k = kts , kte
     .              ...
     .                            
     .      Currently in the WRF prototype, the TYPE(domain) "grid"
     .      argument is also passed, but this may be eliminated.
     .                            
                                  
     R-- Integrate() calls itself for subdomains (nests) and other
         domains on same nest level.
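
As an illustration of the tiling scheme described for solve() above,
here is a conceptual sketch (in Python, purely for exposition -- the
actual get_tiles() is part of the Fortran code and its interface may
differ) of splitting a patch's j-range into roughly equal tiles:

```python
# Hypothetical sketch, NOT the real get_tiles(): divide the patch's
# j-range [jps, jpe] into num_tiles contiguous (jts, jte) pairs so
# each tile can be handed to a separate thread.
def get_tiles(jps, jpe, num_tiles):
    span = jpe - jps + 1
    base, extra = divmod(span, num_tiles)   # spread the remainder
    tiles, start = [], jps
    for t in range(num_tiles):
        size = base + (1 if t < extra else 0)
        tiles.append((start, start + size - 1))
        start += size
    return tiles

# Each (jts, jte) pair bounds one tile's loop in a model routine.
print(get_tiles(1, 10, 3))   # [(1, 4), (5, 7), (8, 10)]
```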

--> Data Registry:

The Data Registry is an additional component of the WRF software
architecture, which perhaps does not fall easily into the
Driver/Mediation/Model taxonomy, but which provides a mechanism for
abstracting information about model data in table form, and then
propagating this information across numerous modules and interfaces
between modules in the code so that, for example, adding or deleting a
state variable or modifying the namelist semantics is a one-line change
to the Registry table and a recompile.  Currently, the WRF Registry is
implemented using an ASCII table named "Registry", a PERL script, named
"use_registry", and the CPP preprocessor.  The use_registry script
in the src directory contains detailed documentation of what files
are generated from the Registry table, and where these are used in
the code.  What follows here is a description of the form of a Registry
table entry and what this means for the code at-large.

A registry entry is a line in the registry table; each field
(currently there are ten) is separated by white space (one or more
spaces or tabs).  A sample registry entry, which adds the state
variable for the u-component of winds, is:

   state  real  ru  3d  2  X  u  ru_tend  init_domain

Note that the tenth field, a default value used for namelist entries,
is missing for ru, a prognostic variable.  The fields of a state entry
are:

     Field     Allowed values       Other information

1.   Code      state                This flags the registry entry
                                    as an element of the state data
                                    table.  Currently, the state data
                                    table is the only table represented
                                    in the Registry file.

2.   Type      real, integer, ...   The type of the state data variable

3.   Sym       <Fortran ident>      The name of the variable (ru in
                                    the example)

4.   Part      3d                   What part of the TYPE(domain) this
               3d_moist             variable occupies.
               3d_chem                   
               2d                   
               1d                   
               -

5.   NumTLev   1, 2, 3, or -        Number of time levels for variable.

6.   Stagger   X, Y, Z, or -        Staggering dimension or - if not 
                                    staggered.

7.   Decoup    <Fortran ident>      The name(s) to use for decoupled
               or -                 forms of the variable, if applicable.

8.   Tend      <Fortran ident>      The name to use for the tendency for
               or 'yes'             this variable, or, in the case of
               or -                 moisture and chemistry fields, whether
                                    to have a tendency array.

9.   Whereinit <routine name>       Where the variable is first initialized.
               or <namelist record> In the case of namelist variables,
               or namelist_derived  the namelist record identifier is specified.
                                    In the case of configuration variables
                                    whose value is derived from the namelist
                                    but which do not appear in the namelist
                                    themselves, the string namelist_derived
                                    is specified.

10.  Default   initial value        Applicable only to namelist variables
                                    that actually appear in the namelist.
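
For illustration, here is a sketch (in Python, not the actual
use_registry PERL script) of splitting the sample entry above into the
ten named fields:

```python
# Sketch of parsing a Registry state entry into named fields.  The
# field names follow the table above; this only illustrates the
# whitespace-separated format, not the use_registry internals.
FIELDS = ["code", "type", "sym", "part", "numtlev",
          "stagger", "decoup", "tend", "whereinit", "default"]

def parse_state_entry(line):
    tokens = line.split()                 # one or more spaces or tabs
    entry = dict(zip(FIELDS, tokens))
    entry.setdefault("default", None)     # tenth field may be absent
    return entry

entry = parse_state_entry("state  real  ru  3d  2  X  u  ru_tend  init_domain")
print(entry["sym"], entry["part"], entry["tend"])   # ru 3d ru_tend
```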

The following is additional discussion of the Registry entry
fields listed above.

---> Variables that are designated Part 3d, 2d, or 1d:

If the Registry "part" name for a variable is 3d, 2d, or 1d, then it is
stored in a field named "data_<part>" in the TYPE(domain).  The
registry mechanism also generates a defined integer constant whose name
is:

     ENUM_<part name>_<Symbol>[_<Timelevel>].

These constants may be referenced in any subroutine that USEes
module_state_description.  For example, a reference to the
first time level of variable RU is:

      grid%data_3d(ENUM_3D_RU_1)%data

For part 3d, 2d, and 1d, variables that have only 1 time level,
the reference will be (using the 3D field 'PIB' for example):

      grid%data_3d(ENUM_3D_PIB)%data

It is rarely necessary for the programmer to be concerned with these
structures, since they are employed only above the Model layer of the
code.   At the interface between the Driver layer and the Model layer,
these references are automatically generated and they appear only in
the routine solve_interface().  In the solve() routine and below, the
variables are passed explicitly through argument lists and are known by
their Registry symbol name, with the time level appended  in the case
of multiple time level variables.

RU, which has 2 time levels, is known simply as  RU_1 and RU_2 in 
solve() and below.  Variables that have only 1 time level, such as
PIB, are known simply by their Registry symbol name in solve and below.
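
A sketch (Python, for exposition only -- the real constants are
generated by the Registry mechanism) of how the
ENUM_<part name>_<Symbol>[_<Timelevel>] names described above are
formed:

```python
# Sketch: form the Registry-generated defined-constant names from a
# state entry's Part, Sym, and NumTLev fields, as described above.
def enum_names(part, sym, ntlev):
    base = f"ENUM_{part.upper()}_{sym.upper()}"
    if ntlev == 1:
        return [base]                               # single time level: no suffix
    return [f"{base}_{t}" for t in range(1, ntlev + 1)]

print(enum_names("3d", "ru", 2))    # ['ENUM_3D_RU_1', 'ENUM_3D_RU_2']
print(enum_names("3d", "pib", 1))   # ['ENUM_3D_PIB']
```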
        
---> Variables that are designated Part 3d_chem or 3d_moist:
       
Moisture and chemistry fields are treated differently because there are
many operations on these that can be performed en masse.   They appear
in the model as four-D arrays, and the fourth dimension is "species".
There are separate four-D arrays for each time level.

If the Registry "part" name is 3d_moist, the field is stored as species
in the field data_3d_moisture_<timelevel> where timelevel is 1, 2, or 3
(See note (b) for discussion of time-levels).  Likewise, if the
Registry part name is 3d_chem, the array is a species in the field
data_3d_chemistry_<timelevel>.  The index into the four-d moisture or
chemistry field array is given a defined constant with the name:

      ENUM_<part name>_<Symbol>

Again, these constants may be referenced in any subroutine that USEes
module_state_description.  Since there are separate four-D arrays for
each time level, the time level is not encoded into the name of the
defined constant.  Thus, a reference to the first time level of
moisture field Q0 is:

      grid%data_3d_moisture_1(:,:,:,ENUM_3D_MOIST_Q0)

above solve() and just:

      moist(k,i,j,ENUM_3D_MOIST_Q0)
        
within solve() and below.

---> Variables that are designated Part '-':

If the Registry "part" name is '-', then the variable is stored
as itself in the field.  For example, the zero-dimensional variable
named ZETATOP is known as grid%zetatop in solve_interface() and above
and simply as ZETATOP in solve() and below.

---> Variables that have non '-' entries in the Decoup field:

RU in WRF is Rho-U.  But it is also used in the solver as just U,
decoupled from Rho (prognosed density).   Rather than decouple the
variable each time this form is needed, I1 storage is declared in Solve
for just U.  This is done automatically through the Registry.  The
'Decoup' field of the Registry entry for RU specifies the name of the
decoupled form, which causes solve() to include a local declaration of
the three-D variables U_1 and U_2, for each time level.  If only one
time-level is specified, then the _1 suffix is omitted from the
variable name.  Multiple decoupled variables are permitted for each
Registry symbol.  For example, in the current src/Registry table,
the variable RTP has three decoupled forms: TP, T, and RT.  Declarations
are automatically generated for TP_1, TP_2, T_1, T_2, RT_1, and RT_2.

Currently, one cannot specify decoupled I1 variables
for part 3d_moist and part 3d_chem variables.

---> Variables that have a non '-' entry for the Tendency field.

Tendency arrays have the same dimensionality as the variables to which
they apply.  There is only ever one time level.  If a tendency
variable is specified for a part 3d, part 2d, or part 1d variable,
a local declaration of the tendency array is generated in solve().
For example, the Registry entry for RU specifies a tendency array
named RU_TEND.  A declaration for this is automatically generated
and RU_TEND appears as a local array in solve().

In the case of part 3d_moist and 3d_chem variables, the tendency array
is also four dimensional.  Thus, if the Registry entry for any of these
specifies a tendency array, they all get one.  These four-D tendency
arrays appear as local arrays in solve() and they have the names
moist_tends and chem_tends, respectively.  The index into the tendency
array used for a particular species is the same defined constant
as is used for accessing the variable.  For example, the tendency
for moisture variable Q0 is:

     moist_tends(:,:,:,ENUM_3D_MOIST_Q0)

--> Namelist organization:

There are four records in the namelist: namelist_01, namelist_02,
namelist_03, and namelist_04.  The first record, namelist_01, is for
model-wide settings and the elements of namelist_01 are
zero-dimensional (e.g. time_step_max, the number of coarse domain time
steps to execute for the simulation).  The elements of the remaining
namelist records contain per-domain settings.  Each element of these
remaining records is a 1-dimensional array of size 'max_dom' (max_dom
is a model-wide setting read in as part of namelist_01).  Each element
of the 1-D namelist_02, namelist_03, and namelist_04 records is a
setting for a given domain.

Namelist elements are considered state and are listed in the Registry
file.  The "Whereinit" field for a namelist element is either
namelist_01 (in which case it is considered a zero-dimensional,
model-wide setting) or namelist_02, namelist_03, or namelist_04, in
which case the namelist element is a 1-dimensional vector of max_dom
elements.  Configuration data items that are not in the namelist itself
but that are derived from namelist variables (and passed around in the
model within the model_config_rec_type configuration record) should
also be listed in the Registry file, with a "Whereinit" field of
namelist_derived.  These namelist_derived fields will be included in
the model_config_rec_type'd configuration record but won't be expected
(or allowed) in the namelist input file itself.  A reasonable place to
generate derived namelist variables from input namelist variables is
within or below the routine read_namelist_data.
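
A skeletal namelist.input along these lines might look as follows.
Only time_step_max and max_dom come from the description above; the
per-domain element time_step is purely illustrative:

```
 &namelist_01
  time_step_max = 100,     ! model-wide: coarse-domain steps to run
  max_dom       = 2        ! model-wide: number of domains
 /
 &namelist_02
  time_step     = 90, 30   ! per-domain: one value per domain, max_dom entries
 /
```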

--> Parallelism

Mechanisms for distributed-memory and shared-memory parallelism
are isolated to a single routine of the Mediation layer, solve().
Distributed memory parallelism is encapsulated within a module,
module_dm.F .  We envision multiple instances of module_dm.F, and
the choice of which to use will be made by the build mechanism.
The rest of the code calls routines provided in module_dm.F for
such services as halo exchanges, periodic boundary communication,
and basic domain decomposition.  There are presently no real I/O
routines in WRF; however, once these functions appear, they will also
be provided by module_dm.F for a particular architecture and/or
message passing software layer.  All calls to routines in this
module, and USE declarations for it, are guarded by CPP conditional
compilation directives:

  #ifdef DM_PARALLEL
  #endif

Solve also contains OpenMP parallelization directives around
tile-loops.

-> Notes:

[1]
``Design of a next-generation weather research and forecast
model,'' with J.Dudhia, D. Gill, J. Klemp, and W. Skamarock, in
proceedings of the Eighth Workshop on the Use of Parallel
Processors in Meteorology, European Center for Medium Range
Weather Forecasting, Reading, U.K., November 16-20, 1998. This is
ANL/MCS preprint number ANL/MCS-P735-1198.  Preprint:
ftp://info.mcs.anl.gov/pub/tech_reports/reports/P735.pdf
Slides:
http://www.mcs.anl.gov/~michalak/ecmwf98/html/index.htm

---------------------------------------------------------------------

NOTES FOR 991217 Version

There is a new version of the WRF code with a little cleanup and some
straightening out of some things --- though it turned out not as much
as I had first hoped.  I originally started at this with the idea of
imposing some directory structure on the code but quickly discovered
that a clean decomposition of the source files into driver, mediation,
and model directories was more complex than I initially thought: there
are some intermodule dependencies that cut across those divisions.  I
still think that this sort of thing is necessary, but for now I've
chickened out.

So that's what I didn't do to the code. What *did* I do?

*. Added a mechanism for more easily specifying the number of tiles
   (see below)

*. Added a 'make clean' target to the top level makefile

*. Added a 'make tar' target to the top level makefile. Creates a wrf.tar
   file in top-level directory. This way you don't have to delete all
   your files when tarring it up to give to others in the group.  The
   list of files to be tarred is set in configure.wrf.

*. Added a couple of new notes to the README

*. Cleaned up the configure.wrf file a little

*. Installed a new version of RSL (with some improved output code)


NEW MECHANISM FOR SPECIFYING NUMBER OF TILES

Originally, we had code that would interrogate
OpenMP to find out how many threads were in use and then arrange for
that many tiles in each patch.  But we want more flexibility than
that.  In particular, we also want to be able to specify tile sizes
based on what we think the right cache-blocking factor for a machine
might be, regardless of the number of threads on each node. So here's
the new deal:

The code now gets the tiling information from the run-time environment,
using the following three environment variables: WRF_NUM_TILES,
WRF_TILE_SZ_X, and WRF_TILE_SZ_Y.

  WRF_NUM_TILES sets the integer number of tiles per patch; the size of
  the tiles will vary according to the number of points in the local
  subdomain.

  WRF_TILE_SZ_X and WRF_TILE_SZ_Y set the integer size (in cells) of a
  tile in the east-west and north-south directions respectively. When
  these are specified, the number of tiles may vary between patches,
  but the size of each tile is uniform.

Thus, there are two ways of specifying the tiling and they are mutually
exclusive. One may either fix the number of tiles per patch or one may
fix the size of the tiles.  The number specified by WRF_NUM_TILES is
used if WRF_TILE_SZ_X and WRF_TILE_SZ_Y are set less than 1 or not
specified at all.

If WRF_TILE_SZ_X and WRF_TILE_SZ_Y are both set to values of 1 or
greater, then the value of WRF_NUM_TILES is ignored and the model will
tile according to fixed-size tiles as specified.

Here are some examples:

1.  setenv WRF_NUM_TILES 4

This will cause the model to generate 4 tiles per patch and the size of
the tiles will vary according to the size (number of cells) in a
patch.  (Patch sizes, and therefore tile sizes, may vary from patch to
patch).

2.  setenv WRF_TILE_SZ_X 4
    setenv WRF_TILE_SZ_Y 4
    setenv WRF_NUM_TILES 4

This will cause the model to generate 4x4-sized tiles, as many as are
needed to cover all the cells in a patch.  A few tiles may be odd
sizes to cover remainders in dividing the cells into tiles.  Note that
the value of WRF_NUM_TILES is ignored in this case.

3.  setenv WRF_TILE_SZ_X 4
    setenv WRF_NUM_TILES 4

This will cause the model to generate 4 tiles per patch, as in #1,
because the value of WRF_TILE_SZ_Y was not specified.

4.  Default: nothing specified in environment.

I think what this does (or at least what it *should* do) is default to
the number of threads as reported by OpenMP, or to 1 if OpenMP reports
no value.
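
The precedence rules above can be sketched as follows (a conceptual
Python rendition, not the actual WRF code):

```python
# Sketch of the tiling precedence described above: fixed-size tiles
# win when both WRF_TILE_SZ_X and WRF_TILE_SZ_Y are >= 1; otherwise
# WRF_NUM_TILES is used; otherwise fall back to the thread count.
def tiling_choice(env):
    sx = int(env.get("WRF_TILE_SZ_X", 0))
    sy = int(env.get("WRF_TILE_SZ_Y", 0))
    n  = int(env.get("WRF_NUM_TILES", 0))
    if sx >= 1 and sy >= 1:
        return ("fixed_size", sx, sy)    # WRF_NUM_TILES is ignored
    if n >= 1:
        return ("num_tiles", n)
    return ("threads_default",)          # OpenMP thread count, else 1

print(tiling_choice({"WRF_NUM_TILES": "4"}))                       # example 1
print(tiling_choice({"WRF_TILE_SZ_X": "4", "WRF_TILE_SZ_Y": "4",
                     "WRF_NUM_TILES": "4"}))                       # example 2
print(tiling_choice({"WRF_TILE_SZ_X": "4", "WRF_NUM_TILES": "4"})) # example 3
```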


PASSING ENVIRONMENT VARIABLES TO THE PROGRAM

If you are just running the program without distributed memory at the
command line, then you can set these as you normally would set
environment variables.

For distributed memory runs, where you sometimes have to use a command
like dmpirun (Compaq) or a scheduler, this gets a little tricky. The
code may not inherit your environment on all (or even any) of the
nodes.

When running jobs under load leveler on the IBM at NCAR, you can add the
environment settings to your batch script:

  #!/bin/ksh
  # @ job_type   = parallel
  # @ environment = COPY_ALL;MP_EUILIB=us
  # @ job_name   = wrf
  # @ output     = wrf.out
  # @ error      = wrf.err
  # @ node       = 8,8
  # @ tasks_per_node = 1
  # @ requirements = (Adapter == "hps_user" )
  # @ checkpoint = no
  # @ wall_clock_limit = 10800
  # @ class      = community
  # @ queue
  cd /ptmp/michalak
  /bin/cp /home/babyblue/michalak/991102/src/wrf.exe .
  export WRF_NUM_TILES=2
  timex poe ./wrf.exe
  exit

Note that this is a ksh script and the syntax is a little different from csh.

For running parallel jobs with dmpirun on the Compaq cluster machines, I've found
that the best way to handle this is:

   dmpirun -pf procfile

where procfile contains (for example):

   1 fir-mc6:/fir/users/michalak/991210/src ./runscript
   1 fir-mc7:/fir/users/michalak/991210/src ./runscript

and the run directory contains an executable shell script, runscript:

   #!/bin/csh
   setenv OMP_NUM_THREADS 4
   setenv WRF_NUM_TILES 4
   setenv WRF_TILE_SZ_X 4
   setenv WRF_TILE_SZ_Y 4
   wrf.exe

which is executed by the dmpirun command on each node you specify. So
in this example, the code would run one patch (MPI process) on each of
2 nodes, fir-mc6 and fir-mc7.  Each of those patches would be tiled
with 4 by 4 sized tiles.  The fact that OMP_NUM_THREADS is also set has
to do with the implementation of OpenMP on the Compaq, not anything to
do with WRF in particular.

--------------------------------------------------------------------------

May 15, 2000

CVS CHECKIN OF IKJ VERSION

--------------------------------------------------------------------------

May 18, 2000

I just did a big commit to the WRF repository. Since it was such a big set of
modifications, I first tagged everything with the tag:

   before_big_commit_by_jm_000518

just so we can go back easily to the pre-commit version. Actually, you
could also check out yesterday's version (5/17/00) and it would
accomplish the same thing. But this new version does work, at least in
non-DM mode.  Haven't checked the DM mode yet.

Here's a summary of what's new:

The configuration/namelist mechanism and data structures have been
cleaned up and rationalized.  There is no longer a bc_flags structure;
it has been renamed to config_flags.  In addition to the original
boundary settings that were carried around in bc_flags, this
config_flags type will also contain physics settings and other scenario
dependent stuff.  Right now it's being set from the namelist as
before.  The definition of the type of config_flags has been moved from
module_bc to module_configure and the actual elements of the type are
generated automatically by the registry, from the namelist (all the
logical boundary flags are in the namelist so you won't see any change
here).  The name of the type is grid_config_rec_type (used to be 'bcs')
and all instances of bc_flags being passed through arg lists in the
model layer routines have been converted.

The config_flags type is available everywhere in the model layer that
it was before, but it is no longer known to the driver layer, except as
a buffer. This is in keeping with trying to get a clean(er) separation
between the model layers and driver layers.  Instead, the driver layer
relies on a series of configuration inquiry routines that are part of
the model layer. The definitions for these routines appear in
module_configure.F and are generated automatically by the registry
mechanism. They may also be called from the model layer, if desired.
They have the form:

  get_wrf_config_<namelist variable name>( <local variable name> )

for scalar namelist variables and:

  get_wrf_config_<namelist variable name>( grid%id , <local variable name> )

for namelist variables that are vectors over the number of domains.

Also along the lines of making the driver-layer/model-layer interface
cleaner, the module_constants file has been split into a driver version
and a model version.  The file module_constants.F no longer exists.
There is now a module_driver_constants.F and a
module_model_constants.F.

JM

--------------------------------------------------------------------------

May 23, 2000

Shu-hua,

I have updated the WRF code and the registry to allow for one of the
modifications that you requested. The field indices into the last
dimension of the moist and chem arrays that had been defined as
parameters in module_state_description are now variables that are set up in the
registry and specified at run-time from the namelist.  This will allow
for easy mixing and matching of scalar fields with physics packages at
run time, while keeping the fields in contiguous locations in the 4d
arrays.

This involves new semantics for specifying physics "packages" in the
Registry, which in turn set up namelist options that can be specified
at run time.

New Registry semantics:
-----------------------

1. Package entry

There is a new type of entry called a "package" and it has the form:

package  <package name>  <namelist choice>  <state vars>  <associated 4d scalars>

  package 

      The keyword 'package' (no quotes) denoting the type of entry in the table

  <package name>

      The name of the package; for example: kessler

  <namelist choice>

      The name of the controlling namelist variable and the setting of
      that variable that specifies the option, separated by '==' and without
      spaces; for example:

               moisture_physics==1

      The controlling namelist variable must be specified with an 'rconfig'
      entry in the Registry table (see below).

  <state variables>

      Unused. This is reserved for specifying state variables that are
      added to the domain data structure only if the option is specified.

  <associated 4d scalars>

      This is a single string, no spaces, that specifies the name(s) of
      the 4d scalar arrays the package uses and the names of the
      associated scalar fields.  The name of the scalar array is
      followed by a colon, and then a comma-separated list of fields.
      For example:

           moist:qv,qr,qc

      If the package uses fields in more than one scalar array, the
      lists are separated by semicolons:

           moist:qv,qr,qc;chem:o3,no2

      Any field listed must have been specified with an entry in the
      Registry state table as having <Dims> ikjf or ikjft and <Use>
      within the indicated scalar array; for example:

      state   real    -              ikjft   moist       2         -
      state   real    qv             ikjft   moist       2         -
      state   real    qc             ikjft   moist       2         -
      state   real    qr             ikjft   moist       2         -
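
For illustration, here is a sketch (Python, not the actual Registry
mechanism) of parsing the <associated 4d scalars> string:

```python
# Sketch of parsing the <associated 4d scalars> string from a package
# entry: scalar-array lists are separated by ';', an array name and
# its field list by ':', and the fields by ','.
def parse_assoc_scalars(spec):
    out = {}
    for part in spec.split(";"):
        array, fields = part.split(":")
        out[array] = fields.split(",")
    return out

print(parse_assoc_scalars("moist:qv,qr,qc;chem:o3,no2"))
# {'moist': ['qv', 'qr', 'qc'], 'chem': ['o3', 'no2']}
```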

2. Rconfig entry

In addition to the package entry, the controlling namelist variable for
the option must have been specified with an 'rconfig' entry in the
Registry. This allows the package to be turned on or off via the
namelist.  The controlling namelist variable for the package is
specified the same way as other namelist variables.  In order to
establish the linkage between the package and the namelist variable,
the name of the namelist variable must be the same as that specified as
<namelist choice> in the 'package' entry. For example:

 rconfig  integer  moisture_physics  namelist,namelist_04   max_domains  0

This line specifies an integer namelist variable named moisture_physics
that appears in the namelist_04 block of the namelist file. The
namelist variable is dimensioned over the maximum number of allowable
domains.  The default value is 0 (zero) to indicate no moisture_physics
option is chosen.

Namelist semantics:
-------------------

Once the code has been compiled with this registry information, the
namelist can be used to select between different options for a kind of
package -- for example, the following selects kessler as the moisture
physics option for domain 1 (right now, the only domain):

  &namelist_04
   moisture_physics =   1/

What happens is that at run time, and for each domain, the code will
work out the union of the scalar fields that are needed by all the
packages turned on in the namelist and then set the P_* scalar array
indices to the correct indices into the 4d scalar arrays. (What it does
is assign them in some arbitrary but consecutive order so that there
are no unused fields from index 1 to the index of the last field in the
scalar array).
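
A sketch (Python, for exposition only) of that index-assignment
scheme:

```python
# Sketch of the run-time index assignment described above: take the
# union of scalar fields needed by all enabled packages and assign
# consecutive 1-based indices into the 4d scalar array, leaving no
# unused slots below the last field.  Order is arbitrary.
def assign_scalar_indices(enabled_packages):
    indices, next_index = {}, 1
    for fields in enabled_packages:        # one field list per package
        for f in fields:
            if f not in indices:
                indices[f] = next_index    # plays the role of P_<field>
                next_index += 1
    return indices

print(assign_scalar_indices([["qv", "qc", "qr"], ["qv", "qi"]]))
# {'qv': 1, 'qc': 2, 'qr': 3, 'qi': 4}
```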

Interim notes:
--------------

1. Right now the scalar arrays are maximally sized; that will be fixed
shortly so that the scalar arrays are only declared to be large enough
to hold the fields for the packages that are enabled at run-time.

2. There's no real checking in the Registry mechanism to make sure that
the fields specified for a package have been specified ikjf or ikjft.
It will check to make sure that an associated namelist variable has
been specified for the package entry, and will print a warning if it
has not been.

JM

--------------------------------------------------------------------------

June 28, 2000

Added a change to the registry mechanism, driver layer data
structures, and I/O mechanisms, as requested, to allow multi-level
arrays that do not have kde-kds+1 levels; in other words, arrays not
dimensioned over the number of levels in the atmosphere. An example of
such an array would be a multi-level soil temperature.

Such arrays must be 3-D or 1-D (in L), and they are specified in the
registry as follows:

  state    real   soil      ilj     -         -         -
  state    real   ZS         l      -         -         -


Note that the dimension specification is 'ilj' rather than 'ikj'.

The l dimension is specified with the new namelist variable
num_soil_layers in namelist_01. This may be changed at runtime (default
is 5).  The fact that the model uses this variable to dimension all ilj
dimensioned variables is hard coded.

The number of levels for the ilj arrays is the same for all domains
(that could be changed). And there's only one definition of 'l' in the
model.

It is legal to have multiple time levels for an ilj dimensioned array.

It is legal to have X and Y staggering for an ilj dimensioned
array.  Z staggering is meaningless.

It is legal to have an i1 variable with ilj dimensions.

Cannot specify ilj for any 4D scalar arrays.

Only ilj ordering is supported at this time.

The MM5v3 small header records the ordering as XYL .

JM

---

June 29, 2000

Added compile -f fast build option, for those times when you only
change one line of a file and you don't want the make mechanism to
rebuild the whole blasted code because you happened to change a module
that everything else depends on.  NOTE: you take responsibility for
those dependencies when you use -f. The option is primarily for minor
changes, such as adding debug print statements in a routine.  If you
make substantive changes to a module, do not use compile -f.

JM

---

June 30, 2000

Added registry semantics and data structures in model for lbc arrays.

LBC arrays are declared in the Registry as:

  state    real   ubdy           kb        -         -         X

The dimension field may be one of kb (for a multilevel LBC), just
b (for a single layer LBC), or lb (for a soil multilevel LBC).  Right
now only kb is supported.

The above example results in the array UBDY being defined in the
domain derived data type (that is, in 'grid') as a four dimensional
array:

  real     ,DIMENSION(:,:,:,:) ,POINTER   :: ubdy

The first dimension is a full (domain sized) horizontal dimension and
it is allocated to be the larger of ide-ids+1 and jde-jds+1.  The
second dimension is the number of levels (kde-kds+1), the third
dimension is the integer spec_bdy_width, defined in the namelist.input
file in the namelist_01 block, and the fourth dimension has 4 elements.
The indices into this fourth dimension are defined as integer
parameters in src/module_state_description.F as:

  INTEGER, PARAMETER :: P_WB       = 1      ! west boundary
  INTEGER, PARAMETER :: P_EB       = 2      ! east boundary
  INTEGER, PARAMETER :: P_SB       = 3      ! south boundary
  INTEGER, PARAMETER :: P_NB       = 4      ! north boundary

These can be used in the code to index the desired boundary from the 4D
LBC array.

The LBC arrays are undecomposed over processors.  Each processor will
have a full set of LBCs whether they are used or not. This is because
decomposing arrays that are not fully dimensioned in the 2 horizontal
dimensions is a major headache, and since the arrays are not fully
dimensioned, it's not really a space (nor a memory-scaling) issue.

Ideal.exe will generate dummy LBC fields in the wrfinput file, but it
must be non-DM parallel or single processor DM-parallel to work
correctly.  The wrf.exe program should be able to read in LBC's from
wrfinput, even running DM-parallel and multi-processor, but this isn't
tested yet.

Note on LBC arrays in model I/O when in MM5v3 Format:

 - The four indices of the last dimension of the LBC array are written
   and read separately from the wrfinput file

 - In the file, the names of the fields are (for example)
   UBDYW, UBDYE, UBDYS, and UBDYN  (the direction is appended
   to the name).

 - The staggering string is YSB or XSB for kb arrays

JM

---

August 14, 2000, revised August 17, 2000

I. Added I/O field to Registry semantics.

Added a new column for state variables called <IO>.  This may be
either '-', 'm', or a string consisting of 'i', 'r', and/or 'h'.  A
hyphen means that the field is not subject to I/O.  An 'm' means that
the field is per-dataset metadata.  In the case of an 'irh' string: if
'i' is present, the field is part of the data used to initialize WRF;
if 'r' is present, the field is part of the restart data; and if 'h' is
present, the field is part of WRF history I/O.

If '-' or 'm' appears in the I/O column as part of a string that
contains other characters, the '-' or 'm' is ignored.
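Schematically, a state entry carrying the new column looks something
like this (the line is illustrative only; the exact column layout and
neighboring fields are defined by the Registry file itself, which is
authoritative):

  #        type   sym   ...   <IO>
  state    real   ru    ...   irh     # initial, restart, and history I/O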

Within the model code, the routines input_domain and output_domain
(module_io_domain.F) have been removed and replaced by input_initial,
input_restart, input_history, output_initial, output_restart, and
output_history in that same source file.  The input_ and output_initial
routines do I/O for state variables in the Registry that contain an 'i'
in the i/o string.  Likewise, the input_ and output_restart routines
apply to state variables that have an 'r' in the I/O string.  The
input_ and output_history routines apply to state variables with an 'h'
in the I/O string.

A particular state variable may carry 'i', 'r', or 'h' alone or in any
combination.

Notes:

 - Only applies to REAL typed fields

 - Only applies to fully dimensioned fields (not boundary arrays)

II. Added two new entry types to Registry: halo and period.

The registry now generates the code that specifies interprocessor communication
for halo exchanges and periodic boundary updates.  The format of a halo or
period entry is:

  halo    identifier     size:comma-separated-fieldlist

  period  identifier     bdywidth:comma-separated-fieldlist

Examples:

  halo      HALO_RK_C    4:ru_2,rv_2,du,dv

    This generates a halo exchange descriptor called HALO_RK_C.  It is a
    four-point stencil (i.e., one cell in each of n, s, e, w) and the
    fields that are included are ru_2, rv_2, du, and dv.

  period    PERIOD_BDY_LF_B 1:rtp_1,rtp_2

    This updates one cell of the periodic boundaries of rtp_1 and rtp_2

Notes:

 - ONLY STATE VARIABLES ARE ALLOWED IN THE FIELDLIST
   If an i1 variable is inadvertently listed, the Registry mechanism
   will print a message.

 - It is okay to mix 2 and 3d fields in a fieldlist

 - If a field with multiple time levels is to be exchanged, the time
   level to be exchanged must be specified explicitly.  If both time
   levels are to be communicated, then they must be listed separately.

 - It is okay to have multi-dimensional scalar arrays listed in a
   fieldlist (eg: "halo HALO_LF_MOIST 24:moist_1,moist_2") but one
   cannot list individual elements of a multi-dimensional scalar array
   (eg: cannot list qv).  If a halo exchange is specified for a
   multi-dimensional scalar array, then all the active scalars
   in the array participate in the communication.

JM


---

August 28, 2000

Modifications to I/O and implementing restarts.

===Restarts:

1. New namelist variables:

    time_step_count_restart   (int) Number of time steps between restart output
    time_step_begin_restart   (int) Number of time step on coarse domain to begin restart run

2. Naming convention for restarts

    wrfrst_01_000010
           ^^ ^^^^^^------ zero padded timestep number of restart 
           ||                                                
           ++------------- zero padded domain id (always 1)

3. Behavior

The model can now generate periodic restart files and restart itself
from those files.  The frequency of restart output is specified by the
namelist variable time_step_count_restart, which is the number of time
steps between restart outputs.

To start from a restart file, set the namelist variable time_step_begin_restart
to the time step corresponding to the restart file you wish to start from (that
number will be part of the file name).

For initial runs, time_step_begin_restart must be set to 0.
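As a namelist sketch (values illustrative), an initial run that writes
a restart every 10 steps, and the corresponding restart run, would look
like:

  &namelist01
   time_step_count_restart = 10   ! write a restart file every 10 steps
   time_step_begin_restart = 0    ! 0 for an initial run; set to 10 to
                                  ! restart from wrfrst_01_000010
  /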

4. Testing

This has been tested by running the model forward 20 steps, writing a
restart file at step 10, and then running the model from that restart.
Output at the end (step 20) of both runs was compared using diffv3, and
the fields were found to be bit-for-bit identical.  The DM-parallel
version was also tested on 2 processors; it was likewise bit-for-bit
identical between the straight-through and restart runs.

===I/O mods:

1. New namelist variable:

    frames_per_outfile        (int) Number of outputs per history file

2. New naming convention for history files

    wrfout_01_000010
           ^^ ^^^^^^------ zero padded timestep number of first frame in output
           ||                                                
           ++------------- zero padded domain id

3. Description

The history output files are now named according to the domain that
produced the output and the time step at which the output was
produced.  If there are multiple frames of output in the file, the time
step is that of the first frame.  The number of frames in an output file
is set by the namelist variable frames_per_outfile.  If this is set to
1, then every output produces a new file.
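In namelist form (value illustrative):

  &namelist01
   frames_per_outfile = 1   ! each history write starts a new wrfout file
  /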

===Misc:

Removed i_mother_end and j_mother_end from the namelist.  These are not
needed to specify nest location.

-------

September 5, 2000

Specifying the number/size of shared-memory tiles in WRF.  New namelist variables in 
namelist01:  numtiles, tile_sz_x, and tile_sz_y.

     ! Here's how the number of tiles is arrived at:
     !
     !          if tile sizes are specified in namelist use those, otherwise
     !          if numtiles is specified in namelist use that, otherwise
     !          if OpenMP provides a value use that, otherwise
     !          use 1.
     !

No longer using the environment variable WRF_NUM_TILES.
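In namelist form (value illustrative; per the precedence above,
explicit tile sizes win over a tile count):

  &namelist01
   numtiles = 4   ! used only when tile_sz_x and tile_sz_y are not given
  /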

--------

September 6, 2000

Added DNAME, DESCRIPTION, and UNITS string fields to the Registry.
This allows the name of a variable in the data set to differ from the
name of the variable in the model.  It also provides a means for
including other metadata (namely, description and units).  Note that
these fields are allowed to contain spaces, but if they do they must be
enclosed in quotes.

--------

September 8, 2000

Modified set_tiles (module_tiles.F) so that the programmer can
underspecify the domain (avoid boundaries) or overspecify the patch
(compute onto halos).  The call is now:

CALL set_tiles ( grid , ids , ide , jds , jde , ips , ipe , jps , jpe )
                        ^^^^^^^^^^^^^^^^^^^^^   ^^^^^^^^^^^^^^^^^^^^^
                limit the size of domain to     widen the patch to
                iterate over                    iterate over

A second form of the routine is used to compute only over a specified
boundary region:

CALL set_tiles ( grid , ids , ide , jds , jde , bdy_width )
                        ^^^^^^^^^^^^^^^^^^^^^
                        region containing
                        boundary

For additional information please see:

 http://www.mmm.ucar.edu/mm5/mpp/settiles

CAUTION: all of the grid%i_start, etc. arrays remain set this way until
the next call to set_tiles, so if you limit the domain or widen the
patch, make sure you set them back again afterwards.

--------

October 4, 2000

A big house-keeping pass on the WRFMODEL code was checked in (after
testing with b_wave on she, both the non-DM and DM-parallel compiles).
Here are the highlights:

1) Went through and removed every stop statement, except the one at the
end of wrf.F, and replaced each with a call to wrf_error_fatal, along
with an appropriate string message.  I tried to come as close as
possible to the original error message that was there.  In the future,
if you want to print an advisory, print debug info, or print an error
message and stop, please use the following forms:

- Advisory/warning with a string constant:

  CALL wrf_message ( 'module_name: routine_name: message' )

- Advisory/warning with a generated string:

  WRITE ( wrf_err_message , * ) 'module_name: routine_name: the value of blah is ',blah
  CALL wrf_message ( TRIM( wrf_err_message ) )

- Debug message (note debug level associated: this is controlled from namelist):

  CALL wrf_debug ( 100 , 'module_name: routine_name: debug message' )

     -or-

  WRITE ( wrf_err_message , * ) 'module_name: routine_name: debug the value of blah is ',blah
  CALL wrf_debug ( 100 , TRIM( wrf_err_message ) )

- Fatal error with a string constant:

  CALL wrf_error_fatal ( 'module_name: routine_name: goodbye cruel world' )

- Fatal error with a generated string:

  WRITE ( wrf_err_message , * ) 'module_name: routine_name: z is ',z,'. Adieu.'
  CALL wrf_error_fatal ( TRIM( wrf_err_message ) )

The string variable wrf_err_message is defined in module_wrf_error,
along with the wrf_message, wrf_debug, and wrf_error_fatal routines.
In the case of wrf_error_fatal, the routine also kills the model run
(so there's no need for a stop statement).  The advantage of this is
that the driver layer (of which module_wrf_error is a part) can decide
what gets printed, when, and from which processor, as well as properly
shut down the code in the event of a fatal error (a bare stop might
kill only the processor that generated the error and leave the others
hanging).

2) All the stuff related to the Leapfrog option is gone.

3) All the stuff related to NCAR graphics and quick_output is gone.

4) All the names that used to have rk2 in them now have just rk.

--------

October 6, 2000

NOTES ON RUNNING WRF FOR DISTRIBUTED MEMORY

Compaq Clusters:  (she.mmm.ucar.edu, fir.mmm.ucar.edu)

1. Configure the code using:

    "Compaq DM/SM    (RSL, DECMPI, RSL IO, OpenMP)" (currently option 13)

2. compile wrf

3. cd to run directory (or test case directory)

4. edit namelist as needed

5. mpiclean

6. dmpirun -np 1 ideal.exe    (or ideal_<testcase>.exe)

   Note: output will be written to the file rsl.out.0000
   Note: error output will be written to the file rsl.error.0000

7. mpiclean

8. dmpirun -np 4 wrf.exe

   Note: output will be written to files rsl.out.0000 through rsl.out.0003
   Note: error output will be written to files rsl.error.0000 through rsl.error.0003

