.. role:: definition
    :class: definition

==================================
Configuring Model Input and Output
==================================

|

|

The reading and writing of model fields in MPAS is handled by user-configurable *streams*.

..  container:: row m-0 p-0

    ..  container:: col-md-12 pl-0 pr-3 py-3 m-0

        ..  container:: card px-0 h-100

            ..  rst-class:: card-header-def h4

                ..  rubric:: Stream

            ..  container:: card-body-def

                A fixed set of model fields with dimensions and attributes written or read together to or from the same file or set of files.

|

Each MPAS model core may define its own set of default streams that it uses to read initial conditions, read and write restart fields, and write additional model history fields. Besides these default streams, users may define new streams to, e.g., write certain diagnostic fields at a higher temporal frequency than the usual model history fields.

Streams are defined in XML configuration files (named *streams*, suffixed with the core name) that are created at build time for each model core. For example:

*   **streams.atmosphere**
       the streams for the "atmosphere" core
*   **streams.init_atmosphere**
       the streams for the "init_atmosphere" core

|

An XML stream file may further reference other text files that contain lists of the model fields that are read or written in each of the streams defined in the XML stream file.

**There is no need to re-compile after making modifications to the XML files.** Changes to the XML stream configuration file take effect the next time an MPAS core is run. It is therefore possible, for example, to change the interval at which a stream is written, the template for the filenames associated with a stream, or the set of fields that are written to a stream without re-compiling any code. This is described further in the next section.

|

|

**MPAS includes two classes of streams:** *Immutable streams* and *Mutable streams*

..  container:: row m-0 p-0

    ..  container:: col-md-12 pl-0 pr-3 py-3 m-0

        ..  container:: card px-0 h-100

            ..  rst-class:: card-header-def h4

                ..  rubric:: Immutable streams

            ..  container:: card-body-def

                Those for which the set of fields that belong to the stream may not be modified at model run-time; however, it is possible to modify the interval at which the stream is read or written, the filename template describing the files containing the stream on disk, and several other parameters of the stream. 


    ..  container:: col-md-12 pl-0 pr-3 py-3 m-0

        ..  container:: card px-0 h-100

            ..  rst-class:: card-header-def h4

                ..  rubric:: Mutable streams

            ..  container:: card-body-def

                Streams for which all aspects - including the set of fields that belong to the stream - may be modified at run-time. 
  
|

.. note::
   Two stream classes exist because an MPAS core may not function correctly if certain fields are not read in upon model start-up or written to restart files, and it is therefore not reasonable for users to modify this set of required fields at run-time. An MPAS core developer may choose to implement such streams as immutable streams. Since fields may not be added to an immutable stream at run-time, new immutable streams may not be defined at run-time, and the only type of new stream that may be defined at run-time is the mutable stream type.

|

|

|

XML Stream Configuration Files
==============================

The XML stream configuration file for an MPAS core always has a parent XML element named *streams*, within which individual streams are defined:

.. code-block::

   <streams>

       ... one or more stream definitions ...

   </streams>

|

|

Immutable streams are defined with the *immutable_stream* element, and mutable streams are defined with the *stream* element:

.. code-block::

   <immutable_stream name="initial_conditions"
                     type="input"
                     filename_template="init.nc"
                     input_interval="initial_only"
                     />

   <stream name="history"
           type="output"
           filename_template="output.$Y-$M-$D_$h.$m.$s.nc"
           output_interval="6:00:00" >

       ... model fields belonging to this stream ...
   </stream>

|

|

|

As shown in the example stream definitions above, **both stream classes have the following required attributes:**

.. csv-table::
   :escape: \
   :widths: 20, 60      
   :width: 80%

   **name**, A unique name used to refer to the stream
   **type**, The stream type\, either *"input"*\, *"output"*\, *"input;output"*\, or *"none"*. A stream may be declared *"input;output"* if\, for example\, it is read once at model start-up to provide initial conditions and thereafter written periodically to provide model checkpoints. A stream may be declared *"none"* when it only defines a set of fields for inclusion in other streams. The *type* attribute for immutable streams may not be changed at run-time.
   **filename_template**, The template for files that exist or will be created by the stream. This may include any of the following variables\, which are expanded based on the simulated time at which files are first created. |br| |br| $Y : Year |br| $M : Month |br| $D : Day of the month |br| $h : Hour |br| $m : Minute |br| $s : Second |br| |br| A *filename_template* may include either a relative or absolute path\, in which case MPAS attempts to create any directories in the path that do not exist\, subject to filesystem permissions.
   **input_interval**, For streams that have type *"input"* or *"input;output"*\, the interval\, beginning at the model initial time\, at which the stream will be read. Possible values include a time interval in the format *"YYYY-MM-DD_hh:mm:ss"*; *"initial_only"*\, which specifies that the stream is read only once at the model initial time; or *"none"*\, which specifies that the stream is not read during a model run.
   **output_interval**, For streams that have type *"output"* or *"input;output"*\, the interval\, beginning at the model initial time\, at which the stream will be written. Possible values include a time interval in the format *"YYYY-MM-DD_hh:mm:ss"*; *"initial_only"*\, which specifies that the stream is written only once at the model initial time; or *"none"*\, which specifies that the stream is not written during a model run.

|
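
As a minimal sketch of how the required attributes combine for a stream of type *"input;output"*, the definition below is read once at the model initial time and written once per simulated day thereafter. The stream name, the filename template, and the one-day interval (written here as *"1_00:00:00"*, with the leading year and month terms omitted) are illustrative assumptions, not the defaults of any particular core.

.. code-block::

   <!-- Illustrative only: the name, template, and intervals are assumed values -->
   <immutable_stream name="restart"
                     type="input;output"
                     filename_template="restart.$Y-$M-$D_$h.$m.$s.nc"
                     input_interval="initial_only"
                     output_interval="1_00:00:00"
                     />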

|

**The set of fields that belong to a mutable stream may be specified with any combination of the following elements.** Note that, for immutable streams, no fields are specified at run-time in the XML configuration file.

.. csv-table::
   :escape: \
   :widths: 20, 60
   :width: 80%

   **var**, Associates the specified variable with the stream. The variable may be any defined in an MPAS core's *Registry.xml* file\, but may not include individual constituent arrays from a *var_array*.
   **var_array**, Associates all constituent variables in a *var_array*\, defined in an MPAS core's *Registry.xml* file\, with the stream.
   **var_struct**, Associates all variables in a *var_struct*\, defined in an MPAS core's *Registry.xml* file\, with the stream.
   **stream**, Associates all explicitly associated fields in the specified stream with the stream; streams are not recursively included.
   **file**, Associates all variables listed in the specified text file\, with one field per line\, with the stream.

|
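
The elements in the table above may be mixed within one mutable stream. The following sketch assumes that the named variables, a *var_array* called *scalars*, and a *var_struct* called *diag* exist in the core's *Registry.xml* file, that a stream named *diagnostics* is defined elsewhere in the same XML file, and that *extra_fields.txt* is a hypothetical text file listing one field name per line. The use of a *name* attribute on every element is also an assumption, mirroring the *var* elements shown in the examples on this page.

.. code-block::

   <stream name="custom_output"
           type="output"
           filename_template="custom.$Y-$M-$D_$h.$m.$s.nc"
           output_interval="3:00:00" >

       <!-- individual variables (names assumed) -->
       <var name="u10"/>
       <var name="v10"/>

       <!-- every constituent of a var_array (name assumed) -->
       <var_array name="scalars"/>

       <!-- every variable in a var_struct (name assumed) -->
       <var_struct name="diag"/>

       <!-- fields explicitly listed in another stream; not followed recursively -->
       <stream name="diagnostics"/>

       <!-- fields listed one per line in a text file (hypothetical file name) -->
       <file name="extra_fields.txt"/>
   </stream>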

|

|

.. _Optional Stream Attributes:

Optional Stream Attributes
==========================

Besides the required attributes described in the preceding section, several **additional, optional attributes may be added to the definition of a stream.**

.. csv-table::
   :escape: \
   :widths: 20, 60
   :width: 80%

   **filename_interval**, The interval between timestamps used in the construction of the names of files associated with a stream. Possible values include: |br| |br| *   A time interval specification in the format *"YYYY-MM-DD_hh:mm:ss"* |br| *   "none" : indicates that only one file containing all times is associated with the stream |br| *   "input_interval" : for input type streams\, indicates that each time to be read from the stream will come from a unique file |br| *   "output_interval" : for output type streams\, indicates that each time written to the stream will go to a unique file whose name is based on the timestamp of the data being written. |br| |br| The default value is *"input_interval"* for input type streams and *"output_interval"* for output type streams. For streams of type *"input;output"*\, the default filename interval is *"input_interval"* if the input interval is an interval (i.e.\, not *"initial_only"*)\, or *"output_interval"* otherwise. See :ref:`Example 1`.
   **reference_time**, A time that is an integral number of filename intervals from the timestamp of any file associated with the stream. The default value is the start time of the model simulation. See :ref:`Example 3`.
   **clobber_mode**, Specifies how a stream should handle attempts to write to a file that already exists. Possible values for the mode include: |br| |br| *   *"overwrite"* : The stream is allowed to overwrite records in existing files and to append new records to existing files; records not explicitly written to are left untouched. |br| *   *"truncate"* or *"replace_files"* : The stream is allowed to overwrite existing files\, which are first truncated to remove any existing records; this is equivalent to replacing any existing files with newly created files of the same name. |br| *   *"append"* : The stream is only allowed to append new records to existing files; existing records may not be overwritten. |br| *   *"never_modify"* : The stream is not allowed to modify existing files in any way. |br| |br| The default clobber mode for streams is *"never_modify"*. See :ref:`Example 2`.
   **precision**, The precision with which real-valued fields will be written or read in a stream. Possible values include: |br| |br| *   *"single"* : for 4-byte real values |br| *   *"double"* : for 8-byte real values |br| *   *"native"* : specifies that real-valued fields will be written or read in whatever precision the MPAS core was compiled with |br| |br| The default value is *"native"*. See :ref:`Example 1`.
   **packages**, A list of packages attached to the stream. A stream will be active (i.e.\, read or written) only if at least one of the packages attached to it is active\, or if no packages at all are attached. Package names are provided as a semicolon-separated list. Note that packages may only be defined in an MPAS core's Registry.xml file at build time. By default\, no packages are attached to a stream.
   **io_type**, The underlying library and file format that will be used to read or write a stream. Possible values include: |br| |br| *   *"pnetcdf"* : Read/write the stream with classic large-file NetCDF files (CDF-2) using the ANL Parallel-NetCDF library |br| *   *"pnetcdf\,cdf5"* : Read/write the stream with large-variable files (CDF-5) using the ANL Parallel-NetCDF library |br| *   *"netcdf"* : Read/write the stream with classic large-file NetCDF files (CDF-2) using the Unidata serial NetCDF library |br| *   *"netcdf4"* : Read/write the stream with HDF-5 files using the Unidata parallel NetCDF-4 library |br| |br| Note that the PIO library must have been built with support for the selected *io_type*. By default\, all input and output streams are read and written using the *"pnetcdf"* option.

|
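
As a sketch of how several of these optional attributes might be combined, the stream below writes CDF-5 files through Parallel-NetCDF, places each output time in its own file, and is active only when at least one of its attached packages is active. The package names *pkg_a* and *pkg_b* are hypothetical placeholders; real package names must come from the core's Registry.xml file, and the PIO library must have been built with support for the chosen *io_type*.

.. code-block::

   <!-- Illustrative only: pkg_a and pkg_b are hypothetical package names -->
   <stream name="diagnostics"
           type="output"
           filename_template="diagnostics.$Y-$M-$D.nc"
           filename_interval="output_interval"
           io_type="pnetcdf,cdf5"
           packages="pkg_a;pkg_b"
           output_interval="1_00:00:00" >

       <var name="u10"/>
       <var name="v10"/>
   </stream>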

|

|

.. _Stream Definition Examples:

Stream Definition Examples
==========================

|

This section provides several example streams that make use of the `Optional Stream Attributes`_. All of the examples are output streams, since users are more likely to need to write additional fields than to read additional fields, which the model would need to be aware of; however, the concepts illustrated here translate directly to input streams as well.

|

|

.. _Example 1:

Single-precision Output - 1 Month Data Per File
-----------------------------------------------

The optional attribute specification *filename_interval="01-00_00:00:00"* is added to force a new output file to be created for the stream every month. The general format for time interval specifications is *YYYY-MM-DD_hh:mm:ss*, where any leading terms can be omitted; here, the year is omitted. To reduce file size, *precision="single"* is added to force real-valued fields to be written as 4-byte floating-point values rather than in the default *"native"* precision, which is 8 bytes when the MPAS core is compiled in double precision.

.. code-block::

   <stream name="diagnostics"
           type="output"
           filename_template="diagnostics.$Y-$M.nc" 
           filename_interval="01-00_00:00:00" 
           precision="single"
           output_interval="6:00:00" >
     
        <var name="u10"/>
        <var name="v10"/>
        <var name="t2"/>
        <var name="q2"/>
   </stream>

|

The only fields that will be written to this stream are the hypothetical 10-m diagnosed wind components, the 2-m temperature, and the 2-m specific humidity variables. 

.. note::
   The filename template includes only the year and month from the model valid time; this can be problematic when the simulation starts in the middle of a month. A solution to this problem is illustrated in :ref:`Example 3`.

|

|

.. _Example 2:

Append Records to Existing Output
---------------------------------

By default, streams never modify existing files whose filenames match the name of a file that would otherwise be written during the course of a simulation. However, when restarting a simulation that will add more records to existing output files, it can be useful to instruct the MPAS I/O system to append these records, thereby modifying existing files. This is accomplished with the *clobber_mode* attribute.

.. code-block::

   <stream name="diagnostics"
           type="output"
           filename_template="diagnostics.$Y-$M.nc"
           filename_interval="01-00_00:00:00" 
           precision="single"
           clobber_mode="append"
           output_interval="6:00:00" >

        <var name="u10"/>
        <var name="v10"/>
        <var name="t2"/>
        <var name="q2"/>
   </stream>

|

In general, if MPAS attempts to write a record at a time that already exists in an output file, a *clobber_mode* of *"append"* does not permit the write, since this would modify existing data; in *"append"* mode, only new records may be added. 

.. warning::
   Due to a peculiarity in the implementation of the *"append"* clobber mode, it may be possible for an output file to contain duplicate times. This can happen when the first record appended to an existing file has a timestamp not matching any in the file, after which, any record written - regardless of whether its timestamp matches one already in the file - will be appended to the end of the file. This situation may arise, for example, when restarting a model simulation with a shorter *output_interval* than was used in the original model simulation with an MPAS core that does not write the first output time for restart runs.

|
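
For comparison, the same stream is sketched below with a *clobber_mode* of *"overwrite"*. As described in the table of optional attributes, this mode allows the stream to overwrite records that already exist in a file and to append new ones, which may suit a rerun that is meant to replace previously written times. Whether that is appropriate depends on the workflow, so this is an illustration rather than a recommendation.

.. code-block::

   <!-- Same stream as above; only clobber_mode differs -->
   <stream name="diagnostics"
           type="output"
           filename_template="diagnostics.$Y-$M.nc"
           filename_interval="01-00_00:00:00"
           precision="single"
           clobber_mode="overwrite"
           output_interval="6:00:00" >

        <var name="u10"/>
        <var name="v10"/>
        <var name="t2"/>
        <var name="q2"/>
   </stream>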

|

.. _Example 3:

Reference Filename Intervals to a Time Other Than the Start Time
----------------------------------------------------------------

The previous example creates a new file each month during the simulation, and the filenames contain only the year and month of the timestamp when the file was created. If a simulation begins at 00 UTC on the first day of a month, then each file in the diagnostics stream will contain only output times that fall within the month in the filename. However, if a simulation begins in the middle of a month (for example, June 2014), the first diagnostics output file would have a filename of *diagnostics.2014-06.nc*, but rather than containing only output fields valid in June, it would contain all fields written between the middle of June and the middle of July, at which point one month of simulation would have elapsed, and a new output file, *diagnostics.2014-07.nc*, would be created.

To ensure that the file *diagnostics.2014-06.nc* contains only data from June 2014, the *reference_time* attribute may be added such that the day, hour, minute, and second in the date and time represent the first day of the month at 00 UTC. In this example, the year and month of the reference time are not important, since the purpose of the reference time here is to tell MPAS that the monthly filename interval begins on (i.e., is referenced to) the first day of the month.

.. code-block::

        <stream name="diagnostics"
                type="output"
                filename_template="diagnostics.$Y-$M.nc" 
                filename_interval="01-00_00:00:00" 
                reference_time="2014-01-01_00:00:00" 
                precision="single"
                clobber_mode="append" 
                output_interval="6:00:00" >

             <var name="u10"/>
             <var name="v10"/>
             <var name="t2"/>
             <var name="q2"/>

        </stream>

|

In general, the components of a timestamp, *YYYY-MM-DD_hh:mm:ss*, that are less significant than (i.e., to the right of) those contained in a filename template are important in a reference time. For example, with a *filename_template* that contained only the year, the month component of the *reference_time* would become important to identify the month of the year on which the yearly basis for filenames would begin.

|
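
To illustrate the yearly case just described, the sketch below assumes one file per simulated year, with the filename interval written in the full *YYYY-MM-DD_hh:mm:ss* form as one year. The reference time places the yearly boundary on 1 January at 00 UTC; the year in the reference time is arbitrary. The stream contents and the interval notation are assumptions made for illustration.

.. code-block::

   <!-- Illustrative only: yearly files referenced to 1 January, 00 UTC -->
   <stream name="diagnostics"
           type="output"
           filename_template="diagnostics.$Y.nc"
           filename_interval="0001-00-00_00:00:00"
           reference_time="2014-01-01_00:00:00"
           precision="single"
           clobber_mode="append"
           output_interval="6:00:00" >

        <var name="u10"/>
        <var name="v10"/>
   </stream>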

|

|

|

|

.. rst-class:: horizbuttons-next-m

* `Next: Physics Suites -> <./phys_suites.html>`_

|