Gentle users,

A bug affecting the timing of restart output has been reported in the MPP restart mechanism. The problem description, from Norm Henry at the Meteorological Service of New Zealand, is followed by the fix. The fix will be incorporated in the next release of MM5.

Rotang, Aug 20, 2001



> -----Original Message-----
> From: Norm Henry [mailto:Norm.Henry@met.co.nz]
> Sent: Wednesday, August 15, 2001 11:27 PM
> To: 'michalak@ucar.edu'
> Subject: Stop/restart problem with MPP
> 
> 
> Hi John,
> 
> I've been having a problem with stop/restart MPP runs failing at the
> beginning of the restart. I found this with v3.3 and have just installed
> v3.4 and found the same thing. Attached are rsl.out files for both the
> stop and restart.
> 
> The namelist file for this run specifies a stop at 1080 min. The coarse 
> domain time step is 108 sec = 1.8 min. From what I see at the end of the 
> rsl_stop file, the model seems to want to carry on for an extra timestep 
> beyond 1080 - hence the XTIMR variable is at 1081.8. The restart file 
> itself is correctly named -
> 
> 	./restrts/r-01-0001080-0000
> 
> but MM5 fails on restart when it reads the file and finds that XTIMR is
> not what it expected. This seems to be specific to MPP as we haven't
> seen this problem for SMP stop/restarts.
> 
> Any suggestions?!
> 
> Thanks in advance...
> 
> Norm Henry
> Meteorological Service of New Zealand 

To: Norm Henry
Subject: RE: Stop/restart problem with MPP

Dear Norm,

It appears this is something that got updated in the non-MPP v3.4 code
last November but not in the corresponding MPP code. The non-MPP
version of domain/io/output.F was modified:

C                                                                                OUTPUT.37
C-----OUTPUT FOR RESTART:                                                        OUTPUT.38
C                                                                                OUTPUT.39
#ifndef MPP1                                                                     OUTPUT.40
      IF((.NOT.IFSAVE).OR.(KTAU.LT.NINT(SAVTIM/DTMIN)))GOTO 80     <<<<<<<       07NOV00.366
      DO 66 LLN=1,MAXNES                                                         OUTPUT.42
        IF(IACTIV(LLN).EQ.1)THEN                                                 OUTPUT.43
C     RESET FINE MESH XTIME TO COARSE (CORRECTS TRUNCATION DRIFT)                OUTPUT.44
          XTIME=XTIMC                                              <<<<<<<       OUTPUT.45
          IUTSAV=51+(LLN-1)                                                      OUTPUT.46


But the MPP code still has what appears to be an older version of the
test that determines whether to write a restart (compare 07NOV00.366),
and it is also missing the line that resets the fine mesh XTIME to the
coarse XTIME (OUTPUT.45). The following is in the file
MPP/RSL/mpp_output_10.incl:

CSTART   mpp_output_10.incl
      IF((.NOT.IFSAVE).OR.(XTIME.LT.SAVTIM))GOTO 80             <<<<<<<<<
      PRINT *,' TOTAL NUMBER OF DOMAINS POSSIBLE ON RESTART',
     *        ' OUTPUT IS ',NSTTOT
      DO 66 LLN=1,MAXNES
        IF(IACTIV(LLN).EQ.1)THEN                                <<<<<<<<<
          IUTSAV=51+(LLN-1)
          IF (SVLAST)THEN
            WRITE(SAVENAME,1000)LLN,0,RSL_MYPROC
          ELSE
            WRITE(SAVENAME,1000)LLN,IFIX(XTIME+.001),RSL_MYPROC
          ENDIF
 1000     FORMAT('./restrts/r-',I2.2,'-',I7.7,'-',I4.4)
          OPEN (IUTSAV,FILE=SAVENAME,FORM="UNFORMATTED",ERR=64)
          CALL OUTSAV(IUTSAV,ALLARR(1,LLN),IRHUGE,INTALL(1,LLN),IIHUGE,
     +         ALLFG(1,LLN),IFGHUG,INTFG(1,LLN),IFGIHUG,ALLFO(1,LLN),
     +         IFOHUG,MAXNES,MKX,MIX,MJX)
          PRINT *,'     +++   OUTPUTTING DOMAIN ',LLN,
     *              '  OUTPUT FOR THIS DOMAIN WAS UNIT ',IUTSAV
          CLOSE ( IUTSAV )
          GOTO 65
   64     CONTINUE
          PRINT*,'ERROR OPENING ',SAVENAME,' FOR RESTART.  CONTINUING.'
   65     CONTINUE
        ENDIF
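
The two restart tests compare different things: the non-MPP code counts
coarse steps (KTAU against NINT(SAVTIM/DTMIN)), while the MPP include
file compares the model clock (XTIME against SAVTIM). The stand-alone
sketch below is not MM5 code. It assumes, for illustration only, that
the clock advances as XTIME = XTIME + DTMIN in single precision once
per coarse step, and it uses the numbers from the report above
(DTMIN = 1.8 min, SAVTIM = 1080 min). SAVCHK, KSTEP and KCLOCK are
made-up names.

```fortran
      PROGRAM SAVCHK
C     Stand-alone illustration only; this is not MM5 code.
C     Assumption: the model clock advances as XTIME = XTIME + DTMIN
C     once per coarse step, with KTAU counting those steps.
      IMPLICIT NONE
      REAL XTIME, DTMIN, SAVTIM
      INTEGER KTAU, KSTEP, KCLOCK
C     Values from the reported run: DTMIN 1.8 min, stop at 1080 min
      DTMIN  = 1.8
      SAVTIM = 1080.
      XTIME  = 0.
      KSTEP  = 0
      KCLOCK = 0
      DO 10 KTAU = 1, 610
        XTIME = XTIME + DTMIN
C       Step-count test, negation of KTAU.LT.NINT(SAVTIM/DTMIN)
        IF (KSTEP.EQ.0 .AND. KTAU.GE.NINT(SAVTIM/DTMIN)) KSTEP = KTAU
C       Clock test, negation of XTIME.LT.SAVTIM
        IF (KCLOCK.EQ.0 .AND. XTIME.GE.SAVTIM) KCLOCK = KTAU
   10 CONTINUE
      PRINT *, 'STEP-COUNT TEST FIRST PASSES AT KTAU = ', KSTEP
      PRINT *, 'CLOCK TEST FIRST PASSES AT KTAU      = ', KCLOCK
      END
```

The step-count test first passes at KTAU = 600, since 1080/1.8 = 600.
The clock test passes at step 600 or 601, depending on how the
accumulated single-precision sum rounds on a given compiler and
machine, so no output is quoted here. This only shows why a step-count
test is less sensitive to rounding; it does not claim to reproduce the
reported failure.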

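[The fix is to modify the lines marked <<<<<<<<< in the mpp_output_10.incl
file, above, to match the marked lines in domain/io/output.F: use the
KTAU-based test to decide whether to write the restart, and reset XTIME
to XTIMC inside the IACTIV block. -Rotang's note]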
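
As a sketch of what that edit might look like, the top of
mpp_output_10.incl would read as follows. This is not the official
patch, and it is a fragment that will not compile on its own. It
assumes that KTAU, DTMIN and XTIMC are already visible wherever the
include file is expanded, which the excerpt above does not show.

```fortran
CSTART   mpp_output_10.incl
C     Changed: step-count test copied from domain/io/output.F
      IF((.NOT.IFSAVE).OR.(KTAU.LT.NINT(SAVTIM/DTMIN)))GOTO 80
      PRINT *,' TOTAL NUMBER OF DOMAINS POSSIBLE ON RESTART',
     *        ' OUTPUT IS ',NSTTOT
      DO 66 LLN=1,MAXNES
        IF(IACTIV(LLN).EQ.1)THEN
C     Added: RESET FINE MESH XTIME TO COARSE (CORRECTS TRUNCATION DRIFT)
          XTIME=XTIMC
          IUTSAV=51+(LLN-1)
C     The rest of the include file is unchanged.
```

Only the first IF and the added XTIME=XTIMC line differ from the
listing above; the official fix ships with the next release of MM5.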

Thanks for reporting this.

John