Sender: wesley@sgi.com
Date: Mon, 19 Jun 2000 14:42:34 -0700
From: Wesley Jones
To: "John G. Michalakes", toigo@gps.caltech.edu
Subject: Re: Fwd: Question on an MPI error

Hi Anthony,

Per the MPI man page, "man mpi," MPI_MSGS_PER_PROC is an environment
variable:

     MPI_MSGS_PER_PROC
          Sets the maximum number of message headers to be allocated
          from sending process space for outbound messages going to
          the same host.  (This variable might be required by
          standard-compliant programs.)  MPI allocates buffer space
          for local messages based on the message destination.  Space
          for messages that are destined for local processes is
          allocated as additional process space for the sending
          process.
          Default: 1024

A large number of message headers is only required at the very beginning
of MM5, where it is distributing a bunch of information to the rest of
the MPI ranks via a call to broadcast.  You should be able to overcome
the problem with

     csh:  setenv MPI_MSGS_PER_PROC 4096
     ksh:  export MPI_MSGS_PER_PROC=4096

You might want to try setting the MPI_STATS environment variable and see
if you have any RETRIES.  If you do, you can try to increase the
appropriate buffer by looking at the end of the error output file, where
the information about retries will be located, and by looking at the man
page.  After testing for RETRIES you will want to unset the MPI_STATS
environment variable.

Let me know if you have other problems,

Wes

"John G. Michalakes" wrote:
>
> Wes, do these error messages coming out of the SGI version of MPI make any
> sense to you?  Any ideas about what to do?
>
> John
>
> >From: Anthony Toigo
> >Date: Sun, 18 Jun 2000 22:57:30 -0700 (PDT)
> >To: "John G. Michalakes"
> >Subject: Question on an MPI error
> >X-Mailer: VM 6.43 under 20.4 "Emerald" XEmacs Lucid
> >Reply-To: toigo@gps.caltech.edu
> >
> > . . . .
> >
> >I tried taking your advice and running my modified model with as few
> >modifications as possible, and I came up with a strange error.  I was
> >wondering if you could identify it for me.
> >
> >Although the question is simple, the setup is a little long:
> >
> >My test run is (up to) 4 domains, with the following settings (digested
> >from the deck file, extra values removed for ease of reading):
> >
> >LEVIDN = 0,1,2,3,          ; level of nest for each domain
> >NUMNC  = 1,1,2,3,          ; ID of mother domain for each nest
> >NESTIX = 72, 76, 76, 76,   ; domain size i
> >NESTJX = 72, 76, 76, 76,   ; domain size j
> >NESTI  = 1, 24, 26, 26,    ; start location i
> >NESTJ  = 1, 24, 26, 26,    ; start location j
> >
> >Using an SGI Origin 2000, I got the code to run successfully with one and
> >two domains.  However, when I switched to 3 or 4 domains, I got the
> >following errors:
> >
> >% timex mpirun -v -np 64 ./mm5.mpp
> >MPI: libxmpi.so 'SGI MPI 3.2.0.7 01/31/00 12:48:41'
> >MPI: libmpi.so 'SGI MPI 3.2.0.7 01/31/00 11:47:23 (N32_M4)'
> >MPI: MPI_MSGS_PER_HOST= 0
> >MPI: MPI_MSGS_PER_PROC= 1024
> >MPI: MPI_MSG_RETRIES= 500
> >MPI: MPI_BUFS_PER_HOST= 0
> >MPI: MPI_BUFS_PER_PROC= 32
> >MPI: MPI_MSG_LISTS= 8
> >MPI: MPI_BUF_LISTS= 0
> >.
> >. (allocated processor chatter removed)
> >.
> >MPI: MPI_COMM_WORLD rank 0 has terminated without calling MPI_Finalize()
> >MPI: aborting job
> >
> >Looking in the rsl.error.0000 file:
> >
> >% more rsl.error.0000
> >*** MPI has run out of PER_PROC message headers.
> >*** The current allocation levels are:
> >*** MPI_MSGS_PER_HOST = 0
> >*** MPI_MSGS_PER_PROC = 1024
> >*** MPI_MSG_RETRIES = 500
> >IOT Trap
> >
> >Finally, I get a core file that I don't usually get on the Beowulf clusters:
> >
> >The core file reports were different for the 3 and 4 domain runs, although
> >all of the above information (output from the mpirun command and the
> >contents of rsl.error.0000) was identical.
> >
> >For 3 domains:
> >
> >% cvdump mm5.mpp core.277591
> >Executable: /tmp/toigo/MM5/Run/mm5.mpp
> >Core file: /tmp/toigo/MM5/Run/core.277591
> >Core from signal SIGABRT: Abort (see abort(3c))
> >=========================================
> >
> >_kill() ["kill.s":15, 0x0fad4928]
> >_raise() ["raise.c":27, 0x0fad52a4]
> >abort() ["abort.c":52, 0x0fa3de40]
> >sigdie() ["main.c":156, 0x0ad99d24]
> >sigidie() ["main.c":117, 0x0ad99c00]
> >_sigtramp() ["sigtramp.s":71, 0x0fad4dcc]
> >_kill() ["kill.s":15, 0x0fad4928]
> >_raise() ["raise.c":27, 0x0fad52a4]
> >abort() ["abort.c":44, 0x0fa3de0c]
> >MPI_SGI_request_send() ["req.c":235, 0x030c8968]
> >PMPI_Send() ["send.c":82, 0x03102734]
> >rsl_mon_bcast_() ["rsl_mon_bcast.c":131, 0x100de070]
> >DM_BCAST_INTEGERS() ["dm_io.f":249, 0x10089108]
> >PARAM() ["param.f":1437, 0x10030910]
> >MM5() ["mm5.f":895, 0x1001c704]
> >main() ["main.c":97, 0x0ad99b20]
> >__start() ["crt1text.s":177, 0x100086e8]
> >
> >where param.f:1437 is:
> >      CALL DM_BCAST_INTEGERS(START_INDEX,4)
> >
> >and dm_io.f:249 is:
> >      CALL RSL_MON_BCAST( BUF, N*4 )
> >in:
> >      SUBROUTINE DM_BCAST_INTEGERS( BUF, N )
> >      IMPLICIT NONE
> >      INTEGER BUF(*)
> >      INTEGER N
> >      CALL RSL_MON_BCAST( BUF, N*4 )
> >      RETURN
> >      END
> >
> >For 4 domains, it stopped on a different part of param.f:
> >
> >% cvdump mm5.mpp core.273381
> >Executable: /tmp/toigo/MM5/Run/mm5.mpp
> >Core file: /tmp/toigo/MM5/Run/core.273381
> >Core from signal SIGABRT: Abort (see abort(3c))
> >=========================================
> >
> >_kill() ["kill.s":15, 0x0fad4928]
> >_raise() ["raise.c":27, 0x0fad52a4]
> >abort() ["abort.c":52, 0x0fa3de40]
> >sigdie() ["main.c":156, 0x0ad99d24]
> >sigidie() ["main.c":117, 0x0ad99c00]
> >_sigtramp() ["sigtramp.s":71, 0x0fad4dcc]
> >_kill() ["kill.s":15, 0x0fad4928]
> >_raise() ["raise.c":27, 0x0fad52a4]
> >abort() ["abort.c":44, 0x0fa3de0c]
> >MPI_SGI_request_send() ["req.c":235, 0x030c8968]
> >PMPI_Send() ["send.c":82, 0x03102734]
> >rsl_mon_bcast_() ["rsl_mon_bcast.c":131, 0x100de5e0]
> >DM_BCAST_STRING() ["dm_io.f":287, 0x10089880]
> >PARAM() ["param.f":1440, 0x10030e60]
> >MM5() ["mm5.f":895, 0x1001cb9c]
> >main() ["main.c":97, 0x0ad99b20]
> >__start() ["crt1text.s":177, 0x10008708]
> >
> >Line 1440 of param.f is:
> >      CALL DM_BCAST_STRING(STAGGERING,4)
> >
> >Line 287 of dm_io.f is:
> >      CALL RSL_MON_BCAST( IBUF, N*4 )
> >in:
> >      SUBROUTINE DM_BCAST_STRING( BUF, N )
> >      IMPLICIT NONE
> >      INTEGER N
> >      CHARACTER*(*) BUF
> >      INTEGER IBUF(256),I
> >      IF (N .GT. 256) N = 256
> >      IF (N .GT. 0 ) THEN
> >        DO I = 1, N
> >          IBUF(I) = ICHAR(BUF(I:I))
> >        ENDDO
> >        CALL RSL_MON_BCAST( IBUF, N*4 )
> >        DO I = 1, N
> >          BUF(I:I) = CHAR(IBUF(I))
> >        ENDDO
> >      ENDIF
> >      RETURN
> >      END
> >
> >and I guess the rest you would know ...
> >
> >My question is: is this some error caused by my modifications, or is it
> >something about setting up parallel runs that I don't understand?
> >
> >I'm hoping it's a set-up error, since I got it to work with one or two
> >domains.  If you need any further information, please let me know.
> >
> >Once again, thank you very much for taking the time to help me out.
> >
> >Anthony Toigo
>
> ----------------------------------------------------------------------
> John Michalakes, michalak@ucar.edu, http://www.mcs.anl.gov/~michalak
> ----------------------------------------------------------------------
> MCS Division                | MMM Division
> Argonne National Laboratory | National Center for Atmospheric Research
>                             | 3450 Mitchell Lane, Boulder, CO 80301
>                             | 303-497-8199
> ----------------------------------------------------------------------

--
Wesley B. Jones, PhD           wesley@sgi.com
SGI, Boulder, CO               Phone: (303)-448-1165
Performance Engineering        FAX:
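
Putting Wes's suggestions together, a minimal csh session for the failing
run might look like the sketch below.  The 4096 value, the RETRIES check,
and the final unset come from Wes's message, and the mpirun command line
from Anthony's; using the value 1 to enable MPI_STATS is an assumption
here, so check "man mpi" for the exact convention on your system.

     # csh: raise the per-process message-header limit before launching MM5
     setenv MPI_MSGS_PER_PROC 4096

     # optionally collect MPI statistics so any RETRIES are reported in the
     # error output (value convention assumed; see "man mpi")
     setenv MPI_STATS 1

     # rerun the failing 3- or 4-domain case
     timex mpirun -v -np 64 ./mm5.mpp

     # once the RETRIES check is done, turn statistics back off
     unsetenv MPI_STATS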