Gentle users,
This entry describes the fix for a problem that was encountered by a user of a Solaris distributed memory system. The original message to the helpdesk, with some additional detail about the problem, appears after the reply.
-Rotang, April 25, 2001
Date: Wed, 25 Apr 2001 16:06:11 -0600 (MDT)
From: John Michalakes
To: Michael.Walters@afit.edu
Cc: michalak@ucar.edu
Subject: RE: MPP on Sun

Michael,

It turns out that the seg-fault is coming from the Solaris version of the 'sort' command in the FLIC script. First the fix, then a bug report you can hand off to Sun if you want.

1) Fix

In the file MPP/FLIC/FLIC/flic.csh, add the 'echo' line that follows the comment with the word 'kludge' in it:

    if ( $s1_nocomments != yes ) then
      $FGREP TCOMMENT $TMP/flic_scanned.$$ |\
        $AWK -F: '{print $4}' > $TMP/flic_cnum.$$
      $FGREP TCOMMENT $TMP/flic_scanned.$$ |\
        $SED 's/^.*TCOMMENT://' > $TMP/flic_coms2.$$
      $PASTE $TMP/flic_cnum.$$ $TMP/flic_coms2.$$ > $TMP/flic_coms.$$
      $HARDRM $TMP/flic_cnum.$$ $TMP/flic_coms2.$$
    # kludge for Solaris sort command that segfaults if TMP/flic_coms is zero length
      echo " " >> $TMP/flic_coms.$$
    #
      $SORT -nm +0 -1 $TMP/flic_dat.$$ $TMP/flic_coms.$$ | $CUT -f2- | \
        $REASSEMBLE $TMP/bbb.$$ | $SED 's/CFLICBYE //'
    else
      $CUT -f2- $TMP/flic_dat.$$ | $SED 's/CFLICBYE //'
    endif

After you make this change, you will need to 'make uninstall' and then 'make mpp' again, so that FLIC is completely rebuilt.

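Editor's note: the kludge above works by making sure the comments file is never empty. Another way to get the same protection, sketched here in POSIX sh rather than the csh used by flic.csh, is to skip the merge whenever the second file has zero length. The function name merge_numbered is hypothetical and is not part of FLIC, and the -k 1,1 key is assumed to stand in for the older '+0 -1' syntax. This is an illustration of the idea only, not a tested replacement for the fix above.

```shell
#!/bin/sh
# Hypothetical helper, not part of FLIC: merge two pre-sorted, line-numbered
# files, but bypass sort entirely when the second file is empty.
merge_numbered() {
    dat=$1
    coms=$2
    if [ -s "$coms" ]; then
        # -s is true only when the file exists and has non-zero size.
        # -k 1,1 is the POSIX spelling of the obsolete "+0 -1" key.
        sort -n -m -k 1,1 "$dat" "$coms"
    else
        # Nothing to merge, so pass the data file through unchanged.
        cat "$dat"
    fi
}
```

A call such as merge_numbered flic_dat flic_coms would then print flic_dat unchanged when flic_coms is empty, without ever invoking sort.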
2) Bug report for Sun

This is a log of a terminal session on your system that demonstrates the bug in sort:

------------------------------------------------------
% cat > flic_dat
1 SUBROUTINE KFBMDATA
#include
#include
FLIC_RUN_DECL
2 COMMON /VAPPRS/ALIQ,BLIQ,CLIQ,DLIQ,AICE,BICE,CICE,DICE,XLS0,XLS1
3 DATA ALIQ,BLIQ,CLIQ,DLIQ/613.3,17.502,4780.8,32.19/
4 DATA AICE,BICE,CICE,DICE/613.2,22.452,6133.0,0.61/
5 DATA XLS0,XLS1/2.905E6,259.532/
CFLIC END DECLARATIONS
6 RETURN
7 END
% touch flic_coms
% ls -l flic_dat flic_coms
-rw-rw-r-- 1 jmichala staff 0 Apr 25 18:03 flic_coms
-rw-rw-r-- 1 jmichala staff 414 Apr 25 18:03 flic_dat
% sort -nm +0 -1 flic_dat flic_coms
Segmentation fault
% echo " " >> flic_coms
% sort -nm +0 -1 flic_dat flic_coms
1 SUBROUTINE KFBMDATA
#include
#include
FLIC_RUN_DECL
2 COMMON /VAPPRS/ALIQ,BLIQ,CLIQ,DLIQ,AICE,BICE,CICE,DICE,XLS0,XLS1
3 DATA ALIQ,BLIQ,CLIQ,DLIQ/613.3,17.502,4780.8,32.19/
4 DATA AICE,BICE,CICE,DICE/613.2,22.452,6133.0,0.61/
5 DATA XLS0,XLS1/2.905E6,259.532/
CFLIC END DECLARATIONS
6 RETURN
7 END
%
------------------------------------------------------

The gist of this is that if the second file in the sort/merge is zero length, sort seg-faults.

-John

----------------------------------------------------------------------
John Michalakes, michalak@ucar.edu, http://www.mcs.anl.gov/~michalak
----------------------------------------------------------------------
MCS Division                | MMM Division
Argonne National Laboratory | National Center for Atmospheric Research
                            | 3450 Mitchell Lane, Boulder, CO 80301
                            | 303-497-8199
----------------------------------------------------------------------

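Editor's note: to find out whether a given system's sort has the same problem, the session above can be reduced to a small script. The sketch below is in POSIX sh. The function name check_empty_merge and its messages are hypothetical, the three-line data file is a shortened stand-in for flic_dat, the -k 1,1 key is assumed to stand in for the older '+0 -1' syntax, and mktemp -d is assumed to be available.

```shell
#!/bin/sh
# Sketch: report whether the local sort can merge a file with an empty one.
check_empty_merge() {
    dir=$(mktemp -d) || return 2
    # Shortened stand-in for flic_dat: line number, tab, text.
    printf '1\tSUBROUTINE KFBMDATA\n2\tRETURN\n3\tEND\n' > "$dir/flic_dat"
    : > "$dir/flic_coms"    # zero-length second input, as created by touch
    if sort -n -m -k 1,1 "$dir/flic_dat" "$dir/flic_coms" > /dev/null 2>&1; then
        echo "sort handled the empty file"
        status=0
    else
        # A crash or any other non-zero exit from sort ends up here.
        echo "sort failed on the empty file"
        status=1
    fi
    rm -rf "$dir"
    return $status
}
```

Running check_empty_merge prints one of the two messages and returns 0 or 1 accordingly, so it can also be used as a test in a configure-style script.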
Hello,

I am having trouble compiling the MPP version of MM5V3 under Sun MPI on a cluster of Sparc workstations. I am using the sunmpi portion of the configure.user file in section 7 without modification. Specifically, several object files are not built properly so the link fails. The compiler errors are:

    ld: fatal: file fkill_model.o: cannot open file: No such file or directory
    ld: fatal: file kfbmdata.o: cannot open file: No such file or directory
    ld: fatal: file mparrcopy.o: cannot open file: No such file or directory
    ld: fatal: file savread.o: cannot open file: No such file or directory
    ld: fatal: file write_flag.o: cannot open file: No such file or directory
    ld: fatal: File processing errors. No output written to mm5.mpp
    *** Error code 1 (ignored)

Each of the associated files (like fkill_model.f) contains only a single line with a "Segmentation Fault (core dumped)" error, so it appears to me that the FLIC processing of these files is not proceeding properly. The FLIC processing and compilation of all the other files appears to proceed without problem.

I am using Sun Workshop 6 Fortran and C compilers. I have also tried using the gnu version of make without success. If anybody has any ideas about what is causing this problem, I would appreciate hearing from them.

Thanks

Mike Walters
Department of Engineering Physics
Air Force Institute of Technology
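Editor's note: the symptom described in this message, generated .f files that hold only the crash message, can be checked for directly. The POSIX sh sketch below lists such files in a directory. The function name list_segfault_outputs is hypothetical, and the search assumes the message text matches the one quoted above apart from letter case.

```shell
#!/bin/sh
# Hypothetical helper: list generated Fortran files in a directory that
# contain the crash message instead of source code.
list_segfault_outputs() {
    dir=${1:-.}
    # -l prints only the names of matching files; -i ignores letter case.
    grep -il 'Segmentation Fault' "$dir"/*.f 2>/dev/null
}
```

Called with no argument it searches the current directory. An empty result means no .f file there contains the message.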