The information contained on the Parallel MM5 Benchmarks web page is computed in the following manner:
The model is run using the namelist, configure.user, and data sets provided in ftp://ftp.ucar.edu/mesouser/MM5V3/TESTDATA/mm5_t3a_bench.tar.gz . This specifies a 3 hour run (TIMAX = 180) with an 81 second time step. The speed of the run is calculated from the average time per time step over the last of the 3 hours, excluding the cost of the very last time step. The raw timing information is contained in the file rsl.error.0000 produced by the model (this is the standard-error output from the first MPI task, normally task zero).
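As a rough check on the size of that averaging window, the step counts follow directly from the run length and the time step. The sketch below is only illustrative: the function name whole_steps is hypothetical and not part of the benchmark scripts, and it assumes TIMAX is given in minutes, so that 180 corresponds to 10800 seconds.

```sh
# whole_steps SECONDS DT
# Print the number of whole time steps of DT seconds that fit in SECONDS.
# (Hypothetical helper for illustration; not part of the MM5 distribution.)
whole_steps() {
    awk -v s="$1" -v dt="$2" 'BEGIN { print int(s / dt) }'
}

whole_steps 10800 81   # full 3 hour run: prints 133
whole_steps 3600 81    # one simulated hour: prints 44
```

One simulated hour therefore spans 44 whole time steps, which is the number of per-step timings that enter the average.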
The following UNIX shell command will calculate this average:
mm5etime rsl.error.0000 | tail -45 | head -44 | awk -f stats.awk
where mm5etime is an executable csh script file containing:
#!/bin/csh
set infile=$1
if ( $infile == "" ) set infile=rsl.error.0000
if ( ! -f $infile ) set infile=rsl.error.0000
grep '\*\*\* *[0-9]* *1 ' $infile | \
awk '{if(prev!=0)print $5-prev;prev=$5}'
and stats.awk is an awk script containing:
BEGIN{ a = 0.0 ; i = 0 ; max = -999999999 ; min = 9999999999 }
{
  i ++
  a += $1
  if ( $1 > max ) max = $1
  if ( $1 < min ) min = $1
}
END{
  printf("---\n%10s %8d\n%10s %15f\n%10s %15f\n%10s %15f\n%10s %15f\n%10s %15f\n",
         "items:",i,
         "max:",max,
         "min:",min,
         "sum:",a,
         "mean:",a/(i*1.0),
         "mean/max:",(a/(i*1.0))/max)
}
The output of the command will look something like this:
    items:       44
      max:      375.000000
      min:      144.000000
      sum:     6879.000000
     mean:      156.340909
 mean/max:        0.416909
The number of interest, the average time per time step, is the mean value. The estimated operation count for the T3A case, used to calculate Mflop/second in the Parallel MM5 Benchmarks, is 2,398 million floating point operations per time step. Dividing this count by the mean time per time step gives the number of floating point operations per second.
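The final division can be sketched as a small shell function. This is an illustration under stated assumptions rather than the script used for the published figures: the function name mflops is hypothetical, the mean is assumed to have been converted to seconds (the text above does not state the unit of the values in rsl.error.0000), and the sample argument of 2.0 seconds is made up and is not a measured result.

```sh
# mflops MEAN_SECONDS
# Print Mflop/second for the T3A case, given the mean time per time step
# in seconds (assumed unit). 2398 is the estimated operation count per
# time step, in millions of floating point operations, quoted in the text.
mflops() {
    awk -v t="$1" 'BEGIN { printf "%.1f\n", 2398 / t }'
}

mflops 2.0   # hypothetical mean of 2 seconds per step: prints 1199.0
```

Because the operation count is fixed per time step, the result scales inversely with the mean: halving the time per step doubles the reported rate.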