Testing/Benchmarking : SESAME Case
The data in these tables are for a small-domain, 12-h MM5
forecast (known as the SESAME squall). The values listed are the
elapsed times in seconds to complete the forecast (measured with the
UNIX timex utility). Several different Compaq architectures were
tested: DPW (the workstations), the 4100 pedestals and the 8400.
The current default compiler options as well as a more aggressive
set of compiler options were tested. Finally, the effect of using
the OpenMP option for parallel processing was tested. Virtually
no difference was found between f77 and f90, given the same compiler
flags on the same architecture.
The machines were not in single-user mode, though the CPU utilization
was less than 1% throughout the testing. For this application, the
most sensitive factor was the architecture (the single-processor 4100
was 6-10% faster than the single-processor DPW and 8400), followed by
compiler options (a 2-4% difference between the current default options
used in the model and the best set found so far), and finally OpenMP
(a penalty of approximately 1% for OpenMP overhead when comparing
single-processor runs).
The data in this table are for the runs without OpenMP, so each run
uses only a single processor.
| Architecture | Default Compiler Options (s) | Fast Compiler Options (s) |
|--------------|------------------------------|---------------------------|
| DPW          | 406                          | 398                       |
| 4100         | 383                          | 369                       |
| 8400         | 418                          | 401                       |
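The percentage differences quoted in the text can be checked against the single-processor timings with a few lines of arithmetic. The sketch below is only an illustration: the helper name `percent_slower` and the choice of the faster run as the denominator are assumptions, so the results may differ slightly from figures computed with another convention.

```python
# Elapsed times in seconds from the single-processor (no OpenMP) table.
times = {
    "DPW":  {"default": 406, "fast": 398},
    "4100": {"default": 383, "fast": 369},
    "8400": {"default": 418, "fast": 401},
}

def percent_slower(slow, fast):
    """How much longer `slow` takes than `fast`, as a percentage of `fast`.

    Hypothetical helper; the denominator convention is an assumption.
    """
    return 100.0 * (slow - fast) / fast

# Effect of the compiler options on each architecture.
for machine, t in times.items():
    gain = percent_slower(t["default"], t["fast"])
    print(f"{machine}: default options are {gain:.1f}% slower than fast options")

# Effect of the architecture: compare DPW and 8400 against the 4100.
for opts in ("default", "fast"):
    for machine in ("DPW", "8400"):
        gap = percent_slower(times[machine][opts], times["4100"][opts])
        print(f"{opts} options: {machine} is {gap:.1f}% slower than the 4100")
```

With this convention the compiler-option differences and the architecture differences fall close to the 2-4% and 6-10% ranges given in the text.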
Data for this table include the effects of OpenMP,
allowing multiple-processor runs. Parallel speed-up is computed
as the ratio of the single-processor elapsed time
to the multiple-processor elapsed time.
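| Number of Processors | 4100, Default Compiler Options, OpenMP (s) | 4100, Fast Compiler Options, OpenMP (s) | 8400, Default Compiler Options, OpenMP (s) | 8400, Fast Compiler Options, OpenMP (s) |
|----------------------|--------------------------------------------|-----------------------------------------|--------------------------------------------|-----------------------------------------|
| 1                    | 386                                        | 373                                     | 423                                        | 408                                     |
| 2                    | 210                                        | 202                                     | 232                                        | 223                                     |
| 3                    | 151                                        | 147                                     | 168                                        | 163                                     |
| 4                    | 124                                        | 121                                     | 137                                        | 133                                     |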
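The parallel speed-up defined above (single-processor elapsed time divided by multiple-processor elapsed time) can be computed directly from the OpenMP timings. The following is a minimal sketch; the function names are hypothetical, and the parallel efficiency column (speed-up divided by processor count) is an extra metric that does not appear in the original tables.

```python
# Elapsed times (s) from the OpenMP table; list index 0..3 is 1..4 processors.
openmp_times = {
    "4100 default": [386, 210, 151, 124],
    "4100 fast":    [373, 202, 147, 121],
    "8400 default": [423, 232, 168, 137],
    "8400 fast":    [408, 223, 163, 133],
}

def speedup(times):
    """Ratio of the single-processor time to each multi-processor time."""
    t1 = times[0]
    return [t1 / t for t in times]

def efficiency(times):
    """Speed-up divided by processor count; 1.0 would be ideal scaling.

    Not part of the original data; added here for illustration only.
    """
    return [s / (i + 1) for i, s in enumerate(speedup(times))]

for config, times in openmp_times.items():
    s = speedup(times)
    e = efficiency(times)
    for i in range(len(times)):
        print(f"{config}, {i + 1} CPU: speed-up {s[i]:.2f}, efficiency {e[i]:.0%}")
```

For the timings listed, the four-processor speed-up is a little over 3 on both the 4100 and the 8400.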