Thomas Stocker, Christoph C. Raible,
Masakazu Yoshimori, Neil Edwards, Urs Beyerle, & Manuel Renold
Climate and Environmental Physics (KUP)
Physics Institute, University of Bern, Sidlerstrasse 5, CH-3012 Bern, Switzerland
Status
- Installation and performance testing of the Portable University
Model of the Atmosphere (PUMA-2) on a local Linux cluster.
- Installation of the Community Climate System Model (CCSM-Paleo, version 1.4)
on an IBM Power-4 (which contains 8 frames with 32 CPUs) in collaboration with
specialists of the Swiss Center for Scientific Computing (CSCS) in Manno.
- Performance testing of the CCSM on the IBM.
- Comparison of PUMA-2 and CCSM-Paleo 1.4.
- Installation of CCSM on a local Linux cluster (max. 32 CPUs) and
performance testing.
- Building up a collection of diagnostic tools for oceanic and
atmospheric analyses (internal use only!).
Performance of PUMA-2
Here, model performance is tested as a function of the number of CPUs,
the computer architecture, and the compiler.
| Computer                  | Compiler                | CPUs | Avg. time per year | Additional info  |
| Compaq Workstation        | Compaq F90, version 5.3 |      |                    |                  |
| XP1000 (500 MHz)          |                         | 1    | 27 min             |                  |
|                           |                         | 2    | 33 min             | 100 Mbit network |
| Alphaserver 2100 (666 MHz)|                         | 1    | 24 min             |                  |
|                           |                         | 2    | 16 min             | Dual-board       |
| Linux PC (AMD1400)        | pgf90                   | 1    | 23 min             |                  |
| Linux PC (AMD1800)        | pgf90                   | 1    | 19 min             |                  |
|                           |                         | 2    | 12 min             | Dual-board       |
| Linux PC (AMD1800)        | Absoft F90              | 1    | 25 min             |                  |
|                           |                         | 2    | 15 min             | Dual-board       |
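The 1-CPU and 2-CPU timings in the PUMA-2 table can be turned into a speedup and a parallel efficiency. The following is a minimal sketch, assuming the usual definitions (speedup = time on 1 CPU / time on n CPUs, efficiency = speedup / n). The dictionary keys and function names are illustrative and not part of the original test setup.

```python
# Minutes per simulated year for PUMA-2, copied from the table:
# label -> (time on 1 CPU, time on 2 CPUs). Labels are illustrative.
timings = {
    "XP1000 (100 Mbit network)": (27, 33),
    "Alphaserver 2100 (dual-board)": (24, 16),
    "Linux PC AMD1800, pgf90 (dual-board)": (19, 12),
    "Linux PC AMD1800, Absoft F90 (dual-board)": (25, 15),
}


def speedup(t_one, t_n):
    """Speedup: run time on one CPU divided by run time on n CPUs."""
    return t_one / t_n


def efficiency(t_one, t_n, n_cpus):
    """Parallel efficiency: speedup divided by the number of CPUs."""
    return speedup(t_one, t_n) / n_cpus


def summarize(table, n_cpus=2):
    """Return {label: (speedup, efficiency)} for every entry in the table."""
    return {
        label: (speedup(t1, tn), efficiency(t1, tn, n_cpus))
        for label, (t1, tn) in table.items()
    }


for label, (s, e) in summarize(timings).items():
    print(f"{label}: speedup {s:.2f}, efficiency {e:.2f}")
```

With these numbers the dual-board machines reach a speedup of about 1.5 to 1.7 on 2 CPUs, while the two XP1000 workstations coupled over the 100 Mbit network run slower than a single one (speedup below 1).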
Performance of CCSM-Paleo 1.4
The parameters used for testing are: total number of CPUs, compiler
optimization options, number of CPUs per model component, and degree
of parallelization (threading).
Tests with 8 CPUs: (cpl atm ocn ice lnd) = (1 4 1 1 1)
| Test name  | Threads | Compile option | Avg. time per month           |
| opt2.old   | 1       | -O2,-qstrict   | 37 min.                       |
| opt3.old   | 1       | -O3,-qstrict   | 42 min.                       |
| opt3.2.old | 1       | -O3            | 41 min.                       |
| opt2       | 4       | -O2,-qstrict   | 12 min. (NCAR standard setup) |
| opt3       | 4       | -O3,-qstrict   | 15 min.                       |
| opt3.2     | 4       | -O3            | 15 min.                       |
- "threading" is the most effective factor.
- the optimization level makes little difference; -O3 is in fact slightly
slower than -O2 (e.g. 15 min. vs. 12 min. with 4 threads).
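The two observations above can be checked directly against the 8-CPU timings. The sketch below is only an illustration: it compares the gain from going from 1 to 4 threads with the spread between compile options at a fixed thread count. The variable and function names are assumptions made for this example.

```python
# Minutes per simulated month from the 8-CPU tests:
# compile option -> (time with 1 thread, time with 4 threads)
tests = {
    "-O2,-qstrict": (37, 12),
    "-O3,-qstrict": (42, 15),
    "-O3": (41, 15),
}

# Gain from threading: how many times faster 4 threads are than 1 thread.
threading_gain = {opt: t1 / t4 for opt, (t1, t4) in tests.items()}


def spread(values):
    """Ratio of the slowest to the fastest run in a group."""
    return max(values) / min(values)


# Spread between compile options at a fixed number of threads.
option_spread_1_thread = spread([t1 for t1, _ in tests.values()])
option_spread_4_threads = spread([t4 for _, t4 in tests.values()])

for opt, gain in threading_gain.items():
    print(f"{opt}: 4 threads are {gain:.2f}x faster than 1 thread")
print(f"spread between options, 1 thread:  {option_spread_1_thread:.2f}x")
print(f"spread between options, 4 threads: {option_spread_4_threads:.2f}x")
```

Threading gives a factor of roughly 2.7 to 3.1, whereas the choice of compile option changes the run time by a factor of at most 1.25.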
Tests with 16 CPUs: -O2,-qstrict
| Test name | Threads | cpl | atm | ocn | ice | lnd | Avg. time per month            |
| cpu16.1   | 4       | 1   | 8   | 4   | 1   | 2   | 14 min.                        |
| cpu16.2   | 8       | 1   | 8   | 4   | 1   | 2   | 9 min.                         |
| cpu16.9   | 12      | 1   | 8   | 4   | 1   | 2   | stops after 9 months           |
| cpu16.10  | 16      | 1   | 8   | 4   | 1   | 2   | stops after 8 months           |
| cpu16.3   | 4       | 1   | 8   | 4   | 2   | 1   | 13 min.                        |
| cpu16.12  | 8       | 1   | 8   | 4   | 2   | 1   | stops after 7 months           |
| cpu16.4   | 4       | 1   | 8   | 5   | 1   | 1   | 13 min.                        |
| cpu16.5   | 4       | 1   | 12  | 1   | 1   | 1   | 12 min.                        |
| cpu16.11  | 12      | 1   | 12  | 1   | 1   | 1   | 12 min. (but ranging 7-17 min.)|
| cpu16.6   | 4       | 1   | 10  | 3   | 1   | 1   | atm = 10 CPUs never works      |
| cpu16.7   | 4       | 1   | 6   | 7   | 1   | 1   | 16 min.                        |
| cpu16.8   | 4       | 1   | 4   | 9   | 1   | 1   | 16 min.                        |
| cpu16.13  | 8       | 2   | 8   | 2   | 2   | 2   | 9 min.                         |
| mrenold   | 6       | 1   | 6   | 6   | 2   | 1   | 16 min.                        |
| cpu16.14  | 6       | 1   | 6   | 6   | 1   | 2   | 15 min.                        |
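The "avg. time per month" values can be converted into simulated years per wall-clock day, which is the unit used for the throughput estimate of about 10 years per day in the model comparison. This is a rough sketch only: it assumes the machine runs continuously, with no queue waiting time or restart overhead, and the function name is made up for this example.

```python
MINUTES_PER_DAY = 24 * 60
MONTHS_PER_YEAR = 12


def simulated_years_per_day(minutes_per_month):
    """Simulated years per wall-clock day for a given run time per model month.

    Assumes uninterrupted running (no queue waits, no restart overhead).
    """
    minutes_per_year = minutes_per_month * MONTHS_PER_YEAR
    return MINUTES_PER_DAY / minutes_per_year


# Fastest, intermediate and slowest timings from the 16-CPU tests.
for minutes in (9, 12, 16):
    years = simulated_years_per_day(minutes)
    print(f"{minutes} min. per month: {years:.1f} simulated years per day")
```

Under these assumptions, 12 min. per month corresponds to 10 simulated years per day, the 9 min. setups to about 13, and the 16 min. setups to 7.5.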
Comparison of PUMA-2 and CCSM-Paleo 1.4
|                | CCSM-Paleo                                                     | PUMA-2                                                                     |
| Advantages:    | comprehensive model                                            | low computational costs                                                    |
|                | coupled without flux corrections                               |                                                                            |
| Disadvantages: | computationally expensive, but ~10 years per day is attainable | not yet coupled, presumably with flux correction                           |
|                |                                                                | poor climatology (esp. precipitation; NAO centers are displaced eastwards) |