MONALISA: Modelling and Reconstruction of North Atlantic
Climate System Variability (Work package 1.1)
Thomas Stocker, Christoph C. Raible, Masakazu Yoshimori,
Neil Edwards, Urs Beyerle, & Manuel Renold

Climate and Environmental Physics (KUP)
Physics Institute, University of Bern, Sidlerstrasse 5, CH-3012 Bern, Switzerland


Status

  • Installation and performance testing of the Portable University Model of the Atmosphere (PUMA-2) on a local Linux cluster.

  • Installation of the Community Climate System Model (CCSM-Paleo, version 1.4) on an IBM Power-4 (8 frames with 32 CPUs) in collaboration with specialists of the Swiss Centre of Scientific Computing (SCSC) in Manno.

  • Performance testing of the CCSM on the IBM.

  • Comparison of PUMA-2 and CCSM-Paleo 1.4.

  • Installation and performance testing of CCSM on a local Linux cluster (max. 32 CPUs).

  • Building up a collection of diagnostic tools for oceanic and atmospheric analyses (internal use only!).

Performance of PUMA-2

Here, model performance is tested as a function of the number of CPUs, the computer architecture, and the compiler.

Computer                      Compiler                  CPUs  Avg. time per year  Notes
Compaq Workstation            Compaq F90, version 5.3
  XP1000 (500 MHz)                                      1     27 min
                                                        2     33 min              100 Mbit network
  Alphaserver 2100 (666 MHz)                            1     24 min
                                                        2     16 min              Dual-board
Linux PC (AMD1400)            pgf90                     1     23 min
Linux PC (AMD1800)            pgf90                     1     19 min
                                                        2     12 min              Dual-board
Linux PC (AMD1800)            Absoft F90                1     25 min
                                                        2     15 min              Dual-board
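
One way to read the PUMA-2 timings is as speedup and parallel efficiency of the two-CPU runs relative to the single-CPU run on the same machine. The following is a minimal sketch: the timings are copied from the table, while the dictionary layout and the function names are illustrative assumptions and not part of PUMA-2.

```python
# Timings in minutes per simulated year: (1 CPU, 2 CPUs), copied from the table.
# The dictionary keys and function names are illustrative, not part of PUMA-2.
TIMINGS_MIN_PER_YEAR = {
    "XP1000, Compaq F90 (100 Mbit network)": (27, 33),
    "Alphaserver 2100, Compaq F90 (dual-board)": (24, 16),
    "Linux PC AMD1800, pgf90 (dual-board)": (19, 12),
    "Linux PC AMD1800, Absoft F90 (dual-board)": (25, 15),
}


def speedup(t_one_cpu, t_n_cpus):
    """Ratio of single-CPU run time to multi-CPU run time."""
    return t_one_cpu / t_n_cpus


def efficiency(t_one_cpu, t_n_cpus, n_cpus):
    """Speedup divided by the number of CPUs (1.0 would be ideal scaling)."""
    return speedup(t_one_cpu, t_n_cpus) / n_cpus


for machine, (t1, t2) in TIMINGS_MIN_PER_YEAR.items():
    print(f"{machine}: speedup {speedup(t1, t2):.2f}, "
          f"efficiency {efficiency(t1, t2, 2):.2f}")
```

On these numbers, the dual-board runs reach speedups of about 1.5 to 1.7, whereas the XP1000 run over the 100 Mbit network is slower with two CPUs than with one (speedup below 1).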

Performance of CCSM-Paleo 1.4

The parameters varied in the tests are the total number of CPUs, the compiler optimization options, the number of CPUs per model component, and the degree of parallelization (threading).

Tests with 8 CPUs, distributed over the model components (coupler, atmosphere, ocean, sea ice, land) as (cpl, atm, ocn, ice, lnd) = (1, 4, 1, 1, 1):

Test name   Threads  Compile options  Avg. time per month
opt2.old    1        -O2,-qstrict     37 min
opt3.old    1        -O3,-qstrict     42 min
opt3.2.old  1        -O3              41 min
opt2        4        -O2,-qstrict     12 min (NCAR standard setup)
opt3        4        -O3,-qstrict     15 min
opt3.2      4        -O3              15 min

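- Threading is the most effective factor: with 4 threads the time per month drops by roughly a factor of three.
- The optimization options make comparatively small differences; -O3 is slightly slower than -O2.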
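
The two conclusions can be checked directly against the 8-CPU timings. The sketch below compares the effect of threading with the spread across compile options. The numbers are copied from the table, and the data layout and function names are illustrative assumptions.

```python
# Average minutes per simulated month, copied from the 8-CPU table.
# Keys are (threads, compile options); this layout is an illustrative choice.
MIN_PER_MONTH = {
    (1, "-O2,-qstrict"): 37,
    (1, "-O3,-qstrict"): 42,
    (1, "-O3"): 41,
    (4, "-O2,-qstrict"): 12,
    (4, "-O3,-qstrict"): 15,
    (4, "-O3"): 15,
}


def threading_gain(option):
    """Ratio of 1-thread to 4-thread run time for one set of compile options."""
    return MIN_PER_MONTH[(1, option)] / MIN_PER_MONTH[(4, option)]


def option_spread(threads):
    """Ratio of slowest to fastest compile option at a fixed thread count."""
    times = [t for (n, _), t in MIN_PER_MONTH.items() if n == threads]
    return max(times) / min(times)


for option in ("-O2,-qstrict", "-O3,-qstrict", "-O3"):
    print(f"{option}: threading gain {threading_gain(option):.2f}")
for threads in (1, 4):
    print(f"{threads} thread(s): option spread {option_spread(threads):.2f}")
```

Threading changes the run time by a factor of roughly 2.7 to 3.1, while the choice of compile options changes it by a factor of at most 1.25.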


Tests with 16 CPUs (compile options: -O2,-qstrict):

Test name  Threads  cpl  atm  ocn  ice  lnd  Avg. time per month
cpu16.1    4        1    8    4    1    2    14 min
cpu16.2    8        1    8    4    1    2    9 min
cpu16.9    12       1    8    4    1    2    stops after 9 months
cpu16.10   16       1    8    4    1    2    stops after 8 months
cpu16.3    4        1    8    4    2    1    13 min
cpu16.12   8        1    8    4    2    1    stops after 7 months
cpu16.4    4        1    8    5    1    1    13 min
cpu16.5    4        1    12   1    1    1    12 min
cpu16.11   12       1    12   1    1    1    12 min (but ranging 7-17 min)
cpu16.6    4        1    10   3    1    1    atm = 10 CPUs never works
cpu16.7    4        1    6    7    1    1    16 min
cpu16.8    4        1    4    9    1    1    16 min
cpu16.13   8        2    8    2    2    2    9 min
mrenold    6        1    6    6    2    1    16 min
cpu16.14   6        1    6    6    1    2    15 min
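
To relate these timings to overall throughput, the average time per simulated month can be converted into simulated years per wall-clock day. The sketch below does this for a few rows and checks that each CPU layout adds up to 16. The values are copied from the table, and the function names are illustrative. The conversion assumes the model runs uninterrupted at the quoted average rate, with no allowance for queue waiting time or restarts.

```python
MINUTES_PER_DAY = 24 * 60
MONTHS_PER_YEAR = 12


def layout_total(cpl, atm, ocn, ice, lnd):
    """Total number of CPUs used by a component layout."""
    return cpl + atm + ocn + ice + lnd


def simulated_years_per_day(minutes_per_month):
    """Simulated years per wall-clock day at a given average time per month."""
    return MINUTES_PER_DAY / (MONTHS_PER_YEAR * minutes_per_month)


# (cpl, atm, ocn, ice, lnd) layout and avg. minutes per month, from the table.
RUNS = {
    "cpu16.1": ((1, 8, 4, 1, 2), 14),
    "cpu16.2": ((1, 8, 4, 1, 2), 9),
    "cpu16.5": ((1, 12, 1, 1, 1), 12),
    "cpu16.13": ((2, 8, 2, 2, 2), 9),
}

for name, (layout, minutes) in RUNS.items():
    # Every 16-CPU test should distribute exactly 16 CPUs.
    assert layout_total(*layout) == 16
    print(f"{name}: {simulated_years_per_day(minutes):.1f} simulated years per day")
```

Under these assumptions, 12 min per month corresponds to 10 simulated years per day, and 9 min per month to about 13.3.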

Comparison of PUMA-2 and CCSM-Paleo 1.4

CCSM-Paleo

  Advantages:
  • comprehensive model
  • coupled without flux corrections

  Disadvantages:
  • computationally expensive, but ~10 simulated years per day is attainable

PUMA-2

  Advantages:
  • low computational costs

  Disadvantages:
  • not yet coupled; coupling will presumably require flux correction
  • poor climatology (esp. precipitation; NAO centers are displaced eastwards)