Performance Notes
Keith ran a set of simulations with ctsm1.0.dev105 in July 2020 to obtain cost and performance numbers. The results are shown below.
Keith Oleson did an analysis in April 2018 of the cost and performance of CLM5 for various compsets and resolutions (1deg and 4x5). Three ensembles of runs at 1deg and one ensemble of runs at 4x5 were conducted. Here are the results of that analysis.
The first table below shows Cost and LND run time for various configurations and resolutions at the default setting for NTASKS (60 for 1deg, 50 for 2deg, 16 for 4x5), with one ensemble member for each.
The second table/plot shows Cost and TOT (total), ATM, LND run time for the BGC/Crop configuration at 1deg for various NTASKS. The values shown are averages over three ensembles of runs.
The third table/plot shows Cost and TOT (total), ATM, LND run time for the BGC configuration at 4x5 for various NTASKS (one ensemble member).
Bill Sacks did an analysis on November 12, 2015 of cost increases from CLM4 to what was then the out-of-the-box version of CLM5. Here are the results of that analysis:
In clm4_5_3_r149, CLM4CN took 0.328 sec/mday, whereas the default CLM5 configuration, CLM50%BGC-CROP with CISM1 (glc_mec), took 4.002 sec/mday. This is an increase of 12.2x. This can be broken down as follows:
(Timings are 'LND Run' times from 20-day runs with no output (REST_OPTION=never, but no other changes, so hbuf is still activated), at f09_g16 with 600 tasks and 1 thread (except datm, which had 30 tasks, with ROOTPE_ATM=600), cold start 1850 runs. Unless stated otherwise, timings were done from clm4_5_3_r149.)
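The headline ratio can be reproduced directly from the two timings. The sketch below also converts sec/mday into simulated years per wall-clock day, assuming that "mday" means one simulated model day and that the model year has 365 days; the helper name and the conversion are illustrative and were not part of the original analysis. Because only 'LND Run' time is counted, the result is an upper bound on land-only throughput, not total model throughput.

```python
SECONDS_PER_WALLCLOCK_DAY = 86400
DAYS_PER_MODEL_YEAR = 365  # assumption: no-leap calendar


def lnd_years_per_day(sec_per_mday):
    """Simulated years per wall-clock day, counting LND run time only.

    Hypothetical helper: assumes sec/mday is wall-clock seconds per
    simulated model day.
    """
    return SECONDS_PER_WALLCLOCK_DAY / (sec_per_mday * DAYS_PER_MODEL_YEAR)


# 'LND Run' timings quoted above (sec/mday), both from clm4_5_3_r149
clm4cn = 0.328
clm5_default = 4.002  # CLM50%BGC-CROP with CISM1 (glc_mec)

ratio = clm5_default / clm4cn
print(f"cost increase: {ratio:.1f}x")
print(f"CLM4CN land-only throughput: {lnd_years_per_day(clm4cn):.1f} years/day")
print(f"CLM5 land-only throughput:   {lnd_years_per_day(clm5_default):.1f} years/day")
```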
- As of clm4_0_60 (clm4.5 first brought to trunk), CLM45CN was 1.89x the cost of CLM4CN. (Note: For the clm4_0_60 run with CLM45CN, I used the surface dataset created in clm4_0_80; earlier f09 surface datasets did not appear to be set up properly for CLM4.5.)
- The addition of the fire model (in clm4_0_80) made CLM45CN performance significantly worse, mainly due to the cost of reading the lightning stream. This seems to be the main factor responsible for the cost increase between clm4_0_60 and the cesm1.2.0 release (roughly clm4_5_07); however, it may not have been the sole factor. The cost increase for a CLM45CN run between clm4_0_60 and cesm1.2.0 was 1.45x. However, this is not an apples-to-apples comparison, because there were likely Machines and other changes between these two points.
- Adding the additional memory needed for dynamic landunits increased the cost by about 1.1x for non-crop runs (this was done in clm4_5_43).
  - Update: the 1.1x number was generated from a CLM45BGC run, I think, so may be an overestimate for CLM45CN
- As of clm4_5_3_r149, CLM45CN is 2.70x the cost of CLM4CN. This is slightly less than you would get by multiplying the above numbers (1.89*1.45*1.1 = 3.01).
- As of clm4_5_3_r149, CLM45BGC is a further 1.35x the cost of CLM45CN
- CLM50BGC is only 1.074x the cost of CLM45BGC, if using the old vertical soil layer structure
- The new vertical soil layer structure (CLM5 default) increases the cost by 1.31x
- Adding crop (CLM50%BGC-CROP relative to CLM50%BGC) increases the cost by 2.27x. Much of this is due to 0-weight (inactive) crop columns, which are added for the sake of dynamic landunits. If you just allocate memory for non-zero-weight crop columns, the cost increase due to crop is more like 1.5x.
- Adding glc_mec increases the cost by 1.045x
- From a first pass, there do not appear to have been other major contributors to the cost increase. In particular, it appears that the major refactorings done to CLM45 in the last couple of years have NOT had a significant performance impact. (However, my rough analysis could have missed changes, particularly if there were some increases compensated for by other decreases.)
Note that (by construction), multiplying the above factors, we get 2.70*1.35*1.074*1.31*2.27*1.045 = 12.2, which is the total cost increase from CLM4CN to the present default configuration.
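The multiplication is easy to check in a few lines, and the same structure allows simple what-if questions. The following is a minimal sketch that uses only the factors quoted in the breakdown; the dictionary keys are labels chosen here for illustration, and the 1.5x what-if is the rough figure given for allocating only non-zero-weight crop columns, not a measured configuration.

```python
from math import prod

# Multiplicative cost factors quoted in the breakdown (labels are illustrative)
factors = {
    "CLM45CN vs CLM4CN": 2.70,
    "CLM45BGC vs CLM45CN": 1.35,
    "CLM50BGC vs CLM45BGC (old soil layers)": 1.074,
    "new vertical soil layer structure": 1.31,
    "crop": 2.27,
    "glc_mec": 1.045,
}

total = prod(factors.values())
print(f"CLM4CN -> default CLM5: {total:.1f}x")

# The three CLM45CN sub-steps (1.89, 1.45, about 1.1) multiply to slightly
# more than the 2.70x that was measured directly.
substeps = prod([1.89, 1.45, 1.1])
print(f"product of CLM45CN sub-steps: {substeps:.2f}x (measured: 2.70x)")

# What-if (assumption): crop costs only ~1.5x when memory is allocated
# just for non-zero-weight crop columns.
what_if = dict(factors, crop=1.5)
what_if_total = prod(what_if.values())
print(f"total with crop at 1.5x: {what_if_total:.1f}x")
```

Because the factors are multiplicative, changing one entry rescales the total by the ratio of the new factor to the old one, so the crop what-if lowers the total by 1.5/2.27.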
(From Dave Lawrence 2015-11-11)
- Lake model (6x more lake points, though some of these replace wetland: 2.5x more (lake+wetland))
- Methane (new model, could it be optimized?)
- Crop model (new capability will incur a cost, but could potentially be reduced)
- Nearly 2x more urban columns