Performance Notes

Bill Sacks edited this page Jan 25, 2018 · 10 revisions

Cost increases from CLM4 to CLM5

On November 12, 2015, Bill Sacks analyzed the cost increases from CLM4 to what was then the out-of-the-box version of CLM5. Here are the results of that analysis:

In clm4_5_3_r149, CLM4CN took 0.328 sec/mday (seconds per model day), whereas the default CLM5 configuration, CLM50%BGC-CROP with CISM1 (glc_mec), took 4.002 sec/mday. This is an increase of 12.2x.

Timings are 'LND Run' times from 20-day, cold-start 1850 runs at f09_g16 with no output (REST_OPTION=never, but no other changes, so hbuf is still activated). The runs used 600 tasks and 1 thread, except for datm, which had 30 tasks with ROOTPE_ATM=600. Unless stated otherwise, timings were done from clm4_5_3_r149.

The increase can be broken down as follows:

  • As of clm4_0_60 (when CLM4.5 was first brought to the trunk), CLM45CN was 1.89x the cost of CLM4CN. (Note: For the clm4_0_60 run with CLM45CN, I used the surface dataset created in clm4_0_80; earlier f09 surface datasets did not appear to be set up properly for CLM4.5.)

  • The addition of the fire model (in clm4_0_80) made CLM45CN performance significantly worse, mainly due to the cost of reading the lightning stream. This seems to be the main factor responsible for the cost increase between clm4_0_60 and the cesm1.2.0 release (roughly clm4_5_07), though it may not have been the sole factor. The cost increase for a CLM45CN run between clm4_0_60 and cesm1.2.0 was 1.45x. This is not an apples-to-apples comparison, however, because there were likely Machines and other changes between these two points.

  • Adding the additional memory needed for dynamic landunits increased the cost by about 1.1x for non-crop runs (this was done in clm4_5_43).

    • Update: I believe the 1.1x number was generated from a CLM45BGC run, so it may be an overestimate for CLM45CN.
  • As of clm4_5_3_r149, CLM45CN is 2.70x the cost of CLM4CN. This is slightly less than you would get by multiplying the above numbers (1.89*1.45*1.1 = 3.01).

  • As of clm4_5_3_r149, CLM45BGC is a further 1.35x the cost of CLM45CN

  • CLM50BGC is only 1.074x the cost of CLM45BGC, if using the old vertical soil layer structure

  • The new vertical soil layer structure (CLM5 default) increases the cost by 1.31x

  • Adding crop (CLM50%BGC-CROP relative to CLM50%BGC) increases the cost by 2.27x. Much of this is due to 0-weight (inactive) crop columns, which are added for the sake of dynamic landunits. If you just allocate memory for non-zero-weight crop columns, the cost increase due to crop is more like 1.5x.

  • Adding glc_mec increases the cost by 1.045x

  • From a first pass, there do not appear to have been other major contributors to the cost increase. In particular, it appears that the major refactorings done to CLM45 in the last couple of years have NOT had a significant performance impact. (However, my rough analysis could have missed changes, particularly if there were some increases compensated for by other decreases.)

Note that, by construction, multiplying the above factors gives 2.70*1.35*1.074*1.31*2.27*1.045 = 12.2, which is the total cost increase from CLM4CN to the present default configuration.
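The arithmetic in this breakdown can be checked with a short script. The following is a minimal sketch, not part of the original analysis: the variable names and factor labels are made up for illustration, and the numbers are copied from the text above. It compares the measured ratio with the chained product, repeats the earlier 1.89*1.45*1.1 estimate, and shows what the total would be under the hypothetical case where crop costs about 1.5x (memory allocated only for non-zero-weight crop columns).

```python
from math import prod

# Measured 'LND Run' timings quoted in the text, in seconds per model day.
CLM4CN_SEC_PER_MDAY = 0.328
CLM5_DEFAULT_SEC_PER_MDAY = 4.002

# Multiplicative cost factors quoted in the list above.
# The dictionary keys are illustrative labels, not CLM identifiers.
FACTORS = {
    "CLM45CN vs CLM4CN": 2.70,
    "CLM45BGC vs CLM45CN": 1.35,
    "CLM50BGC vs CLM45BGC (old soil layers)": 1.074,
    "new vertical soil layer structure": 1.31,
    "crop": 2.27,
    "glc_mec": 1.045,
}

# Ratio computed directly from the two measured timings.
measured = CLM5_DEFAULT_SEC_PER_MDAY / CLM4CN_SEC_PER_MDAY

# Product of the individual factors; should agree with `measured`
# up to the rounding of the quoted factors.
chained = prod(FACTORS.values())

# Naive estimate for CLM45CN obtained by multiplying the three
# earlier increments (clm4_0_60, fire model, dynamic landunit memory).
naive_clm45cn = 1.89 * 1.45 * 1.1

# Hypothetical: same chain, but with the ~1.5x crop cost mentioned for
# allocating only non-zero-weight crop columns.
active_crop_only = prod({**FACTORS, "crop": 1.5}.values())

print(f"measured ratio:         {measured:.2f}x")
print(f"chained product:        {chained:.2f}x")
print(f"naive CLM45CN estimate: {naive_clm45cn:.2f}x (measured: 2.70x)")
print(f"total with crop ~1.5x:  {active_crop_only:.2f}x")
```

The chained product comes out slightly below the measured ratio because each quoted factor is rounded; both round to 12.2x.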