Running ufs s2s model regression test using rt.sh

Jun Wang edited this page May 13, 2020 · 7 revisions

Minsuk Ji, Jun Wang, Dusan Jovic
google slide

Shell script-based Regression Test: rt.sh

  • Set of simple shell script files and input files
  • Also used by ufs-weather-model
  • Build + Run
  • Run only (workaround)
  • Supports Rocoto and ecFlow workflow managers
  • Currently supports Hera and Orion
  • ./rt.sh -f to run full regression test
  • ./rt.sh -c to create new baseline
  • Regression test root directory: ufs-s2s-model/tests

rt.sh related Files

rt.sh calls:

  • detect_machine.sh
  • compile.sh, compile.sh calls GNUmakefile
  • run_test.sh, run_test.sh calls rt_fv3.sh, rt_fv3.sh calls rt_utils.sh
  • run_test.sh uses the following input files:
    • rt.conf
    • default_vars.sh
    • <test-name>
    • <run-setup-name>

rt.sh related Files (continued)

  • detect_machine.sh: detects the machine, assigns the machine name, and sets the account (default: nems)
  • compile.sh: build a model using GNUmakefile (make app=*)
  • Run regression test
    • run_test.sh: sets environment variables, run directory, etc., and calls rt_fv3.sh
    • rt_fv3.sh: prepares a canned case in the run directory, and calls rt_utils functions
    • rt_utils.sh: contains utility functions, e.g.,
      • submit_and_wait
      • check_results
      • rocoto_create_compile_task, rocoto_create_run_task, rocoto_run, rocoto_kill
      • ecflow_create_compile_task, ecflow_create_run_task, ecflow_run, ecflow_kill

Input Files: 1) rt.conf

  • rt.conf: specify compile and run cases corresponding to ../compsets/all.input
  • COMPILE: specify <appBuilder-name> located in ../
  • RUN: specify <test-name> located in tests/
  • Each row is processed sequentially
  • With workflow managers, each RUN depends on the preceding COMPILE. Currently, only one COMPILE runs at a time (cf. ufs-weather-model)
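
The ordering rule above (a RUN row needs a COMPILE row before it) can be checked before starting a long test. The following is a minimal sketch and not part of rt.sh. It assumes only that rows begin with the keyword COMPILE or RUN and that lines starting with `#` are comments; the helper name check_rt_conf is hypothetical.

```shell
#!/bin/sh
# check_rt_conf FILE
# Hypothetical helper (not part of rt.sh): verifies that every RUN row
# is preceded by at least one COMPILE row.
# Assumption: rows start with the keyword COMPILE or RUN; blank lines and
# lines starting with '#' are ignored.
check_rt_conf() {
  awk '
    /^[ \t]*(#|$)/   { next }                 # skip comments and blank lines
    /^[ \t]*COMPILE/ { compiled = 1; next }   # remember that a build was requested
    /^[ \t]*RUN/ {
      if (!compiled) {
        printf "line %d: RUN without preceding COMPILE\n", NR
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  ' "$1"
}
```

A possible use is `check_rt_conf rt.conf && ./rt.sh -f`, so that rt.sh only starts when the file passes the check.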

Input Files: 2) tests/<test-name>

  • Two levels to set simulation parameters
    • default_vars.sh sets default values, similar to cpl_defaults in ../compsets/fv3mom6cice5.input
    • <test-name> overrides default values, adds test-specific parameters, e.g.,
      • SYEAR=2013, FHMAX=24, FDIAG=6, WLCLK=30
  • Set environment variables that are passed onto various template files in ../parm/
    • input.*.nml.IN
    • nems.configure.*.IN
    • model_configure.IN
  • Specify configuration templates to use, e.g.,
    • INPUT_NML="input.mom6_ccpp.nml.IN"
    • NEMS_CONFIGURE="nems.configure.med_atm_ocn_ice_wav.IN"
    • FV3_RUN="cpld_fv3_mom6_cice_atm_flux_run.IN"
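
The two-level scheme above can be illustrated with a short sketch. It inlines both levels as shell functions so that it runs on its own; in the actual setup they would live in default_vars.sh and tests/<test-name>. The default values and the default template name are placeholders chosen for illustration, not the real defaults.

```shell
#!/bin/sh
# Sketch of the two-level parameter scheme: defaults first, then the
# test-specific file, which overrides them.

# Level 1: defaults (placeholder values, NOT the real default_vars.sh)
load_defaults() {
  SYEAR=2000
  FHMAX=6
  FDIAG=3
  WLCLK=15
  INPUT_NML="input.default.nml.IN"   # hypothetical template name
}

# Level 2: test-specific settings (values from the examples in the text)
load_test() {
  SYEAR=2013
  FHMAX=24
  FDIAG=6
  WLCLK=30
  INPUT_NML="input.mom6_ccpp.nml.IN"
  NEMS_CONFIGURE="nems.configure.med_atm_ocn_ice_wav.IN"
  FV3_RUN="cpld_fv3_mom6_cice_atm_flux_run.IN"
}

load_defaults
load_test   # assigned last, so the test file has the final word

# Export so that later steps (e.g., template processing) can see the values
export SYEAR FHMAX FDIAG WLCLK INPUT_NML NEMS_CONFIGURE FV3_RUN

echo "SYEAR=${SYEAR} FHMAX=${FHMAX} FDIAG=${FDIAG} WLCLK=${WLCLK}"
```

Because the test-level assignments run after the defaults, a test file only needs to list the parameters it wants to change or add.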

Input Files: 3) fv3_conf/<run-setup-name>

  • Set up input data, grid data, etc. by copying files from baseline directory to run directory
  • Baseline directory contains
    • Subdirectories for input data (e.g., CICE_IC, MOM6_IC, FV3_input_data, MEDIATOR_ccpp)
    • Subdirectories for previous run results (e.g., RT-Baselines_2d_warm_ccpp384)
  • Make sure directories and files exist in RTPWD

Default Directories specified in rt.sh

  • Baseline directory (RTPWD)
    • Hera: /scratch1/NCEPDEV/nems/emc.nemspara/RT/FV3-MOM6-CICE5/develop-YYYYMMDD
    • Orion: /work/noaa/stmp/jminsuk/RT/FV3-MOM6-CICE5/develop-20200504 (temporary)
  • Run directory root (RUNDIR_ROOT)
    • Hera: /scratch1/NCEPDEV/stmp2/${USER}/S2S_RT/rt_$$
    • Orion: /work/noaa/stmp/${USER}/stmp/${USER}/S2S_RT/rt_$$
    • RUNDIR=${RUNDIR_ROOT}/${TEST_NAME}
  • New baseline directory (NEW_BASELINE)
    • Hera: /scratch1/NCEPDEV/stmp4/${USER}/S2S_RT/REGRESSION_TEST_INTEL
    • Orion: /work/noaa/stmp/${USER}/stmp/${USER}/S2S_RT/REGRESSION_TEST_INTEL
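
The defaults listed above could be expressed as a per-machine case statement. This is a sketch only: the function name set_rt_dirs, the machine IDs hera.intel and orion.intel (inferred from the log file names), and the test name my_test are assumptions, and rt.sh itself may organize this differently.

```shell
#!/bin/sh
# set_rt_dirs MACHINE_ID
# Hypothetical helper that mirrors the default directories listed above.
set_rt_dirs() {
  case "$1" in
    hera.intel)
      # YYYYMMDD is a placeholder for the baseline date, kept as in the text
      RTPWD=/scratch1/NCEPDEV/nems/emc.nemspara/RT/FV3-MOM6-CICE5/develop-YYYYMMDD
      RUNDIR_ROOT=/scratch1/NCEPDEV/stmp2/${USER}/S2S_RT/rt_$$
      NEW_BASELINE=/scratch1/NCEPDEV/stmp4/${USER}/S2S_RT/REGRESSION_TEST_INTEL
      ;;
    orion.intel)
      RTPWD=/work/noaa/stmp/jminsuk/RT/FV3-MOM6-CICE5/develop-20200504
      RUNDIR_ROOT=/work/noaa/stmp/${USER}/stmp/${USER}/S2S_RT/rt_$$
      NEW_BASELINE=/work/noaa/stmp/${USER}/stmp/${USER}/S2S_RT/REGRESSION_TEST_INTEL
      ;;
    *)
      echo "unsupported machine: $1" >&2
      return 1
      ;;
  esac
}

set_rt_dirs hera.intel
TEST_NAME=my_test                    # hypothetical test name
RUNDIR=${RUNDIR_ROOT}/${TEST_NAME}   # one subdirectory per test
echo "RUNDIR=${RUNDIR}"
```

The `$$` suffix expands to the process ID of the shell, so each invocation gets its own run directory root.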

Build

  • Triggered by COMPILE row in rt.conf with specified <appBuilder-name>
  • As in NEMSCompsetRun, build is done using GNUmakefile in ../NEMS/
  • compile.sh is a simple wrapper around GNUmakefile to interface with rt.sh
    • $ ./compile.sh coupledFV3_CCPP_MOM6_CICE
    • make app=coupledFV3_CCPP_MOM6_CICE build
  • If you prefer to build the executable separately (i.e., without rt.sh), place a copy in ufs-s2s-model/tests
    • $ cp ../NEMS/exe/NEMS.x fv3_mom6_cice_0.exe
    • $ cp ../NEMS/src/conf/modules.nems modules.fv3_mom6_cice_0
  • If you want to reuse your exe, keep a copy with a different name

rt.sh Usage

  • ./rt.sh: display usage information
  • ./rt.sh -c | -f | -l <file> | -m | -k | -r | -e | -h
    • -c: create baseline
    • -f: use rt.conf
    • -l <file>: use <file> instead of rt.conf
    • -m: compare against new baseline results
    • -k: keep run directory
    • -r: use Rocoto workflow manager
    • -e: use ecFlow workflow manager
    • -h: display help (same as ./rt.sh)
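
For readers who want to see how such flags are typically handled, here is a sketch using POSIX getopts. It is not taken from rt.sh: the function name and all variable names are hypothetical, and only the flag letters come from the list above.

```shell
#!/bin/sh
# parse_rt_opts "$@"
# Sketch of option handling for the flags listed above.
# All variable names are hypothetical; the real rt.sh may use others.
parse_rt_opts() {
  CREATE_BASELINE=false; TESTS_FILE=""; USE_NEW_BASELINE=false
  KEEP_RUNDIR=false; ROCOTO=false; ECFLOW=false
  OPTIND=1                      # reset so the function can be called again
  while getopts ":cfl:mkreh" opt; do
    case "$opt" in
      c) CREATE_BASELINE=true ;;
      f) TESTS_FILE="rt.conf" ;;
      l) TESTS_FILE="$OPTARG" ;;          # -l takes a file name
      m) USE_NEW_BASELINE=true ;;
      k) KEEP_RUNDIR=true ;;
      r) ROCOTO=true ;;
      e) ECFLOW=true ;;
      h) echo "usage: rt.sh -c | -f | -l <file> | -m | -k | -r | -e | -h"
         return 0 ;;
      :) echo "option -$OPTARG requires an argument" >&2; return 1 ;;
      \?) echo "unknown option: -$OPTARG" >&2; return 1 ;;
    esac
  done
}

parse_rt_opts -fek
echo "file=${TESTS_FILE} ecflow=${ECFLOW} keep=${KEEP_RUNDIR}"
```

Single-letter flags can be combined (as in -fek) because getopts processes each letter in a group separately.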

Run Full Regression Tests

  • If you make code changes that are not expected to change simulation results, you can run full regression tests afterward to demonstrate your changes do not break anything
  • Currently, there are 14 standard regression tests on Hera and Orion
  • In ufs-s2s-model/tests/ directory, use any one of the following:
    • $ ./rt.sh -f >output 2>&1 &
    • $ ./rt.sh -f -e (use ecFlow)
    • $ ./rt.sh -fr (use Rocoto)
    • $ ./rt.sh -fek (use ecFlow, keep run directory for post-run diagnosis)

Run a Single Regression Test

  • Create a file, say my_test.conf, with a single COMPILE and a single RUN
    • $ cp rt.conf my_test.conf
    • $ vi my_test.conf
    • $ ./rt.sh -l my_test.conf
  • Or make a copy of original rt.conf file
    • $ cp rt.conf rt.conf.orig
    • $ vi rt.conf
    • $ ./rt.sh -f
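
Instead of editing the copy by hand, the single-test file could be generated. The sketch below is an illustration under stated assumptions: rows start with COMPILE or RUN, fields are separated by `|`, and the test name is the second field. The helper name make_single_test_conf is hypothetical; check the layout of your rt.conf before relying on it.

```shell
#!/bin/sh
# make_single_test_conf SRC TEST_NAME DEST
# Hypothetical helper: writes to DEST the RUN row for TEST_NAME together
# with the COMPILE row that most recently precedes it in SRC.
# Assumption: '|' separates fields and the test name is the second field.
make_single_test_conf() {
  awk -F'|' -v test="$2" '
    /^[ \t]*COMPILE/ { compile = $0; next }   # remember the latest COMPILE row
    /^[ \t]*RUN/ {
      name = $2
      gsub(/^[ \t]+|[ \t]+$/, "", name)       # trim surrounding blanks
      if (name == test) {
        if (compile != "") print compile
        print $0
        found = 1
        exit                                  # jump to END
      }
    }
    END { exit (found ? 0 : 1) }              # non-zero if the test was not found
  ' "$1" > "$3"
}
```

A possible use is `make_single_test_conf rt.conf <test-name> my_test.conf && ./rt.sh -l my_test.conf`.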

Create a New Baseline of Existing Test

  • Use this when your code changes are expected to change simulation results (e.g., a physics change) and thus cannot be compared against the existing baseline results
  • You still need RTPWD as it contains the simulation input data
  • ./rt.sh -c -f OR ./rt.sh -c -l my_test.conf
    • rt.sh will copy input data from RTPWD to NEW_BASELINE
    • Simulation results will be copied from RUNDIR to NEW_BASELINE
  • For a warm start (i.e., a run that requires mediator files generated by a cold run)
    • Run the corresponding cold run -- this will generate NEW_BASELINE/MEDIATOR_*/
    • Use a new directory for NEW_BASELINE
    • Change RTPWD to old NEW_BASELINE directory, which contains input and MEDIATOR_*
    • ./rt.sh -c
  • Manually move your NEW_BASELINE to emc.nemspara

Run Regression Test against New Baseline

  • Use this after you have generated a new baseline
  • All your subsequent tests are then compared against the new baseline
  • ./rt.sh -m -f OR ./rt.sh -m -l my_test.conf
  • Internally, -m flag sets RTPWD=${NEW_BASELINE}

Add a New Test

  • Configuration files (select, or copy and modify):
    • rt.conf
    • tests/<test-name>
    • tests/fv3_conf/
    • ../parm/input.*.nml.IN
    • ../parm/nems.configure.*.IN
    • ../parm/model_configure.IN
    • ../parm/ice_in_template
    • ../parm/MOM_input_template
  • ./rt.sh -c -l my_test.conf
    • Will not compare with existing baselines
  • If your case requires new input data not in RTPWD, set RTPWD to your local directory

Already have an Executable File

  • Remove COMPILE row in rt.conf
  • $ cp ../NEMS/exe/NEMS.x fv3_mom6_cice_0.exe
  • $ cp ../NEMS/src/conf/modules.nems modules.fv3_mom6_cice_0
    • This module file needs to be identical to the one you used for build
  • $ ./rt.sh -f
  • This approach does not work with workflow managers because RUN depends on COMPILE

Output Files and Log Files for Diagnosis

  • Summary files
    • Hera: RegressionTests_hera.intel.log, Compile_hera.intel.log
    • Orion: RegressionTests_orion.intel.log, Compile_orion.intel.log
    • MISSING file, MISSING baseline, OK, NOT OK...
  • output: output of rt.sh when run as ./rt.sh >output 2>&1 &
  • Log files in log_hera.intel/ and log_orion.intel/
    • compile_*.log: output of compile.sh and GNUmakefile
    • run_*.log: output of run_test.sh
  • Run directory RUNDIR_ROOT/
    • .log: output of rt_fv3.sh. If Rocoto is used, it also contains err and out from the sbatch job
    • subdir: contains all files necessary for simulation, e.g., sbatch job_card
    • QUEUE is set to batch in rt.sh
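
A quick way to scan a summary file is to count the status strings mentioned above. The sketch below assumes that each comparison line ends with one of OK, NOT OK, MISSING file, or MISSING baseline; the exact line layout of the real log is not shown in this page, and the helper name summarize_rt_log is hypothetical.

```shell
#!/bin/sh
# summarize_rt_log FILE
# Hypothetical helper: counts result keywords in a regression test summary.
# Assumption: each comparison line ends with one of the status strings
# OK, NOT OK, MISSING file, MISSING baseline.
summarize_rt_log() {
  awk '
    /NOT OK[ \t]*$/           { not_ok++;    next }   # test before plain OK
    /MISSING baseline[ \t]*$/ { miss_base++; next }
    /MISSING file[ \t]*$/     { miss_file++; next }
    /OK[ \t]*$/               { ok++ }
    END {
      printf "OK=%d NOT_OK=%d MISSING_file=%d MISSING_baseline=%d\n",
             ok, not_ok, miss_file, miss_base
      exit (not_ok + miss_file + miss_base > 0)       # non-zero on any failure
    }
  ' "$1"
}
```

A possible use is `summarize_rt_log RegressionTests_hera.intel.log`; the exit status is non-zero when any line reports a failure or a missing file.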