GF radar reflectivity update for RRFS realtime runs #1913

Closed
wants to merge 6 commits into from

Conversation

haiqinli
Contributor

@haiqinli haiqinli commented Sep 21, 2023

PR Author Checklist:

  • [ V] I have linked PR's from all sub-components involved in section below.

  • [V ] I am confirming reviews are completed in ALL sub-component PR's.

  • [ V] I have run the full RT suite on either Hera/Cheyenne AND have attached the log to this PR below this line:

  • [V ] I have added the list of all failed regression tests to "Anticipated changes" section.

  • [V ] I have filled out all sections of the template.

Description

This is an urgent PR for the RRFS_A and RRFS_B realtime runs. It includes a bug fix for the convective precipitation unit in the GF radar reflectivity, a soil moisture bug fix and updates for the dust modules, and the C3 convection updates.

Linked Issues and Pull Requests

Associated UFSWM Issue to close

Subcomponent Pull Requests

Blocking Dependencies

Subcomponents involved:

  • AQM
  • CDEPS
  • CICE
  • CMEPS
  • CMakeModules
  • [ V] FV3
  • GOCART
  • HYCOM
  • MOM6
  • NOAHMP
  • WW3
  • stochastic_physics
  • none

Anticipated Changes

Input data

  • [ V] No changes are expected to input data.
  • Changes are expected to input data:
    • New input data.
    • Updated input data.

Regression Tests:

  • No changes are expected to any regression test.
  • [ V] Changes are expected to the following tests:

Because the GF, dust, and C3 schemes are updated, the results of the following RT cases change.

Tests affected by changes in this PR:

  • conus13km_control_gnu
  • conus13km_debug_intel
  • hrrr_control_debug_dyn32_phy32_intel
  • conus13km_control_intel
  • conus13km_radar_tten_debug_gnu
  • hrrr_control_debug_gnu
  • conus13km_debug_2threads_gnu
  • conus13km_radar_tten_debug_intel
  • hrrr_control_debug_intel
  • conus13km_debug_2threads_intel
  • hrrr_c3_intel
  • conus13km_debug_gnu
  • hrrr_control_debug_dyn32_phy32_gnu
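
For a quick view of how the affected tests split by compiler and build type, the names listed above can be grouped programmatically. This is a minimal sketch that assumes the trailing token of each test name is the compiler and that debug builds carry `_debug` in the name; both are conventions inferred from the list itself, not rules stated in this PR, and the helper names are hypothetical.

```python
from collections import defaultdict

# Test names copied from the "Anticipated Changes" section of this PR.
AFFECTED_TESTS = """
conus13km_control_gnu conus13km_debug_intel hrrr_control_debug_dyn32_phy32_intel
conus13km_control_intel conus13km_radar_tten_debug_gnu hrrr_control_debug_gnu
conus13km_debug_2threads_gnu conus13km_radar_tten_debug_intel hrrr_control_debug_intel
conus13km_debug_2threads_intel hrrr_c3_intel conus13km_debug_gnu
hrrr_control_debug_dyn32_phy32_gnu
""".split()


def group_by_compiler(tests):
    """Group test names by the token after the last underscore (assumed compiler)."""
    groups = defaultdict(list)
    for name in tests:
        compiler = name.rsplit("_", 1)[1]
        groups[compiler].append(name)
    return {compiler: sorted(names) for compiler, names in groups.items()}


def debug_tests(tests):
    """Return the tests whose names contain '_debug' (assumed debug builds)."""
    return sorted(name for name in tests if "_debug" in name)


if __name__ == "__main__":
    for compiler, names in sorted(group_by_compiler(AFFECTED_TESTS).items()):
        print(f"{compiler}: {len(names)} tests")
    print(f"debug builds: {len(debug_tests(AFFECTED_TESTS))} tests")
```

Grouping this way makes it easier to check a regression log against the expected failures one compiler at a time.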

Libraries

  • [ V] Not Needed
  • Needed
    • Create separate issue in JCSDA/spack-stack asking for update to library. Include library name, library version.
    • Add issue link from JCSDA/spack-stack following this item
Code Managers Log
  • This PR is up-to-date with the top of all sub-component repositories except for those sub-components which are the subject of this PR.
  • Move new/updated input data on RDHPCS Hera and propagate input data changes to all supported systems.
    • N/A

Testing Log:

  • RDHPCS
    • [V ] Hera
    • Orion
    • Hercules
    • Jet
    • Gaea
    • Cheyenne
  • WCOSS2
    • Dogwood/Cactus
    • Acorn
  • CI
    • Completed
  • opnReqTest
    • N/A
    • Log attached to comment

@grantfirl
Collaborator

@haiqinli Could you upload the file from logs/RegressionTests_hera.log to show that the expected tests fail? I don't think that the workflow.log file that you attached works for this purpose.

@haiqinli
Contributor Author

@haiqinli Could you upload the file from logs/RegressionTests_hera.log to show that the expected tests fail? I don't think that the workflow.log file that you attached works for this purpose.

@grantfirl I ran the regression tests with "nohup ./rt.sh -a acomp -r -k -l rt.conf", but the content of RegressionTests_hera.log is as follows. Has the command for running the regression tests been updated? Thank you very much.

Wed Sep 20 02:41:45 UTC 2023
Start Regression test

Testing UFSWM Hash: f692e6a
Testing With Submodule Hashes:
37cbb7d6840ae7515a9a8f0dfd4d89461b3396d1 AQM (v0.2.0-37-g37cbb7d)
2aa6bfbb62ebeecd7da964b8074f6c3c41c7d1eb CDEPS-interface/CDEPS (cdeps0.4.17-38-g2aa6bfb)
2ed3c05c3c515eb70af3a726ff392283af97c4a5 CICE-interface/CICE (CICE6.0.0-442-g2ed3c05)
c24fb5999efafffaa393b886e21780ab7fd3aa08 CMEPS-interface/CMEPS (cmeps_v0.4.1-1404-gc24fb59)
cabd7753ae17f7bfcc6dad56daf10868aa51c3f4 CMakeModules (v1.0.0-28-gcabd775)
fbad8390f585744c716ac5167927c629aae6bebd FV3 (remotes/origin/develop-radar)
6ea78fd79037b31a1dcdd30d8a315f6558d963e4 GOCART (sdr_v2.1.2.6-106-g6ea78fd)
35789c757766e07f688b4c0c7c5229816f224b09 HYCOM-interface/HYCOM (2.3.00-121-g35789c7)
be40a41360b2eaed31ae86582aa57e1cf41241d5 MOM6-interface/MOM6 (dev/master/repository_split_2014.10.10-9801-gbe40a4136)
569e354ababbde7a7cd68647533769a5c966468d NOAHMP-interface/noahmp (v3.7.1-303-g569e354)
97e6a63ebf9a9030fcdae6ad5cf85a0bc91fa37f WW3 (6.07.1-342-g97e6a63e)
1ee7cc9a8b5d5733b391127ca31059b497ecdea8 stochastic_physics (ufs-v2.0.0-181-g1ee7cc9)
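
The header above lists one submodule per line as a 40-character hash, a path, and a git describe string in parentheses. A minimal sketch for pulling those lines into a dictionary, for example to compare the hashes recorded in two logs, might look like the following. The function name and the shortened sample are illustrative, and the line layout is assumed from the excerpt above only.

```python
import re

# Matches lines such as "<40-hex hash> <path> (<git describe string>)".
SUBMODULE_LINE = re.compile(r"^\s*([0-9a-f]{40})\s+(\S+)\s+\((.+)\)\s*$")


def parse_submodule_hashes(log_text):
    """Return {submodule path: (full hash, describe string)} for matching lines."""
    hashes = {}
    for line in log_text.splitlines():
        match = SUBMODULE_LINE.match(line)
        if match:
            sha, path, describe = match.groups()
            hashes[path] = (sha, describe)
    return hashes


# Two lines copied from the log excerpt above, plus the non-matching header lines.
SAMPLE = """\
Testing UFSWM Hash: f692e6a
Testing With Submodule Hashes:
fbad8390f585744c716ac5167927c629aae6bebd FV3 (remotes/origin/develop-radar)
1ee7cc9a8b5d5733b391127ca31059b497ecdea8 stochastic_physics (ufs-v2.0.0-181-g1ee7cc9)
"""

if __name__ == "__main__":
    for path, (sha, describe) in parse_submodule_hashes(SAMPLE).items():
        print(path, sha[:7], describe)
```

Header lines that do not start with a full hash are skipped, so the same function can be applied to a whole log file.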

@haiqinli
Contributor Author

@grantfirl I reran the regression tests and got a RegressionTests_hera.log file with the failed cases. Thanks.
RegressionTests_hera.log

@grantfirl
Collaborator

@haiqinli

Thanks for uploading the file. It looks like several tests are failing to complete the run altogether. For example, for the hrrr_control_debug_intel test, if I check in its run directory and look at the err file, I see the following error:
141: An error occurred in ccpp_physics_run for group physics, block 1 and thread 1 (ntX= 1):
141: Detected size mismatch for variable GFS_Data(cdata%blk_no)%Sfcprop%smois in group physics before rrfs_smoke_wrapper_run, expected 3456 but got 288

I think this is related to your code changes, so I think that there is a bug that needs to be fixed.
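
When scanning many err files for this kind of failure, the mismatch line can be parsed mechanically. The sketch below is illustrative only: the regular expression is written against the single message quoted above, so the exact wording of other messages is an assumption, and `parse_size_mismatch` is a hypothetical helper name.

```python
import re

# Pattern written against the one error line quoted in this thread.
MISMATCH = re.compile(
    r"Detected size mismatch for variable (?P<variable>\S+) "
    r"in group (?P<group>\S+) before (?P<scheme>\w+), "
    r"expected (?P<expected>\d+) but got (?P<got>\d+)"
)


def parse_size_mismatch(line):
    """Return the fields of a size-mismatch message, or None if the line differs."""
    match = MISMATCH.search(line)
    if match is None:
        return None
    info = match.groupdict()
    info["expected"] = int(info["expected"])
    info["got"] = int(info["got"])
    return info


ERR_LINE = (
    "141: Detected size mismatch for variable "
    "GFS_Data(cdata%blk_no)%Sfcprop%smois in group physics before "
    "rrfs_smoke_wrapper_run, expected 3456 but got 288"
)

if __name__ == "__main__":
    info = parse_size_mismatch(ERR_LINE)
    print(info["variable"], info["scheme"])
    # A whole-number ratio can hint at a dimension that is declared differently
    # from the array actually passed in; it does not say which side is wrong.
    if info["expected"] % info["got"] == 0:
        print("expected / got =", info["expected"] // info["got"])
```

For the message above the expected size is an exact multiple of the size received. That is only an observation about the two numbers; the thread does not state which dimension was declared incorrectly.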

@haiqinli
Contributor Author

@grantfirl Thank you very much for looking into this. The rerun of the regression test with the dimension fix is done, and the log file is updated.
RegressionTests_hera.log

@grantfirl
Collaborator

@haiqinli Thanks, it looks like it's running fine now and the tests are failing as expected.

@jkbk2004
Collaborator

@grantfirl just want to confirm: this PR is a priority for the RRFS parallel runs, right? We may schedule it after merging #1910.

@grantfirl
Collaborator

@grantfirl just want to confirm: this PR is a priority for the RRFS parallel runs, right? We may schedule it after merging #1910.

Yes, I think that this is the highest priority CCPP PR.

@FernandoAndrade-NOAA
Collaborator

Hi @haiqinli, #1910 was merged in. Please go ahead and sync your weather model branch, all subcomponents, and resolve conflicts so we can begin testing. @BrianCurtis-NOAA @jkbk2004 the fv3 sub PR will need approvals if you can review and add any others to review as well.

@FernandoAndrade-NOAA FernandoAndrade-NOAA added the Baseline Updates Current baselines will be updated. label Sep 28, 2023
@jkbk2004
Collaborator

@haiqinli @grantfirl can you sync up the branch so we can work on this PR?

@haiqinli
Contributor Author

@jkbk2004 Sure, I will sync my PR. Thank you very much!

@haiqinli
Contributor Author

The weather model and branches have been synced. Thank you very much!

@grantfirl
Collaborator

@jkbk2004 Can you reopen this? It looks like it was accidentally closed.

@FernandoAndrade-NOAA FernandoAndrade-NOAA added the Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked. label Sep 28, 2023
@FernandoAndrade-NOAA
Collaborator

ORTs are being run manually on Hera for now; Jenkins CI will likely run out of space.

@FernandoAndrade-NOAA
Collaborator

FernandoAndrade-NOAA commented Sep 28, 2023

@haiqinli

Thanks for uploading the file. It looks like several tests are failing to complete the run altogether. For example, for the hrrr_control_debug_intel test, if I check in its run directory and look at the err file, I see the following error:
141: An error occurred in ccpp_physics_run for group physics, block 1 and thread 1 (ntX= 1):
141: Detected size mismatch for variable GFS_Data(cdata%blk_no)%Sfcprop%smois in group physics before rrfs_smoke_wrapper_run, expected 3456 but got 288

I think this is related to your code changes, so I think that there is a bug that needs to be fixed.

@haiqinli Thanks, it looks like it's running fine now and the tests are failing as expected.

I'm running into this issue on Gaea during baseline recreation for the following tests, with run directories available at
/lustre/f2/scratch/Fernando.Andrade-maldonado/FV3_RT/rt_4916/:
hrrr_control_debug_intel
hrrr_control_debug_dyn32_phy32_intel
conus13km_debug_intel
conus13km_radar_tten_debug_intel

@FernandoAndrade-NOAA
Collaborator

Jet BL creation has also failed with similar errors in
/lfs4/HFIP/h-nems/Fernando.Andrade-maldonado/RT_RUNDIRS/Fernando.Andrade-maldonado/FV3_RT/rt_163280/

@haiqinli
Contributor Author

@FernandoAndrade-NOAA I am so sorry that the dimension fix was not committed properly. It is updated and should work now. Thank you very much for your help!

SamuelTrahanNOAA added a commit to SamuelTrahanNOAA/ufs-weather-model that referenced this pull request Sep 29, 2023
@jkbk2004
Collaborator

As this PR was combined into #1893, let's hold the tests on this branch. I think we should be able to merge #1893 by Monday. Sorry for all the cross-communication.

jkbk2004 pushed a commit that referenced this pull request Oct 3, 2023
…sics, and string length mismatch in dycore (plus PR #1913, #1917, and #1926) (#1893)

* GFDL_atmos_cubed_sphere: consistent string lengths in array

* stop FV3_HRRR_c3 from crashing with gnu debug

* 1hr forecast limit for conus13km_debug_qr

* fv3atm: bug fix from Dusan to recover_fields crash

* disable conus13km_debug_qr_gnu due to 25% failure rate on Hera

* FV3 dycore: initialize srf_wnd_var2 and tracers_var3 arrays

* enable conus13km_debug_qr_gnu

* Fix race condition in GFS_phys_time_vary.fv3.F90 error detection

* More bug fixes to GFS_phys_time_vary.fv3.F90:
1. detect empty errmsg from subroutines
2. Initialize err variables in set_soilveg.f, which is called from GFS_phys_time_vary.fv3.F90

* ccpp-physics: initialize errmsg & errflg in noahmp_tables.f90

* ccpp-physics: only read h2odata, ozdata and noahmp table when they are needed

* "point to the dimension fix of smc for dust emission"

* FV3: more dycore bug fixes from GFDL_atmos_cubed_sphere PR 285

* merge #1926

* merge GFDL_atmos_cubed_sphere #276

* bugfix: 12hr hrrr tests

* add GAEA rocoto support

* fv3: merge GF radar fixes

* check that baseline directory exists and is non-empty

---------

Co-authored-by: Haiqin.Li <[email protected]>
@jkbk2004
Collaborator

jkbk2004 commented Oct 3, 2023

Merged with #1893

@jkbk2004 jkbk2004 closed this Oct 3, 2023
Labels
Baseline Updates Current baselines will be updated. Ready for Commit Queue The PR is ready for the Commit Queue. All checkboxes in PR template have been checked.

5 participants