[3pt] PR: New unit inundation and run benchmark tests tool #283

Merged
merged 54 commits into from
Mar 20, 2024

RobHanna-NOAA
Collaborator

@RobHanna-NOAA RobHanna-NOAA commented Feb 12, 2024

This PR adds a new tool, run_unit_benchmark_tests.py, which can take a single unit and:

  • pull down from S3 the exact HUC related benchmark files.
  • use the benchmark data and the geocurves from the unit to create inundation files.
  • use the new inundation files, the unit extent files, the benchmark extent files and the benchmark rasters as inputs to the GVAL engine to get benchmark test results and create a unit/version metrics file.
  • roll the newly created metrics files up into an "environment"-level metrics file (not per unit and version), which holds all metrics for all runs against that environment, including earlier versions. An "environment" is either "PROD" or "DEV" and is designed to include all units and versions in that environment. This allows any combination of comparisons, such as units by code version, a unit against its own versions, a benchmark comparison across multiple units, etc.
    • Fields available in the environment metrics, allowing for various comparisons, include:

      • unit_name: e.g. 12040101
      • unit_version: e.g. 230922
      • code_version: e.g. v.1.29.0 or v2.0.1
      • huc
      • benchmark_source and a wide range of statistics, magnitudes and other criteria.
    • Very Important Note: The new benchmark tool cannot retrieve a master copy of an environment metrics file. You are encouraged to keep the master copy in S3 and download it to the appropriate folder, e.g. C:\ras2fim_data\gval\evaluations\PROD\eval_PROD_metrics.csv, before processing new benchmark data. The environment metrics file is updated as each unit is processed. After processing the benchmark data for one or more units, save it back to S3 manually. It is best to sync the entire environment folder recursively, including the metrics file.
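
The roll-up into an environment metrics file could look roughly like the sketch below. It is not the code in run_unit_benchmark_tests.py: the function name roll_up_metrics is hypothetical, and treating the listed fields as the row identity is an assumption made for illustration. Rows already present for the same unit, version, code version, HUC and benchmark source are replaced, so re-running a unit does not create duplicates.

```python
import csv
from pathlib import Path

# Field names come from the list above; using them together as the row
# identity is an assumption, not something the tool is known to do.
KEY_FIELDS = ("unit_name", "unit_version", "code_version", "huc", "benchmark_source")


def roll_up_metrics(unit_csv, env_csv):
    """Merge one unit/version metrics CSV into an environment metrics CSV.

    Existing environment rows with the same key as any incoming row are
    dropped first, then all incoming rows are appended. Returns the number
    of rows written to the environment file.
    """
    unit_csv, env_csv = Path(unit_csv), Path(env_csv)

    with unit_csv.open(newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        unit_rows = list(reader)

    env_rows = []
    if env_csv.exists():
        with env_csv.open(newline="") as f:
            reader = csv.DictReader(f)
            # Keep any columns that only exist in the environment file.
            for name in reader.fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)
            env_rows = list(reader)

    def key(row):
        return tuple(row.get(k, "") for k in KEY_FIELDS)

    incoming = {key(r) for r in unit_rows}
    merged = [r for r in env_rows if key(r) not in incoming] + unit_rows

    env_csv.parent.mkdir(parents=True, exist_ok=True)
    with env_csv.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(merged)
    return len(merged)
```

Because the environment file is rewritten in place on every call, this pattern also shows why the master copy should be downloaded from S3 before processing and synced back afterwards.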

NOTE: Due to time constraints, only the minimum command-line arguments have been tested. More tests using non-default arguments and combinations of arguments are needed. See Issue 294.

Some new symbology .lyr and .qml files have been added to make it easier to view the output agreement rasters and other files.

During testing, an unrelated bug was found relating to step 5 and HUCs failing due to boundary conditions. A temporary fix is included that adds a test to gracefully abort if a HUC meets the failing conditions.

There are a few other misc fixes and enhancements which include:

  • Addition of a stage_m column/data to the output csv's, as it was needed in the final geo_curves for inundation to work.
  • Another new tool, s3_get_unit_inputs.py: with minimum arguments, a user can call the script with just a HUC number and the script will go to S3 and download all of the input files ras2fim requires to run for that HUC. Where relevant, it downloads HUC-specific files only, such as dems\ras_3dep_HUC8_10m\HUC8_[]_dem.tif. A config file determines which files to download, and by default files are downloaded only if they do not already exist on the file system.
  • Some updates to the README.md and INSTALL.md for v2 notes have been included. Due to time constraints, the documents are not fully updated.
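
The "download only what is missing" behavior described for s3_get_unit_inputs.py can be sketched as below. This is an illustration only: the actual format of the config file is not shown in this PR, so the "[]" placeholder for the HUC number (mirroring HUC8_[]_dem.tif above), the "#" comment lines, and the function names plan_downloads and fetch_missing are all assumptions. The S3 call itself is passed in as a callable so the sketch does not guess at any S3 API usage.

```python
from pathlib import Path
from typing import Callable, Iterable, List


def plan_downloads(templates: Iterable[str], huc: str, local_root: Path) -> List[str]:
    """Return the relative paths that do not yet exist under local_root.

    Each template line may contain "[]" as a stand-in for the HUC number
    (an assumed convention). Blank lines and "#" comments are skipped.
    """
    missing = []
    for line in templates:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Substitute the HUC and normalize Windows-style separators.
        rel_path = line.replace("[]", huc).replace("\\", "/")
        if not (local_root / rel_path).exists():
            missing.append(rel_path)
    return missing


def fetch_missing(
    templates: Iterable[str],
    huc: str,
    local_root,
    download: Callable[[str, Path], None],
) -> List[str]:
    """Call download(rel_path, target) for each missing file.

    `download` is a caller-supplied function (hypothetical here) that
    copies one object from S3 to `target`. Returns the paths fetched.
    """
    local_root = Path(local_root)
    fetched = plan_downloads(templates, huc, local_root)
    for rel_path in fetched:
        target = local_root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        download(rel_path, target)
    return fetched
```

Separating the planning step from the fetch step makes it easy to preview what a run would download before touching S3.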

Note: The environment.yml file has been updated, so please remove and recreate the ras2fim conda environment. See past PR details in the CHANGELOG.md for information about how to do this.

There are also TODOs in many of the files in the tools directory related to this PR that need to be addressed. Some are higher priority than others.

Closes Issue 262, Issue 273 and Issue 312

Additions

  • git-ignore: Adjust to allow new symbology files.
  • config:
    • A number of new symbology files have been added in a subfolder of config.
    • s3_unit_download_files.lst: To support the new tool s3_get_unit_inputs.py mentioned above.
  • tools
    • run_unit_benchmark_tests.py: New tool as described above.
    • s3_get_unit_inputs.py: New tool as described above.

Files Renamed:

  • tools
    • evaluate_ras2fim_unit.py was renamed from evaluate_ras2fim_model.py, which was incorrectly named. Function and variable names were also updated to refer to units rather than models, since models are data from RRASSLER. Other updates include enhanced tracing and error handling.

Removed:

  • doc
    • conda_src.png: No longer relevant after changes to the README.md and INSTALL.md.

Changes

  • README.md: Text changes to reflect new V2 functionality.
  • config
    • source_codes.csv: Updated to include all 5 benchmark source codes and descriptions.
  • doc
    • INSTALL.MD: Partial text changes to reflect new V2 functionality.
  • environment.yml: Updated to reflect package changes.
  • src
    • clip_dem_from_shape.py: Linting fix.
    • create_fim_rasters.py: Enhanced output/logging.
    • create_geocurves.py:
      • text and error handling updates
      • Fixes for various error conditions such as assuming that inundation files exist, changing the projection for the geometry column in the output geocurves csv files, boundary condition issues, and adjustments to multi-proc.
    • create_model_domain_polygons.py: linting fix.
    • create_rating_curves.py: linting fix, plus adding new stage_m column for ras2inundation.py script.
    • create_shapes_from_hecras.py: fixed a small multi-proc issue, text and tracing upgrades, and upgrades to the bad_models_ls system. The earlier edition included the model folder time stamp in the name, which would have created problems if newer versions of those models came out. Now it ignores the timestamp.
    • ras2fim.py: Text changes based on new benchmark sources. It also now copies the output log folder into the "final" folder at the last minute for long-term traceability.
    • shared_functions.py: Upgraded the parse_unit_folder_name function.
    • shared_variables.py: Updated a few variables and added new ones for the new tool. Small adjustments to some variable names. Added known acceptable benchmark magnitudes / stages.
    • worker_fim_rasters.py: Upgraded error handling, text and logging.
  • tools
    • acquire_and_preprocess_3dep_dems.py: small changes for updated variable names from shared_variables.py.
    • ras2inundation.py: Added standard output headers, timers and durations. Updated some inline comments. Also fixed a few bugs and upgraded error handling.
    • ras_unit_to_s3.py: Changed some terminology to use the new phrasing of units, unit_names and unit_versions. Some variable names were changed as well for better readability.
    • s3_batch_evaluations.py: Added standard output headers, timers and durations. Updated some inline comments. Made some linting changes. Fixed a hardcoded output pathing issue where the path was relative to the code file (it could use a bit more upgrading but works now). Some minor text and layout fixes. Adjustments were also made to handle differences between ble and nws benchmark sources. Note: At this point, it cannot handle benchmark sources other than ble and nws.
    • s3_get_models.py: linting fix and added a logging output line.
    • s3_model_mgmt.py: found a better way to pass args into the "validation" functions. Also added some minor text and tracing fixes.
    • s3_search_tools.py: found a better way to pass args into the "validation" functions. Small s3_shared_functions.py function name change.
    • s3_shared_functions.py: Added a few more arg fixes. Added some tqdm. Upgraded how files and folders are downloaded, after finding a problem with it while working on the new tool. Fixed a minor multi-proc bug.

Testing

A very large array of tests was run against most tools, such as ras2fim.py and ras_unit_to_s3.py. New tools were heavily tested, but due to time constraints most testing used minimum arguments and mostly default values. The benchmark and inundation tools have been tested against v1 and v2 files. All applicable r101 (v1) units have been run through the tool and the results uploaded to S3.

I had an r101 folder downloaded from S3 and ran benchmark tests against it:
python ./tools/run_unit_benchmark_tests.py -u 12030105_2276_ble_230923 -e PROD
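
The value passed with -u packs several underscore-separated tokens into one folder name. The sketch below shows one way such a name could be split. It is not the parse_unit_folder_name function from shared_functions.py: the name split_unit_folder_name is hypothetical, and it assumes only that the first token is an 8-digit HUC and the last token is a 6-digit unit version (matching the unit_version example 230922 above). The middle tokens are returned uninterpreted.

```python
import re
from typing import NamedTuple, Tuple


class UnitFolderParts(NamedTuple):
    huc: str                 # first token, assumed to be a HUC8
    middle: Tuple[str, ...]  # remaining tokens, left uninterpreted
    unit_version: str        # last token, assumed to be a 6-digit stamp


def split_unit_folder_name(name: str) -> UnitFolderParts:
    """Split a unit folder name such as 12030105_2276_ble_230923."""
    tokens = name.split("_")
    if len(tokens) < 2:
        raise ValueError(f"expected at least two '_'-separated tokens: {name!r}")
    huc, *middle, version = tokens
    if not re.fullmatch(r"\d{8}", huc):
        raise ValueError(f"first token is not an 8-digit HUC: {huc!r}")
    if not re.fullmatch(r"\d{6}", version):
        raise ValueError(f"last token is not a 6-digit version: {version!r}")
    return UnitFolderParts(huc, tuple(middle), version)
```

Validating the name up front lets a tool fail fast with a clear message before any S3 or GVAL work starts.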

Checklist

You may update this checklist before and/or after creating the PR. If you're unsure about any of them, please ask, we're here to help! These items are what we are going to look for before merging your code.

  • Informative and human-readable title, using the format: [_pt] PR: <description>
  • Pre-commit executed and linting changes made.
  • Links are provided if this PR resolves an issue, or depends on another PR
  • If submitting a PR to the dev branch (the default branch), you have a descriptive Feature Branch name using the format: dev-<description-of-change> (e.g.: dev-revise-levee-masking)
  • (sort of; they are tightly related) - Changes are limited to a single goal (no scope creep)
  • The feature branch you're submitting as a PR is up to date (merged) with the latest dev branch
  • Changes adhere to PEP-8 Style Guidelines
  • (not yet, see issue card on it) Any change in functionality is tested
  • (More is required) New functions are documented (with a description, list of inputs, and expected output)
  • Placeholder code is flagged / future todos are captured in comments
  • (not quite done): Project documentation has been updated (CHANGELOG and/or README)
  • Reviewers requested

Merge Checklist (For Technical Lead use only)

  • Update CHANGELOG with latest version number and merge date

@RobHanna-NOAA RobHanna-NOAA added enhancement New feature or request ras2fim_V2 labels Feb 12, 2024
@RobHanna-NOAA RobHanna-NOAA self-assigned this Feb 12, 2024
@RobHanna-NOAA RobHanna-NOAA linked an issue Feb 12, 2024 that may be closed by this pull request
@RobHanna-NOAA RobHanna-NOAA linked an issue Mar 7, 2024 that may be closed by this pull request
@RobHanna-NOAA RobHanna-NOAA linked an issue Mar 8, 2024 that may be closed by this pull request
@RobHanna-NOAA RobHanna-NOAA marked this pull request as ready for review March 19, 2024 22:25
@CarsonPruitt-NOAA CarsonPruitt-NOAA merged commit 84e85ed into dev Mar 20, 2024
@CarsonPruitt-NOAA CarsonPruitt-NOAA deleted the dev-inundate-unit branch March 20, 2024 15:54