-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EAMxx variables #880
base: cdat-migration-fy24
Are you sure you want to change the base?
EAMxx variables #880
Conversation
d5a1aad
to
7550b3d
Compare
2c55d9e
to
469cebb
Compare
ds_climo = climo(ds, self.var, season).to_dataset() | ||
ds_climo = ds_climo.bounds.add_missing_bounds(axes=["X", "Y"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should close #884. Please verify the fix, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, this fixed the problem! Thank you!
@tomvothecoder I completed 2D and 3D variables derivation, excepted for COSP related output. e3sm_diags/e3sm_diags/driver/__init__.py Lines 7 to 12 in 43d2df7
I'm now sure how to provide a clean way to accommodate lower case variable names.. |
I just pushed 46a5dad to add support for more land/ocean var keys. Let me know your thoughts. |
Pushed |
@@ -243,7 +242,7 @@ def _apply_land_sea_mask( | |||
ds_new = ds.copy() | |||
ds_new = _drop_unused_ilev_axis(ds) | |||
output_grid = ds_new.regridder.grid | |||
mask_var_key = _get_region_mask_var_key(region) | |||
mask_var_key = _get_region_mask_var_key(ds_mask, region) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ran into error below for a special usecase when the region is land_60S90N
.
raise ValueError(f"Only land and ocean regions are supported, not '{region}'.")
ValueError: Only land and ocean regions are supported, not 'land_60S90N'.
It looks like we should have a way to distinguish land/ocean mask only and the other type of special regions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commit 542b88b should address this.
ds_sub = self._subset_vars_and_load(ds, var) | ||
|
||
time_slice = self._get_time_slice(ds_sub) | ||
ds_sub = ds_sub.sel(time=time_slice).squeeze() | ||
time_slice = self._get_time_slice(ds) | ||
ds_sub = ds.sel(time=time_slice).squeeze() | ||
|
||
if self.is_sub_monthly: | ||
ds_sub = self._exclude_sub_monthly_coord_spanning_year(ds_sub) | ||
|
||
ds_sub = self._subset_vars_and_load(ds_sub, var) | ||
|
||
return ds_sub |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should close #892. I was able to run the U
variable. Can you please confirm the fix too @chengzhuzhang?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My output:
2024-11-08 12:52:52,411 [INFO]: e3sm_diags_driver.py(_save_env_yml:58) >> Saved environment yml file to: [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/environment.yml](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/environment.yml)
2024-11-08 12:52:52,413 [INFO]: e3sm_diags_driver.py(_save_parameter_files:69) >> Saved command used to: [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/cmd_used.txt](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/cmd_used.txt)
2024-11-08 12:52:52,414 [INFO]: e3sm_diags_driver.py(_save_python_script:133) >> Saved Python script to: [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/ipykernel_launcher.py](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/ipykernel_launcher.py)
2024-11-08 12:52:52,976 [INFO]: lat_lon_driver.py(run_diag:69) >> Variable: U
2024-11-08 12:53:12,435 [INFO]: lat_lon_driver.py(_run_diags_3d:396) >> Selected pressure level(s): [850.0]
2024-11-08 12:53:13,554 [INFO]: regrid.py(subset_and_align_datasets:70) >> Selected region: global
2024-11-08 12:53:21,676 [INFO]: io.py(_save_data_metrics_and_plots:77) >> Metrics saved in [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.json](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.json)
2024-11-08 12:53:28,892 [INFO]: utils.py(_save_plot:91) >> Plot saved in: [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png)
2024-11-08 12:53:28,892 [INFO]: utils.py(_save_plot:91) >> Plot saved in: [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png)
2024-11-08 12:53:28,895 [INFO]: main.py(create_viewer:132) >> lat_lon [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/viewer](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/viewer)
2024-11-08 12:53:32,950 [INFO]: main.py(create_viewer:135) >> [('Latitude-Longitude contour maps', 'lat_lon/index.html'), ('Table', 'table/index.html'), ('Taylor Diagram', 'taylor/index.html'), ('CMIP6 Comparison', 'cmip6/index.html')]
2024-11-08 12:53:32,956 [INFO]: e3sm_diags_driver.py(main:392) >> Viewer HTML generated at [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/viewer/index.html](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/viewer/index.html)
2024-11-08 12:53:32,976 [INFO]: logger.py(move_log_to_prov_dir:106) >> Log file saved in [/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/e3sm_diags_run.log](https://vscode-remote+ssh-002dremote-002bperlmutter.vscode-resource.vscode-cdn.net/global/cfs/cdirs/e3sm/www/cdat-migration-fy24/eamxx_decadal_1996_1031_edv3/prov/e3sm_diags_run.log)
I'm getting a separate error about circular imports. Do you see this too? Not sure how this was introduced, but it should be addressed: 2024-11-08 12:27:54,820 [ERROR]: core_parameter.py(_run_diag:343) >> Error in e3sm_diags.driver.lat_lon_driver
Traceback (most recent call last):
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/parameter/core_parameter.py", line 340, in _run_diag
single_result = module.run_diag(self)
AttributeError: partially initialized module 'e3sm_diags.driver.lat_lon_driver' has no attribute 'run_diag' (most likely due to a circular import)
2024-11-08 12:27:54,821 [ERROR]: run.py(run_diags:91) >> Error traceback:
Traceback (most recent call last):
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/run.py", line 89, in run_diags
params_results = main(params)
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/e3sm_diags_driver.py", line 373, in main
parameters_results = _run_serially(parameters)
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/e3sm_diags_driver.py", line 271, in _run_serially
nested_results.append(parameter._run_diag())
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/parameter/core_parameter.py", line 333, in _run_diag
module = importlib.import_module(mod_str)
File "/global/u2/v/vo13/mambaforge/envs/e3sm_diags_dev_892/lib/python3.10/importlib/__init__.py", line 126, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1050, in _gcd_import
File "<frozen importlib._bootstrap>", line 1027, in _find_and_load
File "<frozen importlib._bootstrap>", line 1006, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 688, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 883, in exec_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
File "/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/driver/zonal_mean_xy_driver.py", line 17, in <module>
from e3sm_diags.metrics.metrics import spatial_avg
ImportError: cannot import name 'spatial_avg' from partially initialized module 'e3sm_diags.metrics.metrics' (most likely due to a circular import) (/global/u2/v/vo13/E3SM-Project/e3sm_diags/e3sm_diags/metrics/metrics.py)
2024-11-08 12:27:54,824 [INFO]: logger.py(move_log_to_prov_dir:106) >> Log file saved in /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/877-attr-err/eamxx_decadal_1996_1107_edv3/prov/e3sm_diags_run.log
2024-11-08 12:27:55,985 [INFO]: lat_lon_driver.py(run_diag:69) >> Variable: U
Value(False)
2024-11-08 12:32:27,811 [INFO]: lat_lon_driver.py(_run_diags_3d:396) >> Selected pressure level(s): [850.0]
2024-11-08 12:32:29,678 [INFO]: regrid.py(subset_and_align_datasets:70) >> Selected region: global
2024-11-08 12:32:39,801 [INFO]: io.py(_save_data_metrics_and_plots:77) >> Metrics saved in /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/877-attr-err/eamxx_decadal_1996_1107_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.json
2024-11-08 12:32:54,463 [INFO]: utils.py(_save_plot:91) >> Plot saved in: /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/877-attr-err/eamxx_decadal_1996_1107_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png
2024-11-08 12:32:54,463 [INFO]: utils.py(_save_plot:91) >> Plot saved in: /global/cfs/cdirs/e3sm/www/cdat-migration-fy24/877-attr-err/eamxx_decadal_1996_1107_edv3/lat_lon/ERA5/ERA5-U-850-ANN-global.png |
No longer appearing after running |
Commit #892 worked well! It took about 1 min to finish the 3d variable |
I found another issue that derived variable is not working for time-series files. Example .cfg
The input files |
You may need to step-through the loop that attempts to derive
Possible issueThe (e3sm_diags_dev_892) vo13@login08:.../Atm/time-series$ pwd
/global/cfs/cdirs/e3sm/diagnostics/observations/Atm/time-series
(e3sm_diags_dev_892) vo13@login08:.../Atm/time-series$ ls ERA5_ext
ls: cannot access 'ERA5_ext': No such file or directory
(e3sm_diags_dev_892) vo13@login08:.../Atm/time-series$ |
@tomvothecoder thank you for looking into this! I'm actually stepping into the code, and it does look like the logic should be correct. Yes, the files are mis-placed to |
I can confirm that this is a data problem. The problem described in #880 (comment) is resolved with time-series files placed in correct directory |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @chengzhuzhang, this PR looks good to me. I had some minor comments/questions. Thanks for working on this PR!
("precip_liq_surf_mass_flux", "precip_ice_surf_mass_flux"): prect, # EAMxx | ||
("precip_total_surf_mass_flux",): lambda pr: convert_units( | ||
rename(pr), target_units="mm/day" | ||
), # EAMxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we considering separating EAMxx variables out to another derivations dictionary so we don't need to comment # EAMxx
?
This can be done through #716.
"U10": {("sfcWind",): rename}, | ||
"U10": { | ||
("sfcWind",): rename, | ||
("wind_speed_10m",): rename, # EAMxx ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there supposed to be a ?
in the comment here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crterai could you confirm is wind_speed_10m
is equivalent to U10 from EAM?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, wind_speed_10m
is the 10m wind speed, which I'm guessing U10
to represent
}, | ||
"TGCLDCWP": { | ||
("clwvi",): rename, | ||
("LiqWaterPath",): rename, # EAMxx Check if rain water is inlcuded? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the question Check if rain water is included?
still apply?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@crterai could you confirm if LiqWaterPath
and TGCLDCWP are equivalent?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, they're equivalent
time_slice = self._get_time_slice(ds) | ||
ds_sub = ds.sel(time=time_slice).squeeze() | ||
|
||
if self.is_sub_monthly: | ||
ds_sub = self._exclude_sub_monthly_coord_spanning_year(ds_sub) | ||
|
||
ds_sub = self._subset_vars_and_load(ds_sub, var) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just noting for future reference that we now time slice first then .load()
the dataset in order to prevent memory allocation issues (e.g., dataset too large to load into memory before slicing).
I think this branch needs to be rebased on the latest |
Thank you for the review and rebasing @tomvothecoder. I will tag EAMxx developers for a review before merging. |
Refer to the PR for more information because the changelog is massive. Update build workflow to run on `cdat-migration-fy24` branch CDAT Migration Phase 2: Add CDAT regression test notebook template and fix GH Actions build (#743) - Add Makefile for quick access to multiple Python-based commands such as linting, testing, cleaning up cache and build files - Fix some lingering unit tests failure - Update `xcdat=0.6.0rc1` to `xcdat >=0.6.0` in `ci.yml`, `dev.yml` and `dev-nompi.yml` - Add `xskillscore` to `ci.yml` - Fix `pre-commit` issues CDAT Migration Phase 2: Regression testing for `lat_lon`, `lat_lon_land`, and `lat_lon_river` (#744) - Add Makefile that simplifies common development commands (building and installing, testing, etc.) - Write unit tests to cover all new code for utility functions - `dataset_xr.py`, `metrics.py`, `climo_xr.py`, `io.py`, `regrid.py` - Metrics comparison for `cdat-migration-fy24` `lat_lon` and `main` branch of `lat_lon` -- `NET_FLUX_SRF` and `RESTOM` have the highest spatial average diffs - Test run with 3D variables (`_run_3d_diags()`) - Fix Python 3.9 bug with using pipe command to represent Union -- doesn't work with `from __future__ import annotations` still - Fix subsetting syntax bug using ilev - Fix regridding bug where a single plev is passed and xCDAT does not allow generating bounds for coordinates of len <= 1 -- add conditional that just ignores adding new bounds for regridded output datasets, fix related tests - Fix accidentally calling save plots and metrics twice in `_get_metrics_by_region()` - Fix failing integration tests pass in CI/CD - Refactor `test_diags.py` -- replace unittest with pytest - Refactor `test_all_sets.py` -- replace unittest with pytest - Test climatology datasets -- tested with 3d variables using `test_all_sets.py` CDAT Migration Phase 2: Refactor utilities and CoreParameter methods for reusability across diagnostic sets (#746) - Move driver type annotations to `type_annotations.py` - Move `lat_lon_driver._save_data_metrics_and_plots()` to `io.py` - Update `_save_data_metrics_and_plots` args to accept `plot_func` callable - Update `metrics.spatial_avg` to return an optionally `xr.DataArray` with `as_list=False` - Move `parameter` arg to the top in `lat_lon_plot.plot` - Move `_set_param_output_attrs` and `_set_name_yr_attrs` from `lat_lon_driver` to `CoreParameter` class
…_stratosphere()` sets (#774)
- The time slice should occur before loading the dataset into memory, otherwise the entire time series will be loaded which can be a large dataset
47d0a0a
to
d94a66d
Compare
Just rebased, should be good to go for further review. |
Hi @PeterCaldwell @brhillman @crterai @AaronDonahue: This PR added support for all EAMxx output variables (except for those COSP related). This takes a little longer because, this update is based on the brand new e3sm_diags that is just migrated over to use xarray/xcdat to replace cdat (kudos to @tomvothecoder). Here is an example e3sm_diags run based on the 1996ish EAMxx decadal run that Ben provided. The workflow to generate this run is to first run ncclimo to generate the regridded climatology file ; and then run the e3sm_diags run script. Example of the nco script as below. Thanks to @czender, the two nco steps listed below can be simplified with just using one
It would be great if you could review the code change to verify the variable derivations are correct. Next we will work on better support on arbitrary length runs; COSP histogram and other variability sets (if given longer simulation). Any feedback on capabilities and priorities are welcome! Thanks, |
@@ -693,14 +821,21 @@ | |||
"RELHUM": { | |||
("hur",): lambda hur: convert_units(hur, target_units="%"), | |||
("RELHUM",): lambda relhum: convert_units(relhum, target_units="%"), | |||
("RelativeHumidity",): lambda relhum: convert_units( | |||
relhum, target_units="%" | |||
), # EAMxx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does "target units" imply it will convert the current units to %
? Note that in EAMxx RelativeHumidity has units of fraction so is bound from 0 to 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for reviewing! Yes, the convert_units
will convert from fraction to %
. I tested the variable at 850mb and can confirm that the derivation works as intended, see example here.
Description
This enhancement will be merged in a new e3sm_diags code base after cdat-migration-fy24 branch is merged.
Reference:
The mapping of new variables is based on https://acme-climate.atlassian.net/wiki/spaces/EAMXX/pages/4535976058/Output+Standard+Names put together by @AaronDonahue
The decadal output outlined by @brhillman: https://github.com/E3SM-Project/eamxx-scripts/pull/180/files#diff-1646ba1e37781387625d2ce585aad9ef7f5b6407616300838c7aecd44c67df7e
KeyError
for longitude bounds #884Checklist
If applicable: