RC for MACAV3 #17

Closed
khegewisch opened this issue Jan 28, 2025 · 16 comments · Fixed by #35

Comments

@khegewisch

If you are registering content (RC) for DRCDP, please fill out the requested information below. If you want to create an issue about something else, please delete the text below and title your issue as appropriate.

To register (or edit) some or all of the DRCDP RC, please title this github issue as follows:
"RC for " + your source_name (as you define below), and indicate if you are modifying your input from an earlier issue


The following are required registered content (with example content for each item in bold). Please replace the example text below with your information to the right of the equal sign (DO NOT MAKE ANY CHANGES TO THE LEFT HAND SIDE OF THE EQUAL SIGN):

  1. downscaling_source_id = 'MACAV3'
  2. institution_id = 'UCM-ACSL'
  3. region = 'NAM'
  4. A list of CMIP variable_ids that the above information refers to. In most cases it will only be for one variable_id. If it is for more than one, please make sure your source_description is sufficiently general to apply to all relevant variable_ids.

I'm not sure that I understand here.
variable_ids = 'tasmax', 'tasmin', 'pr', 'hursmin', 'hursmax', 'sfcWind', 'rsds'

We also have 'wind_direction' and 'dew_point_temperature'. Alternatively to 'wind_direction', we could supply 'uas', 'vas'.

The above RC is a first step to ensuring DRCDP data can be identified and retrieved from ESGF. As the dataset is being prepared, some "Controlled Vocabulary" (CV) will also be required. Examples include "CMIP_experiment", "realization", "frequency" and "variable".


@durack1
Contributor

durack1 commented Feb 17, 2025

@khegewisch just circling on this, some work has gone into standardizing the inputs across data sources (source_id entries), see below (or follow the link to see this registered content in relation to other registered data):

"MACA3-0":{
"calendar":"gregorian",
"contact":"John T. Abatzoglou; [email protected]",
"further_info_url":"https://www.climatologylab.org/maca.html",
"grid":"10 x 10 km latitude x longitude",
"grid_label":"gn",
"institution_id":"UCM-SNRI",
"license":"CC0 1.0",
"nominal_resolution":"10 km",
"product":"downscaled-statistical",
"reference":"Abatzoglou, John T., and Timothy J. Brown (2012) A comparison of statistical downscaling methods suited for wildfire applications. International Journal of Climatology, 32 (5), pp 772-780. https://doi.org/10.1002/joc.2312",
"region":"north_america",
"region_id":"NAM",
"source":"MACA 3.0: Statistically-downscaled climate model projections based on CMIP6",
"source_name":"MACA",
"source_version_number":"3.0",
"title":"MACA 3.0 dataset prepared for DRCDP"
},

And while we're circling, here is the institution_id registration info - Note I followed the institutional info, rather than the UCM-ACSL you have listed above, which I could not find a URL or UCM reference for:

"UCM-SNRI":{
"ROR":"00d9ah105",
"URL":"https://www.climatologylab.org",
"contact":"John T. Abatzoglou; [email protected]",
"name":"Sierra Nevada Research Institute, University of California, Merced, 5200 N. Lake Road, Merced, CA 95343, USA"
},

It'd be great to get feedback on this, so we can tweak and finalize the registered info

@khegewisch
Author

khegewisch commented Feb 18, 2025

@durack1 I think this all looks good. Thank you!

Actually I have some questions:

  1. The sample LOCA file I'm looking at has:
    institution = "University of California, San Diego, Scripps Institution of Oceanography"
    This isn't in these descriptors. I would guess ours would be:
    institution = "University of California, Merced, Sierra Nevada Research Institute"

  2. The sample LOCA file I'm looking at has
    source = "nClimGrid-Monthly-1-0 1.0 (2014): NOAA nClimGrid-Monthly" ;
    source_id = "LOCA2--GFDL-CM4" ;
    source_type = "gridded_insitu" ;

It looks like the source is the training dataset for the downscaling here.
Should that be used as the source or do you want me to use:
"source":"MACA 3.0: Statistically-downscaled climate model projections based on CMIP6",
"source_name":"MACA",
source_id = "MACA--GFDL-CM4" ;

@durack1
Contributor

durack1 commented Feb 19, 2025

@khegewisch thanks for the feedback.

RE: institution. I always think of the primary org, then the host org. So for us, PCMDI hosted at LLNL, and Scripps Institution of Oceanography hosted by UCSD, the inverse of what you have noted above and that is why I registered SNRI hosted by UC Merced "Sierra Nevada Research Institute, University of California, Merced, 5200 N. Lake Road, Merced, CA 95343, USA". I can invert if you like, but would be good to keep the logic consistent if we can, I am assuming we're going to capture numerous entries which follow similar orgs. As we are minting a brand new institution_id, I'd be happy to create a SNRI-UCM, or ACSL-UCM if that sits better with you?

RE "source" identifiers: with the cleanup there is now the concept of the source_id (which for the DRCDP project is the downscaling method, e.g. MACA3-0) and, separately, the driving dataset, which is the CMIP6 model simulation being downscaled. This means that for the inputs required to CMORize data, you have the registered information for e.g. MACA3-0 (above), with the "driving*" info provided in the user_input.json file at run time. The working example is DataPreparationExamples/DEMO/DRCDP-LOCA2-1-demo_user_input.json.
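
To illustrate the split between registered content and run-time input, here is a minimal Python sketch. It is not how CMOR does this internally. The `driving_source_id` and `driving_experiment_id` key names and the `build_attributes` helper are hypothetical placeholders; the demo user_input.json file named above is the authoritative reference for the real keys.

```python
import json

# Registered (static) content for the downscaling method, abridged from the
# MACA3-0 entry discussed in this thread.
registered = {
    "MACA3-0": {
        "institution_id": "UCM-ACSL",
        "region_id": "NAM",
        "source_name": "MACA",
        "source_version_number": "3.0",
    }
}

# Run-time input. The "driving_*" key names are hypothetical placeholders,
# not confirmed names from the DRCDP tables.
user_input = {
    "source_id": "MACA3-0",
    "driving_source_id": "GFDL-CM4",
    "driving_experiment_id": "historical",
}


def build_attributes(registered, user_input):
    """Combine the registered source_id entry with the run-time input."""
    attrs = dict(registered[user_input["source_id"]])  # copy static content
    attrs.update(user_input)  # add source_id and driving info
    return attrs


attrs = build_attributes(registered, user_input)
print(json.dumps(attrs, indent=2, sort_keys=True))
```

The point of the sketch is only that the downscaling method is identified once in the registry, while the driving model changes per run.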

And one last update, the CRS info, we're planning on making a change to CMOR to account for this, chatter is being captured in PCMDI/cmor#774. There are some changes currently being implemented for CMIP7 preparation and a CMOR 3.10 release, with the CRS a target for a subsequent 3.11 release - these changes are being done in parallel with the build out of CMIP7 infrastructure which is in progress - many moving parts

@khegewisch
Author

@durack1 I'm fine with whatever here. Consistent with your idea of the institution, it seems the institution_id should be SNRI-UCM here. I haven't talked to John about this, but our research group is ACSL - Applied Climate Science Laboratory - and that seems more appropriate than listing SNRI here. Technically John's affiliation is not SNRI, mine is. But either probably would work.

institution ='Applied Climate Science Laboratory, University of California, Merced, 5200 N. Lake Road, Merced, CA 95343, USA'
institution_id = 'ACSL-UCM'

@durack1
Contributor

durack1 commented Feb 20, 2025

@khegewisch if ACSL-UCM makes sense to you, that works for me. As a backstory, for some of the other contributors (not to this project, but obs4MIPs for e.g.), we have numerous NASA centers that contribute data, so this breaks out to NASA-GSFC, NASA-JPL and NASA-LaRC. I am doubtful that we're going to get a lot of UCM- contributions to this project, but as an FYI, ACSL-UCM inverts the NASA- logic. Other UC contributors to obs4MIPs include UCI-CHRS (Irvine, Center for Hydrometeorology and Remote Sensing) and UCSD-SIO (Scripps). For the full gory details, see PCMDI/obs4MIPs-cmor-tables/obs4MIPs_institution_id.json

We also attempt to register institutions or organizations with their ROR (Research Organization Registry) identifier. There is a registration for UC Merced (https://ror.org/00d9ah105), but I wasn't able to find anything for ACSL, or SNRI for that matter. No problem either way, but I thought I'd check before finalizing the registered details.

@khegewisch
Author

@durack1 I have never heard of an ROR. If this needs to be one that has an ROR, let's just call it UC Merced then. SNRI is a research institute. ACSL is just a research group. I can imagine that neither has any registration for research.

@durack1
Contributor

durack1 commented Feb 20, 2025

@khegewisch It's about matching entities as closely as we can, so we can go with UCM-ACSL and use the UC Merced ROR, which is the closest ROR registry entry that fits. With LLNL, I've had to request a PCMDI ROR to pin a program within an organization, and it's yet to be approved.

On the topic of variables, are all the entries you note above ('hursmin', 'hursmax', 'sfcWind', 'rsds') daily data? I just need to make sure that I am matching up the temporal frequency with the quantities that make up the variable identity.

@khegewisch
Author

Great. UCM-ACSL works.

Yes, all our data is daily data.

@durack1
Contributor

durack1 commented Feb 21, 2025

Ok @khegewisch just updated the registered info as we discussed above, see

"UCM-ACSL":{
"ROR":"00d9ah105",
"URL":"https://www.climatologylab.org",
"contact":"John T. Abatzoglou; [email protected]",
"name":"Applied Climate Science Laboratory, University of California, Merced, 5200 N. Lake Road, Merced, CA 95343, USA"
},

and

"MACA3-0":{
"calendar":"gregorian",
"contact":"John T. Abatzoglou; [email protected]",
"further_info_url":"https://www.climatologylab.org/maca.html",
"grid":"10 x 10 km latitude x longitude",
"grid_label":"gn",
"institution_id":"UCM-ACSL",
"license":"CC0 1.0",
"nominal_resolution":"10 km",
"product":"downscaled-statistical",
"reference":"Abatzoglou, John T., and Timothy J. Brown (2012) A comparison of statistical downscaling methods suited for wildfire applications. International Journal of Climatology, 32 (5), pp 772-780. https://doi.org/10.1002/joc.2312",
"region":"north_america",
"region_id":"NAM",
"source":"MACA 3.0: Statistically-downscaled climate model projections based on CMIP6",
"source_name":"MACA",
"source_version_number":"3.0",
"title":"MACA 3.0 dataset prepared for DRCDP"
},
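
As a rough aid for reviewing entries like the two above, the following Python sketch checks a source entry for the keys seen in this thread and confirms that its institution_id is registered. Treating every key as required is an assumption of the sketch, and `check_source_entry` is a hypothetical helper, not part of any DRCDP or CMOR tooling.

```python
import json

# Keys taken from the MACA3-0 entry shown in this thread; treating all of
# them as required is an assumption of this sketch.
EXPECTED_KEYS = {
    "calendar", "contact", "further_info_url", "grid", "grid_label",
    "institution_id", "license", "nominal_resolution", "product",
    "reference", "region", "region_id", "source", "source_name",
    "source_version_number", "title",
}


def check_source_entry(entry, institutions):
    """Return a list of problems found in one source_id entry."""
    problems = []
    missing = EXPECTED_KEYS - entry.keys()
    if missing:
        problems.append("missing keys: " + ", ".join(sorted(missing)))
    inst = entry.get("institution_id")
    if inst not in institutions:
        problems.append(f"institution_id {inst!r} is not registered")
    return problems


# Abridged registry and a deliberately incomplete, outdated entry.
institutions = {"UCM-ACSL": {"ROR": "00d9ah105"}}
entry = json.loads("""
{
  "calendar": "gregorian",
  "grid_label": "gn",
  "institution_id": "UCM-SNRI",
  "region_id": "NAM",
  "source_name": "MACA",
  "source_version_number": "3.0"
}
""")

for problem in check_source_entry(entry, institutions):
    print(problem)
```

Here the sample entry still points at the earlier UCM-SNRI identifier, so the check reports both the missing keys and the unregistered institution.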

If you see any issues, please holler.

I also augmented the APday table to include hursmax, hursmin, rsds, sfcWind and tdps (dewpoint at 2 m). For CMIP, uas and vas are the 10 m wind components. I wasn't certain whether that was an output you wanted to provide, so I didn't add them, but could if there is a desire.

@khegewisch
Author

Okay. Good to know that tdps is the dewpoint at 2m.

We will have wind speed and wind direction at 10m (which were computed from the uas,vas).

I have a feeling that sfcWind is not at 10m ... so that will need to be changed. Do you know what the variable name should be for wind speed and direction at 10m?

@durack1
Contributor

durack1 commented Feb 21, 2025

@khegewisch the definitions for sfcWind, uas and vas have all been notionally for the 10 m height, see

"comment":"near-surface (usually, 10 meters) wind speed",
"dimensions":[
"longitude",
"latitude",
"time",
"height10m"
],

And not currently included in this repo, but for CMIP6Plus, uas is also noted as "comment": "Eastward component of the near-surface (usually, 10 meters) wind". We could redefine the vertical coordinate for 2 m or similar, but we'd want to make sure this is a variable that many groups produce, not just MACA3-0.

@paullric
Collaborator

paullric commented Feb 21, 2025 via email

@durack1
Contributor

durack1 commented Feb 21, 2025

To confirm, we’d need to apply some math to the sfcWind and wind direction to obtain uas and vas, correct?

According to https://climate.northwestknowledge.net/MACA/data_csv.php, the uas and vas are already available outputs, at least for the MACAv2-METDATA product
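
The math for going between wind speed/direction and uas/vas can be sketched in Python as below. This assumes the meteorological convention, where direction is the bearing the wind blows from, in degrees clockwise from north. The thread does not state which convention the MACA wind direction uses, so that should be checked before applying anything like this. The function names are hypothetical.

```python
import math


def speed_dir_to_uv(speed, direction_deg):
    """Convert speed and 'from' direction (degrees clockwise from north)
    to eastward (uas) and northward (vas) components.

    Assumes the meteorological convention; not confirmed for MACA.
    """
    theta = math.radians(direction_deg)
    uas = -speed * math.sin(theta)
    vas = -speed * math.cos(theta)
    return uas, vas


def uv_to_speed_dir(uas, vas):
    """Inverse conversion under the same assumed convention."""
    speed = math.hypot(uas, vas)
    direction_deg = math.degrees(math.atan2(-uas, -vas)) % 360.0
    return speed, direction_deg


# A 5 m/s wind from the west (270 degrees) blows toward the east,
# so uas should be positive and vas close to zero.
uas, vas = speed_dir_to_uv(5.0, 270.0)
print(round(uas, 6), round(abs(vas), 6))
```

Note that this conversion is exact only for a single instant. Applying it to daily-mean speed and daily-mean direction does not in general reproduce the daily-mean uas and vas.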

@khegewisch
Author

@durack1 @paullric You guys are getting confused with MACAv2. MACAv2 is already out there for CMIP5.
The product we are submitting to your project is MACAv3 on CMIP6... which is not out there yet. We are not submitting MACAv2 to your project because it's for CMIP5.

We were going to supply wind speed and direction for MACAv3. The wind speed we have is slightly better than just the wind speed you would calculate from the average wind components uas/vas.

Are you suggesting that you would want uas/vas instead for this project?
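
One plausible reason for the difference noted above (an interpretation, not something confirmed in this thread) is that the magnitude of time-averaged components can never exceed the time average of the magnitudes, because opposing winds cancel in the component means. A small Python sketch with made-up sub-daily values shows the effect:

```python
import math

# Hypothetical sub-daily wind components (m/s) for one day at one grid
# cell; the values are invented purely for illustration.
u = [4.0, -3.0, 2.0, -1.0]
v = [0.0, 4.0, -2.0, 1.0]

# Speed computed from the daily-mean components.
mean_u = sum(u) / len(u)
mean_v = sum(v) / len(v)
speed_from_mean_components = math.hypot(mean_u, mean_v)

# Daily mean of the instantaneous speeds.
mean_of_speeds = sum(math.hypot(a, b) for a, b in zip(u, v)) / len(u)

print(f"speed from mean components: {speed_from_mean_components:.2f} m/s")
print(f"mean of speeds:             {mean_of_speeds:.2f} m/s")
```

With these values the speed from the mean components comes out well below the mean of the speeds, which is why a speed derived from daily uas/vas tends to underestimate daily-mean wind speed.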

@durack1
Contributor

durack1 commented Feb 21, 2025

@khegewisch, in short, yep.

The uas, vas outputs rather than the derived wind speed and direction map far more neatly into MIP land. We obviously have to be careful about the height dimension: if you have taken the CMIP6 (~10 m) outputs and remapped these to a different height, then this needs to be correctly identified.

I note I am only speaking with a data preparation and publication hat on; @paullric can speak to the data usage needs.

@khegewisch
Author

@durack1 @paullric

No we didn't remap the CMIP6 10m uas/vas to a different height. We used 10 m winds from our training dataset for the statistical downscaling.

In the end, most users want wind speed. Our downscaled wind speed product is better than the wind speed you would get from computing it from the average daily components uas/vas. It would be sad to not dispense our wind speed product in favor of uas/vas.

I guess we can contribute uas/vas to you guys, but dispense our wind speed product on our own.
