Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAM4xx: Fixing the ne120 case for the microphysics interface. #3095

Merged

Conversation

odiazib
Copy link
Contributor

@odiazib odiazib commented Nov 5, 2024

We are fixing a couple of issues encountered when running with ne120 for the vertical emission NetCDF files. Note that for resolutions finer than ne30, the case requires a remap file to perform horizontal interpolation. Our standalone test and CIME case did not catch this issue because, in these instances, we are not employing a remap file, so there is no horizontal interpolation.

The main issue was that the vertical emission files mam4xx is using have dimension names altitude and altitude_int instead of lev and ilev. Thus, we encountered an IO error due to a mismatch between layouts.

To fix this issue, we modified the source grid tag for LEV and ILEV using:

horiz_interp_src_grid->reset_field_tag_name(LEV, "altitude");
horiz_interp_src_grid->reset_field_tag_name(ILEV, "altitude_int");

Details

  • Could you please document in this PR the differences between the different file types in the TracerFileType enum? This should be done with a call like ncdump -h <file> where <file> will be one example of each type (FORMULA_PS, ZONAL, VERT_EMISSION, NONE). It's not clear to me why you're branching the logic, but maybe I am missing something in the file types. The ncdump details will clarify that.

@mahf708

The distinction among the file types FORMULA_PS, ZONAL, and VERT_EMISSION lies in their approach to vertical interpolation. The fourth enum, NONE, was introduced as a 'none' type. In EAM, these files utilize latitude and longitude to represent the horizontal dimensions. However, we opted to use ncremap to convert latitude/longitude files to the ne grid. More details can be found here

Therefore, while FORMULA_PS, VERT_EMISSION, and ZONAL types share the same method for horizontal interpolation, they employ distinct vertical interpolations. Although I examined how SPA performs vertical interpolation, I chose to directly port the vertical interpolation routines from EAM because these routines are compact and straightforward to transfer. Moreover, adopting the same vertical interpolation techniques in both EAM and EAMxx for mam4xx will simplify the code evaluation process.

FORMULA_PS
The files corresponding to FORMULA_PS include invariant tracers such as O3, OH, NO3, and HO2. For this type of file, the source pressure is computed using the following equation.

Vertical interpolation is performed using the ported vert_interp routine, available here.

The logic to detect this file type checks if the PS variable is present in the nc file; see here

This file is located here:
${DIN_LOC_ROOT}/atm/scream/mam4xx/invariants/ne4pg2/oxid_1.9x2.5_L26_1850-2015_ne4pg2_c20241009.nc

ncdump -h oxid_ne2np4_L26_1850-2015_c20240827.nc

netcdf oxid_ne2np4_L26_1850-2015_c20240827 {
dimensions:
	ncol = 218 ;
	nv = 5 ;
	time = UNLIMITED ; // (216 currently)
	lev = 26 ;
	ilev = 27 ;
variables:
	double lat(ncol) ;
		lat:long_name = "Latitude of Grid Cell Centers" ;
		lat:standard_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:valid_min = -90. ;
		lat:valid_max = 90. ;
		lat:bounds = "lat_vertices" ;
	double lon(ncol) ;
		lon:long_name = "Longitude of Grid Cell Centers" ;
		lon:standard_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:valid_min = 0. ;
		lon:valid_max = 360. ;
		lon:bounds = "lon_vertices" ;
	double lat_vertices(ncol, nv) ;
		lat_vertices:long_name = "Gridcell latitude vertices" ;
	double lon_vertices(ncol, nv) ;
		lon_vertices:long_name = "Gridcell longitude vertices" ;
	double area(ncol) ;
		area:long_name = "Solid angle subtended by gridcell" ;
		area:standard_name = "solid_angle" ;
		area:units = "steradian" ;
		area:coordinates = "lat lon" ;
		area:cell_mathods = "lat, lon: sum" ;
	float H2O2(time, ncol, lev) ;
		H2O2:units = "mol/mol" ;
		H2O2:long_name = "H2O2 concentration" ;
		H2O2:cell_method = "time: mean" ;
		H2O2:coordinates = "lat lon" ;
		H2O2:cell_measures = "area: area" ;
	float HO2(time, ncol, lev) ;
		HO2:units = "mol/mol" ;
		HO2:long_name = "HO2 concentration" ;
		HO2:cell_method = "time: mean" ;
		HO2:coordinates = "lat lon" ;
		HO2:cell_measures = "area: area" ;
	float NO3(time, ncol, lev) ;
		NO3:units = "mol/mol" ;
		NO3:long_name = "NO3 concentration" ;
		NO3:cell_method = "time: mean" ;
		NO3:coordinates = "lat lon" ;
		NO3:cell_measures = "area: area" ;
	float O3(time, ncol, lev) ;
		O3:units = "mol/mol" ;
		O3:long_name = "O3 concentration" ;
		O3:cell_method = "time: mean" ;
		O3:coordinates = "lat lon" ;
		O3:cell_measures = "area: area" ;
	float OH(time, ncol, lev) ;
		OH:units = "mol/mol" ;
		OH:long_name = "OH concentration" ;
		OH:cell_method = "time: mean" ;
		OH:coordinates = "lat lon" ;
		OH:cell_measures = "area: area" ;
	double P0 ;
		P0:units = "Pa" ;
		P0:long_name = "reference pressure" ;
	float PS(time, ncol) ;
		PS:units = "Pa" ;
		PS:long_name = "Surface pressure" ;
		PS:cell_method = "time: mean" ;
		PS:coordinates = "lat lon" ;
		PS:cell_measures = "area: area" ;
	int date(time) ;
		date:long_name = "date" ;
		date:format = "YYYYMMDD" ;
	int datesec(time) ;
		datesec:long_name = "current seconds of current date" ;
	double hyai(ilev) ;
		hyai:long_name = "hybrid A coefficient at layer interfaces" ;
	double hyam(lev) ;
		hyam:long_name = "hybrid A coefficient at layer midpoints" ;
	double hybi(ilev) ;
		hybi:long_name = "hybrid B coefficient at layer interfaces" ;
	double hybm(lev) ;
		hybm:long_name = "hybrid B coefficient at layer midpoints" ;
	double ilev(ilev) ;
		ilev:units = "level" ;
		ilev:long_name = "hybrid level at interfaces (1000*(A+B))" ;
		ilev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
		ilev:formula_terms = "a: hyai b: hybi p0: P0 ps: PS" ;
		ilev:positive = "down" ;
	double lev(lev) ;
		lev:units = "level" ;
		lev:long_name = "hybrid level at midpoints (1000*(A+B))" ;
		lev:standard_name = "atmosphere_hybrid_sigma_pressure_coordinate" ;
		lev:formula_terms = "a: hyam b: hybm p0: P0 ps: PS" ;
		lev:positive = "down" ;
	double time(time) ;
		time:units = "days since 1849-01-01 00:00" ;
		time:long_name = "time" ;
		time:calendar = "noleap" ;
		time:bounds = "time_bnds" ;

// global attributes:
		:description = "E3SM atmosphere oxidants input file with updated CMIP6 ozone" ;
		:author = "Chris Golaz" ;
		:date = "20171110" ;
		:script_name = "swapCMIP6ozone.py" ;
		:script_location = "https://gist.github.com/golaz/e4337ada9882db68fb094fe64ff9a150" ;
		:input_files = "oxid_1.9x2.5_L26_1850-2005_c091123.nc (60c5f32b7f702de1939f7194680a1d9e)\n",
			"vmro3_input4MIPs_ozone_CMIP_UReading-CCMI-1-0_gr_185001-189912.nc (83be223a8b2fecf5774947f896163def)\n",
			"vmro3_input4MIPs_ozone_CMIP_UReading-CCMI-1-0_gr_190001-194912.nc (357bfcc135cfee32e218cf4bc3596153)\n",
			"vmro3_input4MIPs_ozone_CMIP_UReading-CCMI-1-0_gr_195001-199912.nc (95bf517ff65c942856c23ddc4af9088f)\n",
			"vmro3_input4MIPs_ozone_CMIP_UReading-CCMI-1-0_gr_200001-201412.nc (88b570f20cd3027763209520136d950f)\n",
			"" ;
		:history = "Tue Aug 27 15:13:30 2024: ncks -O -t 2 --no_tmp_fl --hdr_pad=10000 --gaa remap_script=ncremap --gaa remap_hostname=login40 --gaa remap_version=5.1.6 --rgr lat_nm_out=lat --rgr lon_nm_out=lon --map_fl=/global/cfs/cdirs/m3525/meng/share/remap/map_96x144_to_ne2np4_nco.c20240722.nc /global/homes/o/odiazib/grids/share/inv/oxid_1.9x2.5_L26_1850-2015_c20181106.nc oxid_ne2np4_L26_1850-2015_c20240827_temp.nc\n",
			"Tue Nov  6 10:39:59 2018: ncks -O -3 --glb E3SM_input_conversion=20181106: Format converted from netcdf4-classic to classic (netCDF3) for compatibility with PnetCDF reader. No data were changed /lcrc/group/acme/public_html/inputdata/atm/cam/chem/trop_mozart_aero/oxid/oxid_1.9x2.5_L26_1850-2015_c20171110.nc inputdata/atm/cam/chem/trop_mozart_aero/oxid/oxid_1.9x2.5_L26_1850-2015_c20181106.nc" ;
		:E3SM_input_conversion = "20181106: Format converted from netcdf4-classic to classic (netCDF3) for compatibility with PnetCDF reader. No data were changed" ;
		:NCO = "netCDF Operators version 5.1.6 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
		:remap_script = "ncremap" ;
		:remap_hostname = "login40" ;
		:remap_version = "5.1.6" ;
		:nco_openmp_thread_number = 2 ;
		:title = "Regridded version of oxid_1.9x2.5_L26_1850-2015_c20181106.nc" ;
		:map_file = "/global/cfs/cdirs/m3525/meng/share/remap/map_96x144_to_ne2np4_nco.c20240722.nc" ;
		:input_file = "/global/homes/o/odiazib/grids/share/inv/oxid_1.9x2.5_L26_1850-2015_c20181106.nc" ;
}

VERT_EMISSION
Vertical interpolation is performed by the ported 'rebin' routine: here

To select this file, the code looks for the 'altitude' dimension in the nc file; see here

This type of file corresponds to vertical emission files, where a file for each species is provided. For example:
${DIN_LOC_ROOT}/atm/scream/mam4xx/emissions/ne30pg2/elevated/cmip6_mam4_so2_elev_1x1_2010_clim_ne30pg2_c20241008.nc

ncdump -h cmip6_mam4_so2_elev_ne2np4_2010_clim_c20240726.nc

netcdf cmip6_mam4_so2_elev_ne2np4_2010_clim_c20240726 {
dimensions:
	ncol = 218 ;
	nv = 5 ;
	time = UNLIMITED ; // (12 currently)
	altitude = 13 ;
	altitude_int = 14 ;
variables:
	double lat(ncol) ;
		lat:long_name = "Latitude of Grid Cell Centers" ;
		lat:standard_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:valid_min = -90. ;
		lat:valid_max = 90. ;
		lat:bounds = "lat_vertices" ;
	double lon(ncol) ;
		lon:long_name = "Longitude of Grid Cell Centers" ;
		lon:standard_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:valid_min = 0. ;
		lon:valid_max = 360. ;
		lon:bounds = "lon_vertices" ;
	double lat_vertices(ncol, nv) ;
		lat_vertices:long_name = "Gridcell latitude vertices" ;
	double lon_vertices(ncol, nv) ;
		lon_vertices:long_name = "Gridcell longitude vertices" ;
	double area(ncol) ;
		area:long_name = "Solid angle subtended by gridcell" ;
		area:standard_name = "solid_angle" ;
		area:units = "steradian" ;
		area:coordinates = "lat lon" ;
		area:cell_mathods = "lat, lon: sum" ;
	float BB(time, ncol, altitude) ;
		BB:units = "molecules/cm3/s" ;
		BB:cell_methods = "time: mean" ;
		BB:coordinates = "lat lon" ;
		BB:cell_measures = "area: area" ;
	float ENE_ELEV(time, ncol, altitude) ;
		ENE_ELEV:units = "molecules/cm3/s" ;
		ENE_ELEV:cell_methods = "time: mean" ;
		ENE_ELEV:coordinates = "lat lon" ;
		ENE_ELEV:cell_measures = "area: area" ;
	float IND_ELEV(time, ncol, altitude) ;
		IND_ELEV:units = "molecules/cm3/s" ;
		IND_ELEV:cell_methods = "time: mean" ;
		IND_ELEV:coordinates = "lat lon" ;
		IND_ELEV:cell_measures = "area: area" ;
	double altitude(altitude) ;
		altitude:units = "km" ;
		altitude:long_name = "altitude midlevel" ;
	double altitude_int(altitude_int) ;
		altitude_int:units = "km" ;
		altitude_int:long_name = "altitude interval" ;
	float contvolc(time, ncol, altitude) ;
		contvolc:units = "molecules/cm3/s" ;
		contvolc:cell_methods = "time: mean" ;
		contvolc:coordinates = "lat lon" ;
		contvolc:cell_measures = "area: area" ;
	double date(time) ;
		date:cell_methods = "time: mean" ;

// global attributes:
		:history = "Fri Jul 26 15:37:55 2024: ncks -O -t 2 --no_tmp_fl --hdr_pad=10000 --gaa remap_script=ncremap --gaa remap_hostname=login17 --gaa remap_version=5.1.6 --rgr lat_nm_out=lat --rgr lon_nm_out=lon --map_fl=/global/cfs/cdirs/m3525/meng/share/remap/map_96x144_to_ne2np4_nco.c20240722.nc /global/cfs/cdirs/e3sm/inputdata/atm/cam/chem/trop_mozart_aero/emis/DECK_ne30//cmip6_mam4_so2_elev_1x1_2010_clim_c20190821.nc cmip6_mam4_so2_elev_1x1_2010_clim_c20190821.nc_output.nc\n",
			"Wed Aug 21 12:35:57 2019: ncatted -O -a E3SM_1,global,a,c,2005-2014 monthly climatology derived by Qi Tang ([email protected]) for E3SM 2010 F-compsets. tmp3.nc\n",
			"Wed Aug 21 12:35:56 2019: ncks -O -3 --glb E3SM_input_conversion=Format converted from netCDF4 to classic (netCDF3) for compatibility with PnetCDF reader. No data were changed. tmp2.nc tmp3.nc\n",
			"Wed Aug 21 12:35:55 2019: ncap2 -O -s date=date+5000 monthly_climo.nc tmp2.nc\n",
			"Wed Aug 21 12:35:47 2019: ncra -F -d time,01,,12 tmp.nc temp01.nc\n",
			"Wed Aug 21 12:35:35 2019: ncks -O -d time,1872,1991 cmip6_mam4_so2_elev_1850-2014_c170525.nc tmp.nc\n",
			"Created on 05/25/2017 at Pacific Northwest National Lab. Contact: [email protected]" ;
		:NCO = "netCDF Operators version 5.1.6 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco)" ;
		:E3SM_input_conversion = "Format converted from netCDF4 to classic (netCDF3) for compatibility with PnetCDF reader. No data were changed." ;
		:E3SM_1 = "2005-2014 monthly climatology derived by Qi Tang ([email protected]) for E3SM 2010 F-compsets." ;
		:remap_script = "ncremap" ;
		:remap_hostname = "login17" ;
		:remap_version = "5.1.6" ;
		:nco_openmp_thread_number = 2 ;
		:title = "Regridded version of cmip6_mam4_so2_elev_1x1_2010_clim_c20190821.nc" ;
		:map_file = "/global/cfs/cdirs/m3525/meng/share/remap/map_96x144_to_ne2np4_nco.c20240722.nc" ;
		:input_file = "/global/cfs/cdirs/e3sm/inputdata/atm/cam/chem/trop_mozart_aero/emis/DECK_ne30//cmip6_mam4_so2_elev_1x1_2010_clim_c20190821.nc" ;
}

ZONAL
Vertical interpolation is performed using the ported vert_interp routine, available here.

The key difference between the FORMULA_PS and ZONAL types in our implementation is that the former does not utilize the PS variable to compute source pressure; instead, the source pressure is provided directly in the nc file.

I have named this file type 'zonal,' reflecting its usage in the EAM code as 'zonal_ave'. see here

This file is located here:
${DIN_LOC_ROOT}/atm/scream/mam4xx/linoz/ne4pg2/linoz1850-2015_2010JPL_CMIP6_10deg_58km_ne4pg2_c20240724.nc

ncdump -h linoz1850-2015_2010JPL_CMIP6_10deg_58km_ne2np4_c20240724.nc

netcdf linoz1850-2015_2010JPL_CMIP6_10deg_58km_ne2np4_c20240724 {
dimensions:
	ncol = 218 ;
	nv = 5 ;
	time = 420 ;
	lev = 27 ;
	ilev = 28 ;
variables:
	double lat(ncol) ;
		lat:long_name = "Latitude of Grid Cell Centers" ;
		lat:standard_name = "latitude" ;
		lat:units = "degrees_north" ;
		lat:axis = "Y" ;
		lat:valid_min = -90. ;
		lat:valid_max = 90. ;
		lat:bounds = "lat_vertices" ;
	double lon(ncol) ;
		lon:long_name = "Longitude of Grid Cell Centers" ;
		lon:standard_name = "longitude" ;
		lon:units = "degrees_east" ;
		lon:axis = "X" ;
		lon:valid_min = 0. ;
		lon:valid_max = 360. ;
		lon:bounds = "lon_vertices" ;
	double lat_vertices(ncol, nv) ;
		lat_vertices:long_name = "Gridcell latitude vertices" ;
	double lon_vertices(ncol, nv) ;
		lon_vertices:long_name = "Gridcell longitude vertices" ;
	double area(ncol) ;
		area:long_name = "Solid angle subtended by gridcell" ;
		area:standard_name = "solid_angle" ;
		area:units = "steradian" ;
		area:coordinates = "lat lon" ;
		area:cell_mathods = "lat, lon: sum" ;
	float PmL_clim(time, ncol, lev) ;
		PmL_clim:long_name = "P minus L (climatology)" ;
		PmL_clim:units = "vmr/s" ;
		PmL_clim:_FillValue = 1.e+36f ;
		PmL_clim:coordinates = "lat lon" ;
		PmL_clim:cell_measures = "area: area" ;
	float cariolle_pscs(time, ncol, lev) ;
		cariolle_pscs:long_name = "Cariolle parameter for PSC loss of ozone" ;
		cariolle_pscs:units = "per second" ;
		cariolle_pscs:_FillValue = 1.e+36f ;
		cariolle_pscs:coordinates = "lat lon" ;
		cariolle_pscs:cell_measures = "area: area" ;
	float dPmL_dO3(time, ncol, lev) ;
		dPmL_dO3:long_name = "Sensitivity of P minus L to O3" ;
		dPmL_dO3:units = "per second" ;
		dPmL_dO3:_FillValue = 1.e+36f ;
		dPmL_dO3:coordinates = "lat lon" ;
		dPmL_dO3:cell_measures = "area: area" ;
	float dPmL_dO3col(time, ncol, lev) ;
		dPmL_dO3col:long_name = "Sensitivity of P minus L to overhead O3 column" ;
		dPmL_dO3col:units = "vmr/DU" ;
		dPmL_dO3col:_FillValue = 1.e+36f ;
		dPmL_dO3col:coordinates = "lat lon" ;
		dPmL_dO3col:cell_measures = "area: area" ;
	float dPmL_dT(time, ncol, lev) ;
		dPmL_dT:long_name = "Sensitivity of P minus L to T" ;
		dPmL_dT:units = "vmr/K" ;
		dPmL_dT:_FillValue = 1.e+36f ;
		dPmL_dT:coordinates = "lat lon" ;
		dPmL_dT:cell_measures = "area: area" ;
	int date(time) ;
		date:long_name = "YYYYMMDDD" ;
		date:units = "date" ;
	int datesec(time) ;
		datesec:long_name = "seconds of day" ;
		datesec:units = "seconds" ;
	double ilev(ilev) ;
		ilev:long_name = "hybrid level pressure at interfaces (1000*(A+B))" ;
		ilev:units = "mbar" ;
		ilev:_FillValue = 1.e+36f ;
	double lev(lev) ;
		lev:long_name = "hybrid level pressure at midpoints (1000*(A+B))" ;
		lev:units = "mbar" ;
		lev:_FillValue = 1.e+36f ;
	float o3_clim(time, ncol, lev) ;
		o3_clim:long_name = "ozone (climatology)" ;
		o3_clim:units = "volume mixing ratio" ;
		o3_clim:_FillValue = 1.e+36f ;
		o3_clim:coordinates = "lat lon" ;
		o3_clim:cell_measures = "area: area" ;
	float o3col_clim(time, ncol, lev) ;
		o3col_clim:long_name = "Column O3 above box (climatology)" ;
		o3col_clim:units = "Dobson Units" ;
		o3col_clim:_FillValue = 1.e+36f ;
		o3col_clim:coordinates = "lat lon" ;
		o3col_clim:cell_measures = "area: area" ;
	float t_clim(time, ncol, lev) ;
		t_clim:long_name = "temperature (climatology)" ;
		t_clim:units = "K" ;
		t_clim:_FillValue = 1.e+36f ;
		t_clim:coordinates = "lat lon" ;
		t_clim:cell_measures = "area: area" ;
	float time(time) ;
		time:long_name = "days since 0000-01-01" ;
		time:units = "days" ;
		time:comment = "Assumes 365 days/year, 30 days/month (should be good enough for general plotting)" ;
		time:_FillValue = 1.e+36f ;

// global attributes:
		:history = "Wed Jul 24 15:55:58 2024: ncks -O -t 2 --no_tmp_fl --hdr_pad=10000 --gaa remap_script=ncremap --gaa remap_hostname=login22 --gaa remap_version=5.2.4 --rgr lat_nm_out=lat --rgr lon_nm_out=lon --map_fl=/pscratch/sd/m/meng/climo/maps/map_18x36_to_ne2np4_c20240724.nc /global/homes/m/meng/m3525/share/remap/tmp3.nc linoz1850-2015_2010JPL_CMIP6_10deg_58km_ne2np4_c20240724.nc\nWed Jul 24 15:55:41 2024: ncatted -a _FillValue,,m,f,1.0e36 tmp2.nc tmp3.nc" ;
		:NCO = "netCDF Operators version 5.2.4 (Homepage = http://nco.sf.net, Code = http://github.com/nco/nco, Citation = 10.1016/j.envsoft.2008.03.004)" ;
		:remap_script = "ncremap" ;
		:remap_hostname = "login22" ;
		:remap_version = "5.2.4" ;
		:nco_openmp_thread_number = 2 ;
		:title = "Regridded version of tmp3.nc" ;
		:map_file = "/pscratch/sd/m/meng/climo/maps/map_18x36_to_ne2np4_c20240724.nc" ;
		:input_file = "/global/homes/m/meng/m3525/share/remap/tmp3.nc" ;
}

@singhbalwinder
Copy link
Contributor

Tagging @susburrows

Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be reworked.

  • Why don't you consider fixing the files so that they have the correct dimensions needed?
  • Did you check if the units align? I don't think LEV and altitude would have the same units ... but I could be wrong
  • If you want to impl this in code, I think you'd want to potentially pass some sort of a (old_name, new_name) map/dict for these renaming ops instead of overloading the functions
  • If you want to impl this in code, in the added logic, I don't think you should differentiate between remapping or not (you are not seeing its effect because the no-remapping branch is no-op, but fundamentally the setup is wrong and so the code should be applied to all)
  • it seems this had another bug, which makes me think, you should be adding a simple test that does the remapping (say from ne2 to ne4 or something)

@odiazib
Copy link
Contributor Author

odiazib commented Nov 5, 2024

I think this should be reworked.

  • Why don't you consider fixing the files so that they have the correct dimensions needed?

Because we are relying on the vertical interpolation from EAM that takes altitude instead of levels.

See here, which is invoked here.

  • Did you check if the units align? I don't think LEV and altitude would have the same units ... but I could be wrong

Because this part relies on horizontal interpolation, neither levels nor altitude should be involved in the computation. Note, the third dimension (for columns, I will use two dimensions even though in the code we only have one) is involved, but I do not think units will affect this step.

We followed the SPA code, where horizontal interpolation is done first. Then, vertical interpolation is computed on the resulting finer grid.

  • If you want to impl this in code, I think you'd want to potentially pass some sort of a (old_name, new_name) map/dict for these renaming ops instead of overloading the functions

I use overloading because we only use this part of the code for vertical emissions. Linoz and oxidant reads do not need to be involved. I am open to changing this code. Could you please provide more details on what you are proposing?

  • If you want to impl this in code, in the added logic, I don't think you should differentiate between remapping or not (you are not seeing its effect because the no-remapping branch is no-op, but fundamentally the setup is wrong and so the code should be applied to all)

I've created two code blocks for remapping and no remapping to avoid cloning the source grid in the no remapping case, where renaming the LEV tag is necessary.
For horizontal interpolation, we create a horizontal remapper object here.

This remapper object requires a target grid and a map file as input. In cases where no map files are needed, the target and source grid are the same object. We rename the LEV tag for the target grid here.

For this reason, you do not see the renaming of the LEV tag in the source grid.
The source grid, when provided with a map file, is created by RefiningRemapperP2P. I did not want to modify this code to address the issue of different dimension names. Thus, the simplest solution was to create a clone of the source grid and rename LEV.
I believe RefiningRemapperP2P should not depend on the names of dimensions for three-dimension. However, @bartgol will know better how to generalize this part of the code.

  • it seems this had another bug, which makes me think, you should be adding a simple test that does the remapping (say from ne2 to ne4 or something)

Yes, we can add another test. @singhbalwinder , could you please tell me where I can find a map file for the horizontal interpolation from ne2 to ne4?

@mahf708
Copy link
Contributor

mahf708 commented Nov 5, 2024

I think this should be reworked.

  • Why don't you consider fixing the files so that they have the correct dimensions needed?

Because we are relying on the vertical interpolation from EAM that takes altitude instead of levels.

See here, which is invoked here.

I meant: if I understand your code, all your code is doing is renaming the dims. Why not rename the dims in the input files? Then, most of this PR won't be needed.

@odiazib
Copy link
Contributor Author

odiazib commented Nov 6, 2024

I think this should be reworked.

  • Why don't you consider fixing the files so that they have the correct dimensions needed?

Because we are relying on the vertical interpolation from EAM that takes altitude instead of levels.
See here, which is invoked here.

I meant: if I understand your code, all your code is doing is renaming the dims. Why not rename the dims in the input files? Then, most of this PR won't be needed.

Because altitude and lev represent different quantities, as you mentioned, their units differ: altitude is measured in km, while lev is in 'hPa'. Thus, using lev with units of km would appear strange.

mahf708
mahf708 previously requested changes Nov 6, 2024
Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will defer to Luca

I think this PR requires two things before completion:

  • you should add a test for this, given that the initial impl was wrong and buggy; a test will catch stuff
  • i would simplify the code to make it easier to maintain in the long run, i don't like the unnecessary overloading (but that's likely a personal opinion) and even if you chose to keep overloading, I think you have to do it more logically than by branching arbitrarily due to the presence of a map or not

Comment on lines 461 to 476
const auto io_grid = horiz_remapper->get_src_grid();

// NOTE: If we are using a vertical emission NC file with altitude instead of levels,
// we must rename this tag. This is only necessary when a map file is used.
if(tracer_data.file_type == VERT_EMISSION && extfrc_map_file != ""){
auto horiz_interp_src_grid =
io_grid->clone("tracer_horiz_interp_src_grid", true);
horiz_interp_src_grid->reset_field_tag_name(LEV, "altitude");
horiz_interp_src_grid->reset_field_tag_name(ILEV, "altitude_int");
return std::make_shared<AtmosphereInput>(tracer_data_file, horiz_interp_src_grid, io_fields,
true);
} else {
return std::make_shared<AtmosphereInput>(tracer_data_file, io_grid, io_fields,
true);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i still don't think you should overload, see my comments previously --- just add this stuff to the other function with some extra args

if you still want to, consider this change

Suggested change
const auto io_grid = horiz_remapper->get_src_grid();
// NOTE: If we are using a vertical emission NC file with altitude instead of levels,
// we must rename this tag. This is only necessary when a map file is used.
if(tracer_data.file_type == VERT_EMISSION && extfrc_map_file != ""){
auto horiz_interp_src_grid =
io_grid->clone("tracer_horiz_interp_src_grid", true);
horiz_interp_src_grid->reset_field_tag_name(LEV, "altitude");
horiz_interp_src_grid->reset_field_tag_name(ILEV, "altitude_int");
return std::make_shared<AtmosphereInput>(tracer_data_file, horiz_interp_src_grid, io_fields,
true);
} else {
return std::make_shared<AtmosphereInput>(tracer_data_file, io_grid, io_fields,
true);
}
auto io_grid = horiz_remapper->get_src_grid();
// NOTE: we must rename this tag to be consistent
io_grid->reset_field_tag_name(LEV, "altitude");
io_grid->reset_field_tag_name(ILEV, "altitude_int");
return std::make_shared<AtmosphereInput>(tracer_data_file, io_grid, io_fields,
true);

Even better, if you could just add this stuff in the first call as a map

inline std::shared_ptr<AtmosphereInput> create_tracer_data_reader(
    const std::shared_ptr<AbstractRemapper> &horiz_remapper,
    const std::string &tracer_data_file
    std::map<FieldTag, std::string>* some_map = nullptr) {
  std::vector<Field> io_fields;
  for(int i = 0; i < horiz_remapper->get_num_fields(); ++i) {
    io_fields.push_back(horiz_remapper->get_src_field(i));
  }
  auto io_grid = horiz_remapper->get_src_grid();
  if (some_map) {
    for (const auto& [key, value] : *some_map) {
        io_grid->reset_field_tag_name(key, value);
    }
  }
  return std::make_shared<AtmosphereInput>(tracer_data_file, io_grid, io_fields,
                                           true);
}  // create_tracer_data_reader

I am assuming this stuff isn't happening on device, and you have to double check the code

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He can't change the remapper src grid, since it's const. He'd have to (shallow) clone it first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, forgot that ... in the above:

-  auto io_grid = horiz_remapper->get_src_grid();
+  auto io_grid = horiz_remapper->get_src_grid().clone(...);

Copy link
Contributor Author

@odiazib odiazib Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I committed a cleaner version of this part of the code. I will clone only the source grid if the file type corresponds to a file with altitude, i.e., the VERT_EMISSION file type. I did not use a map because it would require additional changes in the interface code (I need to create a map). I deleted the overload of create_tracer_data_reader by passing the file type as input and setting the file type default to NONE.

@@ -306,7 +308,7 @@ void MAMMicrophysics::set_grids(
// I am assuming the order of species in extfrc_lst_.
// Indexing in mam4xx is fortran.
forcings_[i].frc_ndx = i + 1;
const auto io_grid_emis = VertEmissionsHorizInterp_[i]->get_src_grid();
const auto io_grid_emis = VertEmissionsHorizInterp_[i]->get_tgt_grid();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you should add a test

Copy link
Contributor Author

@odiazib odiazib Nov 7, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To create a test, I will need to have a map file. Is it possible to include this test in a future PR? Note that the evaluation team is running additional cases that allow us to catch issues.

bartgol
bartgol previously approved these changes Nov 6, 2024
Copy link
Contributor

@bartgol bartgol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is fine. In the devs/IO group we are working on unifying the way we read data from file, so I would not spend too much time in polishing how your read data. So long as it works, and the details/gotchas are clear (e.g., inline comments), then I'm fine.

Comment on lines 461 to 476
const auto io_grid = horiz_remapper->get_src_grid();

// NOTE: If we are using a vertical emission NC file with altitude instead of levels,
// we must rename this tag. This is only necessary when a map file is used.
if(tracer_data.file_type == VERT_EMISSION && extfrc_map_file != ""){
auto horiz_interp_src_grid =
io_grid->clone("tracer_horiz_interp_src_grid", true);
horiz_interp_src_grid->reset_field_tag_name(LEV, "altitude");
horiz_interp_src_grid->reset_field_tag_name(ILEV, "altitude_int");
return std::make_shared<AtmosphereInput>(tracer_data_file, horiz_interp_src_grid, io_fields,
true);
} else {
return std::make_shared<AtmosphereInput>(tracer_data_file, io_grid, io_fields,
true);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

He can't change the remapper src grid, since it's const. He'd have to (shallow) clone it first.

@odiazib
Copy link
Contributor Author

odiazib commented Nov 7, 2024

@bartgol @mahf708 Thank you for reviewing this PR and for your valuable recommendations. May we proceed with merging this PR?

bartgol
bartgol previously approved these changes Nov 7, 2024
Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am changing my review to "Comment" so that you can feel free to override and merge. However, I am not a fan of this PR, because I think it's covering sloppy code by more sloppy code. That's not a sustainable strategy. You asked me to review it, and I am giving you my reasons why I don't like it. I think my reasons are well-intentioned and logical; and I am not convinced my concerns have been addressed.

Maybe there's miscommunication, so here's another way to request simple clarifications related to my review:

  • Could you please document in this PR the differences between the different file types in the TracerFileType enum? This should be done with a call like ncdump -h <file> where <file> will be one example of each type (FORMULA_PS, ZONAL, VERT_EMISSION, NONE). It's not clear to me why you're branching the logic, but maybe I am missing something in the file types. The ncdump details will clarify that.
  • I still don't understand why you're resisting adding a test. I think you should. You have three options (from least desirable to more desirable):
    • Document clearly in this PR a way to reproduce the failure before the PR, and how the fixes in this PR actually address the failure. Ideally, this should be something like a script from your testing.
    • Add a testmod where the ne120 case (in the title of the PR) can be run with a simple cime call. The full test name could then be something like SMS_D_Ln5.ne120pg2_ne120pg2.F2010-SCREAMv1.$machine_$compiler.scream-mam4xx-micro-remap. This test won't be added to tests we run regularly.
    • Add a testmod where a simpler ne2pg2/ne4pg2 can be run. This could be something like: SMS_D_Ln5.ne4pg2_oQU480.F2010-SCREAMv1-MPASSI.chrysalis_intel.scream-mam4xx-micro-remap. An easy approach is to simply find a remap file that can remap ne30pg2 files to ne4pg2 and then you have everything ready. I don't think we allow coarsening for these, so the test will require some more work, but it's not difficult.

@singhbalwinder
Copy link
Contributor

Thanks, Naser, for re-explaining everything in detail. We will address your concerns before merging it. Here is how we are going to address all your concerns, please let us know if you need anything else.

Could you please document in this PR the differences between the different file types in the TracerFileType enum? This should be done with a call like ncdump -h where will be one example of each type (FORMULA_PS, ZONAL, VERT_EMISSION, NONE). It's not clear to me why you're branching the logic, but maybe I am missing something in the file types. The ncdump details will clarify that.

  • We will include the ncdumps in the PR description to highlight the tags that are different.

You have three options (from least desirable to more desirable):

  • We will pick the most desirable option.

Add a testmod where a simpler ne2pg2/ne4pg2 can be run.

  • We can add a short 2-time step ne30pg2 test and provide it with a ne4pg2 file and a ne4pg2->ne30pg2 remapping file. The test will remap ne4pg2 file to ne30pg2 grid during the simulation. The test will be part of the nightly testing to avoid overloading AT. Would that be okay?

@mahf708 mahf708 dismissed their stale review November 8, 2024 02:27

so that authors can override request for edits if desired

Copy link
Contributor

mergify bot commented Nov 9, 2024

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟠 Enforce checks passing

Waiting checks: cpu-gcc / .*, gcc-cuda / .*, gcc-openmp / .*.

Make sure that checks are not failing on the PR, and reviewers approved

  • any of:
    • check-skipped="gcc-openmp / .*"
    • check-success="gcc-openmp / .*"
  • any of:
    • check-skipped="gcc-cuda / .*"
    • check-success="gcc-cuda / .*"
  • any of:
    • check-skipped="cpu-gcc / .*"
    • check-success="cpu-gcc / .*"
  • #approved-reviews-by >= 1
  • #changes-requested-reviews-by == 0
  • any of:
    • check-skipped=cpu-gcc
    • check-success=cpu-gcc

mahf708
mahf708 previously approved these changes Nov 9, 2024
Copy link
Contributor

@mahf708 mahf708 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for providing the header info from the files.

I wrote a lengthy comment about the confusion. You're encouraged to edit the misleading comments and names.

I also think you should add a test of your choosing so that when we rewrite all this IO stuff we won't break your setup.

I won't have time to re-review this PR again, so feel free to do what you want with my feedback. I am approving and there's no need for my review/approval again.

true);
} else{
// We do not need to rename tags in or clone io_grid for other types of files.
Copy link
Contributor

@mahf708 mahf708 Nov 9, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is partly the confusion and that's why I was unhappy about this PR. Your comments here don't add up to what's going on. What you should actually say is the following: "we must not rename the tags". If you rename the tags here for the zonal or formula_ps files, you will end up with errors. (You will be to rename the tags for the surface/none type though.) So, your branching logic is justified, but that wasn't clear to me (because the notes here made it sound it was just a preference; if it were just a preference, it'd simply be bad code imo --- we should only branch if we have to)

Based on the info you provided in the main post, I believe the following is the best way to describe this messy situation. Let's take your enum for illustration:

enum TracerFileType {
- // file with PS ncol, lev, and time
+ // file with ncol, lev, ilev, time and has P0 and PS as variables
+ // example: oxidants
  FORMULA_PS,
- // nc zonal file from ncremap
+ // file with ncol, lev, ilev, time
+ // example: linoz
  ZONAL,
- // vertical emission files
+ // file with ncol, altitude, altitude_int, time
+ // example: elevated (at a height) emissions of aerosols and precursors
+ // NOTE: we must rename the default vertical tags when horiz remapping
+ // NOTE: we vert remap in a different routine in mam4xx
  VERT_EMISSION,
- //
+ // file with ncol, time
+ // example: surface emissions of aerosols and precursors
  NONE
};

Note I referred to the "VERT_EMISSION" files as they are actually named "elevated" (that's in the name of the files). I think it would have been good to keep that terminology from EAM. I believe the "NONE" type includes the surface emission. In the messy namelist world of EAM, we roughly call these "ext_frc_" (VERT_) and "srf_emis_" (NONE). So, I think you should actually change the "NONE" type to something more meaningful. I would encourage you to check with people like Balwinder, Kai, and Mingxuan.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mingxuanwupnnl explained very well why to use the term "external forcings" for emissions (as they are applied like forcing in the equations). I think NONE here means a tracer file with no vertical dimension (e.g., surface emissions). We can rename NONE to SURFACE, but NONE should also be fine as it would mean a tracer file with a NONE vertical extent.

Copy link
Contributor Author

@odiazib odiazib Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can rename the enums. Note that we are not using the TracerFileType enum for surface emissions. The reader for surface emissions was implemented by @singhbalwinder. We worked in parallel on this, so we ended up with duplicate code. Because @bartgol is working on a general reader, it is better to just keep NONE, and I do not think we should add a reference for surface emissions. I will update the comments in this part of the code, as recommended by @mahf708.

@singhbalwinder, is this change okay: renaming VERT_EMISSION to ELEVATED_EMISSION? Or do we want to keep it as EXT_FRC?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's change it to ELEVATED_EMISSIONS. I would like to keep a reference to "emissions" to be clear that these are actually emissions.

@singhbalwinder singhbalwinder added EAGLES Work related to the EAGLES aerosol project MAM labels Nov 9, 2024
@singhbalwinder singhbalwinder dismissed stale reviews from mahf708 and bartgol via c07bc72 November 10, 2024 00:41
@singhbalwinder
Copy link
Contributor

I have added a test mod that must be run with an NE30 grid. This test forces the NE30 grid to use NE4 emission files. A remapping file is provided that remaps NE4 emission files to NE30.

We should probably revisit it later. I do not like restricting test mods to model resolution; they should be independent.

@odiazib odiazib requested a review from mjs271 November 11, 2024 17:16
mjs271
mjs271 previously approved these changes Nov 11, 2024
@odiazib
Copy link
Contributor Author

odiazib commented Nov 11, 2024

Thanks for providing the header info from the files.

Thank you for your detailed review and comments.

I also think you should add a test of your choosing so that when we rewrite all this IO stuff we won't break your setup.

The single process test for microphysics and its baseline should catch DIFFs for the tracer reader. These tests were added in PR #3013

@odiazib odiazib requested a review from mjs271 November 11, 2024 19:25
@bartgol bartgol merged commit 4afd0af into E3SM-Project:master Nov 12, 2024
18 of 19 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
EAGLES Work related to the EAGLES aerosol project MAM
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants