Merge pull request #27 from EcoExtreML/refactor_csv_to_nc
Refactor scripts of converting csv to nc files
SarahAlidoost authored Apr 12, 2022
2 parents 73a239d + 0dc03f5 commit c539978
Showing 9 changed files with 394 additions and 238 deletions.
11 changes: 7 additions & 4 deletions README.md
@@ -186,8 +186,10 @@ Dutch National supercomputer hosted at SURF.
half-hour time step i.e. `365*24*2=17520`.

To edit the config file, open the file with a text editor and change the
paths. The variable name e.g. `SoilPropertyPath` should not be changed.
Also, note a `/` is required at the end of each line.
paths. `InputPath` and `OutputPath` are user-defined directories; make
sure they exist and that you have the right permissions. Variable names, e.g.
`SoilPropertyPath`, should not be changed. Also, note that a `/` is required at
the end of each line.

As explained above, the "InputPath" directory of the model is considered as
the working/running directory and should include some data required by the
@@ -296,8 +298,9 @@ See the [exe readme](./exe/README.md).

## Preparing the outputs of the model in NetCDF:

There are some files in `utils` directory in this repository. The utils are used
to read `.csv` files and save them in `.nc` format.
There are some files in the `utils` directory of this repository. The utils are
used to read `.csv` files and save them in `.nc` format. See the [utils
readme](./utils/csv_to_nc/README.md).

> An example NetCDF file is stored in the project directory to show the desired
structure of variables in one file.
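
The config-file rules described in the first README hunk (path values end with `/`, `InputPath` and `OutputPath` must exist) could be checked before a run. The following is only a minimal sketch: the `Key=Value` line format, the helper names `parse_config` and `check_paths`, and the example paths are assumptions, since the config file itself is not shown in this commit.

```python
from pathlib import Path


def parse_config(text):
    """Parse `Key=Value` lines into a dict (this line format is an assumption)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split("=", 1)
        config[key.strip()] = value.strip()
    return config


def check_paths(config, must_exist=("InputPath", "OutputPath")):
    """Return a list of human-readable problems found in the config."""
    problems = []
    # Every key ending in "Path" is treated as a directory path.
    for key, value in config.items():
        if key.endswith("Path") and not value.endswith("/"):
            problems.append(f"{key}: missing trailing '/'")
    # The user-defined directories must already exist.
    for key in must_exist:
        if key in config and not Path(config[key]).is_dir():
            problems.append(f"{key}: directory does not exist")
    return problems


# Hypothetical config content, for illustration only.
example = "SoilPropertyPath=/data/soil\nInputPath=/nonexistent_dir_for_demo/\n"
problems = check_paths(parse_config(example))
print(problems)
```

A permission check (for example with `os.access`) could be added in the same loop if needed.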
13 changes: 12 additions & 1 deletion run_stemmus_scope_snellius.sh
@@ -22,6 +22,10 @@ module load parallel/20210622-GCCcore-10.3.0
### 2. To transfer environment vars, functions, ...
source `which env_parallel.bash`

### python environment stemmus is needed to convert csv files to nc files
### see utils/csv_to_nc/README.md
. ~/mamba/bin/activate stemmus

### 3. Create a function to loop over
loop_func() {

@@ -78,7 +82,14 @@ loop_func() {
run_time=$(expr $end_time - $start_time)

## 3.9 Add some information to the slurm*.out file; it will be used later.
echo "Run is COMPLETED. Model run time is $run_time s." >> $std_out
completed="COMPLETED"
echo "Run is $completed. Model run time is $run_time s." >> $std_out

## 3.10 Convert csv files to a nc file, if run is completed
if [[ -v completed ]];
then
python utils/csv_to_nc/generate_netcdf_files.py --config_file $station_config --variable_file utils/csv_to_nc/Variables_will_be_in_NetCDF_file.csv
fi
}

### 4. Create a log file for GNU parallel
56 changes: 56 additions & 0 deletions utils/csv_to_nc/README.md
@@ -0,0 +1,56 @@
# Converting `.csv` files to NetCDF files

Currently, the model outputs are several files in `csv` format. The model output
should be converted to one NetCDF file according to the Plumber protocol. To do so,
there is a file
[Variables_will_be_in_NetCDF_file.csv](./Variables_will_be_in_NetCDF_file.csv).
The file lists the variables that should be in the NetCDF file. Also, there is a
Python script [csv_to_nc.py](./csv_to_nc.py) that contains the main functions. Below
we explain how to use the Python scripts.

## 1. Create Conda environment

> We need to do this step only once.

We download and install Conda:

```sh
wget https://github.com/conda-forge/miniforge/releases/latest/download/Mambaforge-pypy3-Linux-x86_64.sh
bash Mambaforge-pypy3-Linux-x86_64.sh -b -p ~/mamba
```

Then, update base environment:

```sh
. ~/mamba/bin/activate
mamba update --name base mamba
```

Finally, we create a new Conda environment called `stemmus` with all required dependencies:

```sh
cd STEMMUS_SCOPE/utils/csv_to_nc
mamba env create
```

## 2. Activate Conda environment

> We need to do this step before running our Python scripts.

The environment can be activated with

```sh
. ~/mamba/bin/activate stemmus
```

## 3. Run python script

Open the configuration file [config_file_crib.txt](../../config_file_crib.txt)
or [config_file_snellius.txt](../../config_file_snellius.txt) and edit the paths. Then run:

```sh
cd STEMMUS_SCOPE
python utils/csv_to_nc/generate_netcdf_files.py --config_file config_file_crib.txt --variable_file utils/csv_to_nc/Variables_will_be_in_NetCDF_file.csv
```

This will generate `ECdata.csv` and a NetCDF file of the model output.
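
The command above passes two options, `--config_file` and `--variable_file`. The source of `generate_netcdf_files.py` is not shown here, so the following is only a sketch of how such a command-line interface might be declared with `argparse`; the option names come from the command above, while the function name `build_parser`, the help texts, and making both options required are assumptions.

```python
import argparse


def build_parser():
    """Build a parser for the two options used in the command above (sketch)."""
    parser = argparse.ArgumentParser(
        description="Convert model csv outputs to a NetCDF file (sketch)."
    )
    # Assumption: both options are mandatory.
    parser.add_argument(
        "--config_file", required=True,
        help="Path to the model config file, e.g. config_file_crib.txt",
    )
    parser.add_argument(
        "--variable_file", required=True,
        help="Path to the csv file listing the variables for the NetCDF file",
    )
    return parser


# Parse the same arguments as in the shell example.
args = build_parser().parse_args([
    "--config_file", "config_file_crib.txt",
    "--variable_file", "utils/csv_to_nc/Variables_will_be_in_NetCDF_file.csv",
])
print(args.config_file, args.variable_file)
```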
53 changes: 27 additions & 26 deletions utils/csv_to_nc/Variables_will_be_in_NetCDF_file.csv
@@ -1,26 +1,27 @@
,pri_cmip,short_name_alma,short_name_cmip,standard_name,long_name,definition,unit,direction,dimension,grp_alma,grp_cmip,subgrid,Available in STEMMUS-SCOPE,File name,Variable name in STEMMUS-SCOPE,,
1,1,SWnet,rss,surface_net_downward_shortwave_flux,Net shortwave radiation,"Incoming solar radiation less the simulated outgoing shortwave radiation, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Netshort,,
1,1,LWnet,rls,surface_net_downward_longwave_flux,Net longwave radiation,"Incident longwave radiation less the simulated outgoing longwave radiation, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Netlong,,
,2,SWdown,rsds,surface_downwelling_shortwave_flux_in_air,Downward short-wave radiation,,W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Rin,,
,2,LWdown,rlds,surface_downwelling_longwave_flux_in_air,Downward long-wave radiation,,W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Rli,,
,2,SWup,rsus,surface_upwelling_shortwave_flux_in_air,Upward short-wave radiation,,W/m2,Upward,XYT,,LEday,,Yes,radiation.csv,HemisOutShort,,
2,2,LWup,rlus,surface_upwelling_longwave_flux_in_air,Upward long-wave radiation,This upward longwave flux is to be compared to an ISCCP derived product.,W/m2,Upward,XYT,,LEday,,Yes,radiation.csv,HemisOutLong,,
1,1,Qle,hfls,surface_upward_latent_heat_flux,Latent heat flux,"Energy of evaporation, averaged over a grid cell",W/m2,Upward,XYT,,LEday,,Yes,fluxes.csv,lEtot,,
1,1,Qh,hfss,surface_upward_sensible_heat_flux,Sensible heat flux,"Sensible energy, averaged over a grid cell",W/m2,Upward,XYT,,LEday,,Yes,fluxes.csv,Htot,,
1,1,Qg,hfds,surface_downward_heat_flux,Ground heat flux,"Heat flux into the ground, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,fluxes.csv,Gtot,,
1,2,VegT,tcs,surface_canopy_skin_temperature,Vegetation Canopy Temperature,"Vegetation temperature, averaged over all vegetation types",K,-,XYT,,LEday,veg.,Yes,surftemp.csv,Tcave,,
1,2,BaresoilT,tgs,surface_ground_skin_temperature,Temperature of bare soil,Surface bare soil temperature,K,-,XYT,,LEday,baresoil,Yes,surftemp.csv,Tsave,,
2,1,SoilTemp,tsl,soil_temperature,Average layer soil temperature,Average soil temperature in each user-defined soil layer (3D variable),K,-,XYZT,,LEday,,Yes,Sim_Temp.csv,,"If soil layer thicknesses vary from one location to another, interpolate to a standard set of depths. Ideally, the interpolation should preserve the vertical integral.",
1,1,SoilMoist,mrlsl,moisture_content_of_soil_layer,Average layer soil moisture,"Soil water content in each user-defined soil layer (3D variable). Includes the liquid, vapor and solid phases of water in the soil.",kg/m2,-,XYZT,,LWday,,Yes,Sim_Theta.csv,,,
,2,AResist_rac,ares,aerodynamic_resistance,Aerodynamic resistance,,s/m,-,XYT,,LWday,,Yes,aerodyn.csv,rac,,
,2,AResist_ras,ares,aerodynamic_resistance,Aerodynamic resistance,,s/m,-,XYT,,LWday,,Yes,aerodyn.csv,ras,,
,1,RH,hur,relative_humidity,Relative humidity,,%,-,XYT,,LWday,,Yes,ECdata.csv,RH,,
1,1,GPP,gpp,gross_primary_productivity_of_carbon,Gross Primary Production,Carbon Mass Flux out of Atmosphere due to Gross Primary Production on Land,Kg/m2/s,Downward,XYT,,LCmon,,Yes,fluxes.csv,Actot,,
1,1,SWdown_ec,rsds,surface_downwelling_shortwave_flux_in_air,Downward short-wave radiation,,W/m2,Downward,XYT,,L3hr,,,ECdata.csv,Rin,,
1,1,LWdown_ec,rlds,surface_downwelling_longwave_flux_in_air,Downward long-wave radiation,,W/m2,Downward,XYT,,L3hr,,,ECdata.csv,Rli,,
1,1,Qair,huss,specific_humidity,Near surface specific humidity,,kg/kg,-,XYT,,L3hr,,,ECdata.csv,Qair,,
1,1,Tair,ta,air_temperature,Near surface air temperature,,K,-,XYT,,L3hr,,,ECdata.csv,Ta,,
1,1,Psurf,ps,surface_air_pressure,Surface Pressure,,Pa,-,XYT,,L3hr,,,ECdata.csv,p,,
2,1,Wind,ws,wind_speed,Near surface wind speed,,m/s,-,XYT,,L3hr,,,ECdata.csv,u,,
,,Precip,pr,precipitation_flux,Precipitation rate,,kg/m2/s,Downward,XYT,,L3hr,,,ECdata.csv,Pre,,
,,CO2air,co2c,mole_fraction_of_carbon_dioxide_in_air,Near surface CO2 concentration,,-,-,XYT,,L3hr,,,ECdata.csv,CO2air,,
,pri_cmip,short_name_alma,short_name_cmip,standard_name,long_name,definition,unit,direction,dimension,grp_alma,grp_cmip,subgrid,Available in STEMMUS-SCOPE,File name,Variable name in STEMMUS-SCOPE,
1,1,SWnet,rss,surface_net_downward_shortwave_flux,Net shortwave radiation,"Incoming solar radiation less the simulated outgoing shortwave radiation, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Netshort,
1,1,LWnet,rls,surface_net_downward_longwave_flux,Net longwave radiation,"Incident longwave radiation less the simulated outgoing longwave radiation, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Netlong,
,2,SWdown,rsds,surface_downwelling_shortwave_flux_in_air,Downward short-wave radiation,,W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Rin,
,2,LWdown,rlds,surface_downwelling_longwave_flux_in_air,Downward long-wave radiation,,W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Rli,
,2,SWup,rsus,surface_upwelling_shortwave_flux_in_air,Upward short-wave radiation,,W/m2,Upward,XYT,,LEday,,Yes,radiation.csv,HemisOutShort,
2,2,LWup,rlus,surface_upwelling_longwave_flux_in_air,Upward long-wave radiation,This upward longwave flux is to be compared to an ISCCP derived product.,W/m2,Upward,XYT,,LEday,,Yes,radiation.csv,HemisOutLong,
1,1,Qle,hfls,surface_upward_latent_heat_flux,Latent heat flux,"Energy of evaporation, averaged over a grid cell",W/m2,Upward,XYT,,LEday,,Yes,fluxes.csv,lEtot,
1,1,Qh,hfss,surface_upward_sensible_heat_flux,Sensible heat flux,"Sensible energy, averaged over a grid cell",W/m2,Upward,XYT,,LEday,,Yes,fluxes.csv,Htot,
1,1,Qg,hfds,surface_downward_heat_flux,Ground heat flux,"Heat flux into the ground, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,fluxes.csv,Gtot,
1,2,VegT,tcs,surface_canopy_skin_temperature,Vegetation Canopy Temperature,"Vegetation temperature, averaged over all vegetation types",K,-,XYT,,LEday,veg.,Yes,surftemp.csv,Tcave,
1,2,BaresoilT,tgs,surface_ground_skin_temperature,Temperature of bare soil,Surface bare soil temperature,K,-,XYT,,LEday,baresoil,Yes,surftemp.csv,Tsave,
2,1,SoilTemp,tsl,soil_temperature,Average layer soil temperature,Average soil temperature in each user-defined soil layer (3D variable),K,-,XYZT,,LEday,,Yes,Sim_Temp.csv,,"If soil layer thicknesses vary from one location to another, interpolate to a standard set of depths. Ideally, the interpolation should preserve the vertical integral."
1,1,SoilMoist,mrlsl,moisture_content_of_soil_layer,Average layer soil moisture,"Soil water content in each user-defined soil layer (3D variable). Includes the liquid, vapor and solid phases of water in the soil.",kg/m2,-,XYZT,,LWday,,Yes,Sim_Theta.csv,,
,2,AResist_rac,ares,aerodynamic_resistance,Aerodynamic resistance,,s/m,-,XYT,,LWday,,Yes,aerodyn.csv,rac,
,2,AResist_ras,ares,aerodynamic_resistance,Aerodynamic resistance,,s/m,-,XYT,,LWday,,Yes,aerodyn.csv,ras,
,1,RH,hur,relative_humidity,Relative humidity,,%,-,XYT,,LWday,,Yes,ECdata.csv,RH,
1,1,GPP,gpp,gross_primary_productivity_of_carbon,Gross Primary Production,Carbon Mass Flux out of Atmosphere due to Gross Primary Production on Land,kg/m2/s,Downward,XYT,,LCmon,,Yes,fluxes.csv,GPP,
1,1,SWdown_ec,rsds,surface_downwelling_shortwave_flux_in_air,Downward short-wave radiation,,W/m2,Downward,XYT,,L3hr,,,ECdata.csv,Rin,
1,1,LWdown_ec,rlds,surface_downwelling_longwave_flux_in_air,Downward long-wave radiation,,W/m2,Downward,XYT,,L3hr,,,ECdata.csv,Rli,
1,1,Qair,huss,specific_humidity,Near surface specific humidity,,kg/kg,-,XYT,,L3hr,,,ECdata.csv,Qair,
1,1,Tair,ta,air_temperature,Near surface air temperature,,K,-,XYT,,L3hr,,,ECdata.csv,Ta,
1,1,Psurf,ps,surface_air_pressure,Surface Pressure,,Pa,-,XYT,,L3hr,,,ECdata.csv,p,
2,1,Wind,ws,wind_speed,Near surface wind speed,,m/s,-,XYT,,L3hr,,,ECdata.csv,u,
,,Precip,pr,precipitation_flux,Precipitation rate,,kg/m2/s,Downward,XYT,,L3hr,,,ECdata.csv,Pre,
1,1,NEE,nep,surface_net_downward_mass_flux_of_carbon_dioxide_expressed_as_carbon_due_to_all_land_processes_excluding_anthropogenic_land_use_change,Net Ecosystem Exchange,Net Carbon Mass Flux out of Atmosphere due to Net Ecosystem Productivity on Land.,kg/m2/s,Downward,XYT,,LCmon,,Yes,fluxes.csv,NEE,
1,1,Rnet,rss,surface_net_radiation_flux,Net radiation,"Net shortwave radiation less the simulated net longwave radiation, averaged over a grid cell",W/m2,Downward,XYT,,LEday,,Yes,radiation.csv,Rntot,
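
The table above maps each output variable to the model `.csv` file and column it comes from. As a rough illustration of how it might be consumed (not the repository's actual code), the sketch below groups variables by source file using only the standard library. The function name `group_by_file` is hypothetical, and the sample keeps just a few rows and columns from the table; rows with an empty `Variable name in STEMMUS-SCOPE` (the layered soil variables) are collected separately.

```python
import csv
import io
from collections import defaultdict

# A few rows from the table above, reduced to the columns used here.
sample = """short_name_alma,unit,File name,Variable name in STEMMUS-SCOPE
SWnet,W/m2,radiation.csv,Netshort
Qle,W/m2,fluxes.csv,lEtot
SoilTemp,K,Sim_Temp.csv,
Rnet,W/m2,radiation.csv,Rntot
"""


def group_by_file(handle):
    """Map each source file to {model column name: ALMA short name}."""
    by_file = defaultdict(dict)
    layered = {}  # files whose rows name no single model column
    for row in csv.DictReader(handle):
        model_name = row["Variable name in STEMMUS-SCOPE"]
        if model_name:
            by_file[row["File name"]][model_name] = row["short_name_alma"]
        else:
            layered[row["File name"]] = row["short_name_alma"]
    return dict(by_file), layered


by_file, layered = group_by_file(io.StringIO(sample))
print(by_file)
print(layered)
```

Grouping by file means each model `.csv` file needs to be opened only once when collecting the columns to write.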
Empty file added utils/csv_to_nc/__init__.py
Empty file.