Add average notebook #142

Merged
merged 3 commits into from
Mar 5, 2021
Empty file removed .gitlab-ci.yml
Empty file.
309 changes: 309 additions & 0 deletions notebooks/average_over_dims.ipynb
@@ -0,0 +1,309 @@
{
"cells": [
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"\n",
"from clisops.utils import get_file\n",
"\n",
"# Fetch the example files locally or from GitHub\n",
"tas_files = get_file([\n",
"    \"cmip5/tas_Amon_HadGEM2-ES_rcp85_r1i1p1_200512-203011.nc\",\n",
"    \"cmip5/tas_Amon_HadGEM2-ES_rcp85_r1i1p1_203012-205511.nc\",\n",
"    \"cmip5/tas_Amon_HadGEM2-ES_rcp85_r1i1p1_205512-208011.nc\",\n",
"], branch=\"add_cmip5_hadgem\")\n",
"\n",
"o3_file = get_file(\"cmip6/o3_Amon_GFDL-ESM4_historical_r1i1p1f1_gr1_185001-194912.nc\")\n",
"\n",
"# Remove any previously created example output file\n",
"if os.path.exists(\"./output_001.nc\"):\n",
"    os.remove(\"./output_001.nc\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Averaging over dimensions of the dataset\n",
"\n",
"The average over dimensions operation makes use of `clisops.core.average` to process the datasets and to set the output type and the output file names.\n",
"\n",
"You can average over any combination of the time, longitude, latitude and level dimensions in the dataset, or over none of them."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Parameters\n",
"\n",
"The parameters taken by `average_over_dims` are listed below:\n",
"\n",
"    ds: Union[xr.Dataset, str]\n",
"    dims : Optional[Union[Tuple[str], DimensionParameter]]\n",
"        The dimensions over which to apply the average. If None, none of the dimensions are averaged over. Dimensions\n",
"        must be one of [\"time\", \"level\", \"latitude\", \"longitude\"].\n",
"    ignore_undetected_dims: bool\n",
"        If the dimensions specified are not found in the dataset, an exception will be raised if set to False.\n",
"        If True, an exception will not be raised and the other dimensions will be averaged over. Default = False\n",
"    output_dir: Optional[Union[str, Path]] = None\n",
"    output_type: {\"netcdf\", \"nc\", \"zarr\", \"xarray\"}\n",
"    split_method: {\"time:auto\"}\n",
"    file_namer: {\"standard\", \"simple\"}\n",
"\n",
"\n",
"The function returns a list containing the outputs in the selected format."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from clisops.ops.average import average_over_dims\n",
"from roocs_utils.exceptions import InvalidParameterValue\n",
"import xarray as xr"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"ds = xr.open_mfdataset(tas_files, use_cftime=True, combine=\"by_coords\")\n",
"\n",
"ds"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Average over one dimension"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = average_over_dims(ds, dims=[\"time\"], ignore_undetected_dims=False, output_type=\"xarray\")\n",
"\n",
"result[0]\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the output dataset, the time dimension has been averaged over and removed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Average over two dimensions\n",
"\n",
"Averaging over two dimensions is just as simple as averaging over one. The dimensions to be averaged over should be passed in as a sequence."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = average_over_dims(ds, dims=[\"time\", \"latitude\"], ignore_undetected_dims=False, output_type=\"xarray\")\n",
"\n",
"result[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In this case both the time and latitude dimensions have been removed."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Allowed dimensions\n",
"\n",
"It is only possible to average over longitude, latitude, level and time. If any other dimension is requested, an error will be raised."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" average_over_dims(\n",
" ds,\n",
" dims=[\"incorrect_dim\"],\n",
" ignore_undetected_dims=False,\n",
" output_type=\"xarray\",\n",
" )\n",
"except InvalidParameterValue as exc:\n",
" print(exc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Dimensions not found"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If a dimension selected for averaging does not exist in the dataset, there are two options.\n",
"\n",
"1. To raise an exception when the dimension doesn't exist, set `ignore_undetected_dims = False`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" average_over_dims(\n",
" ds,\n",
" dims=[\"level\", \"time\"],\n",
" ignore_undetected_dims=False,\n",
" output_type=\"xarray\",\n",
" )\n",
"except InvalidParameterValue as exc:\n",
" print(exc)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. To ignore the missing dimension and average over any other requested dimensions anyway, set `ignore_undetected_dims = True`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = average_over_dims(\n",
" ds,\n",
" dims=[\"level\", \"time\"],\n",
" ignore_undetected_dims=True,\n",
" output_type=\"xarray\",\n",
")\n",
"result[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the case above, a level dimension did not exist, but this was ignored and time was averaged over anyway."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# No dimensions supplied"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"If no dimensions are supplied, no averaging will be applied and the original dataset will be returned."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"result = average_over_dims(\n",
" ds,\n",
" dims=None,\n",
" ignore_undetected_dims=False,\n",
" output_type=\"xarray\"\n",
")\n",
"\n",
"result[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# An example of averaging over level"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"print(\"Original dataset\")\n",
"print(xr.open_dataset(o3_file, use_cftime=True))\n",
"\n",
"result = average_over_dims(\n",
"    o3_file,\n",
"    dims=[\"level\"],\n",
"    ignore_undetected_dims=False,\n",
"    output_type=\"xarray\",\n",
")\n",
"\n",
"print(\"Averaged dataset\")\n",
"result[0]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"In the example above, the `plev` dimension has been averaged over and removed."
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
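
The `ignore_undetected_dims` behaviour that the notebook walks through can be sketched in plain Python. This is only an illustration under stated assumptions, not the clisops implementation: `resolve_dims` and `ALLOWED_DIMS` are hypothetical names, and the built-in `ValueError` stands in for the `InvalidParameterValue` that the notebook catches.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical constant: the dimension names the notebook lists as allowed.
ALLOWED_DIMS = ("time", "level", "latitude", "longitude")


def resolve_dims(
    requested: Optional[Sequence[str]],
    present: Sequence[str],
    ignore_undetected_dims: bool = False,
) -> Tuple[str, ...]:
    """Return the dimensions that would actually be averaged over.

    Hypothetical helper for illustration only; it is not part of clisops.
    """
    if not requested:
        # No dimensions supplied: nothing is averaged.
        return ()

    # Names outside the allowed set are rejected regardless of the flag.
    invalid = [dim for dim in requested if dim not in ALLOWED_DIMS]
    if invalid:
        raise ValueError(f"Dimensions not allowed: {invalid}")

    # Allowed names that the dataset does not contain.
    missing = [dim for dim in requested if dim not in present]
    if missing and not ignore_undetected_dims:
        raise ValueError(f"Dimensions not found in the dataset: {missing}")

    # Keep only the requested dimensions that exist in the dataset.
    return tuple(dim for dim in requested if dim in present)


dataset_dims = ["time", "latitude", "longitude"]
print(resolve_dims(["level", "time"], dataset_dims, ignore_undetected_dims=True))
# prints ('time',)
```

With `ignore_undetected_dims=False`, the same call raises because `level` is absent, which mirrors the first option shown in the "Dimensions not found" section of the notebook.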
2 changes: 1 addition & 1 deletion notebooks/index.rst
@@ -7,4 +7,4 @@ Examples

subset
core_subset
core_average
average_over_dims
1 change: 0 additions & 1 deletion tests/mini-esgf-data
Submodule mini-esgf-data deleted from 3adba7