A small Python library providing functions to copy easily a netcdf file while replacing and transforming variables and dimensions.
ncio
is a Python library that provides functions to copy
dimensions, variables, attributes, etc. from one netcdf file to a
netcdf output file. Replacements and transformations can be performed
on the copied entities such as names can be changed or variables can
be replaced with other data. It is a thin wrapper of the netCDF4
Python package that adds no new functionality but rather provides
convenience functions to work easily with netcdf files.
The complete documentation for ncio
is available at Github
pages:
https://mcuntz.github.io/ncio/
ncio
provides the functions ncio.copy_dimensions
,
ncio.copy_global_attributes
, and ncio.copy_variables
to copy
dimensions, attributes, and variables from one netcdf file to
another. The functions have the keywords renamedim, removedim,
changedim, adddim, renamevar, removevar, replacevar, and
replaceatt, which purpose should be rather mnemonic. There is also
the function ncio.copy_file
that combines the three individual
routines in a single function.
Imagine you have a netcdf file that contains the variable gpp, among others. You want to have an output file, which is the same as the input file except that gpp should be doubled. This could be done, for example, as:
import ncio
ifile = 'basic_gpp.nc'
ofile = 'gppx2.nc'
# read current 'gpp' array
gpp = ncio.ncread(ifile, 'gpp')
# copy file by replacing gpp by double its value
ncio.copy_file(ifile, ofile, replacevar={'gpp': {'gpp': 2. * gpp}})
Here the convenience function ncio.ncread
was used that simply reads
the given variable from a netcdf file.
There is also the convenience function ncio.ncinfo
, where one can
inquire about dimensions in the file, variables, their
long_names and units or all attributes. One could, for example,
assert that the variable gpp is present before reading it:
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
If you set the keyword noclose=True in ncio.copy_file
, then it
does not close the file and returns the file handle. You can then
still manipulate the contents of the file. The doubling of gpp could
hence also be done as:
import ncio
ifile = 'basic_gpp.nc'
ofile = 'gppx2.nc'
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
# copy file, then directly access gpp variable in putput file
fo = ncio.copy_file(ifile, ofile, noclose=True)
ovar = fo.variables['gpp']
ovar[:] = ovar[:] * 2.
fo.close()
A slightly extended example could be to produce a suite of scenarios with respiration (resp) having global annual values between say 120 and 170 and the spatial pattern of gpp. If the input file has monthly global gpp fields for several years, this would be:
import ncio
ifile = 'basic_gpp.nc'
# read current 'gpp' array
assert 'gpp' in ncio.ncinfo(ifile, variables=True), (
f'Variable "gpp" not in input file {ifile}')
gpp = ncio.ncread(ifile, 'gpp')
isglobalgpp = gpp.mean() * 12.
for g in [140, 150, 160, 170]:
ofile = f'resp{g}.nc'
hist = (f'Modified {ifile} to produce resp with the distribution'
f' of gpp but with an annual mean of {g} PgC/a.')
ncio.copy_file(ifile, ofile,
replacevar={'gpp': {'resp': gpp * (g / isglobalgpp)}},
replaceatt={'resp':
{'long_name':
'Terrestrial Ecosystem Respiration',
'units': 'kg gridcell-1 s-1'}},
addglobalatt={'history': hist})
Using the individual functions gives more flexibility, of course. Imagine you have an input file that has monthly fields on a global grid on Earth, i.e. having dimensions (time, lon, lat). The land grid cells of the vegetation variables have the extra dimension patch for fractions of different plant functional types in the same grid cell, i.e. having dimensions (time, lon, lat, patch) or (time, patch, lon, lat). There is an associated variable patchfrac that gives the fixed relative sizes of each patch. To get the average of the grid cell, the vegetation variables have to be multiplied by patchfrac and summed over the patch dimension. We will first set all metadata in the output file, then copy the variables that have no unlimited dimension time, and then the variables having a time dimensions. The latter will be copied timestep per timestep to avoid Python swapping variables to disk or out-of-memory:
import sys
import time
import netCDF4 as nc
import ncio
ifile = 'patch.nc'
ofile = 'nopatch.nc'
# check input file
assert 'patchfrac' in ncio.ncinfo(ifile, variables=True), (
f'No patchfrac variable in input file {ifile}')
assert 'patch' in ncio.ncinfo(ifile, var='patchfrac', dims=True)
# open input and output files
fi = nc.Dataset(ifile, 'r')
if 'file_format' in dir(fi):
fo = nc.Dataset(ofile, 'w', format=fi.file_format)
else:
fo = nc.Dataset(ofile, 'w', format='NETCDF3_64BIT_OFFSET')
# copy global attributes, adding script
ncio.copy_global_attributes(fi, fo,
add={'history': (
time.asctime() + ': ' +
' '.join(sys.argv))})
# copy dimensions
ncio.copy_dimensions(fi, fo, removedim=['patch'])
# create static variables (independent of time)
ncio.create_variables(fi, fo, time=False, removedim=['patch'])
# create dynamic variables (time dependent)
ncio.create_variables(fi, fo, time=True, removedim=['patch'])
# get patchfrac
patchfrac = fi.variables['patchfrac']
# copy static variables
for ivar in fi.variables.values():
if 'time' not in ivar.dimensions:
ovar = fo.variables[ivar.name]
invar = ivar[:]
if 'patch' in ivar.dimensions:
idx = ivar.dimensions.index('patch')
# use patchfrac at first time step for non-time
# dependent variables
out = np.sum(invar * patchfrac, axis=idx)
else:
out = invar
ovar[:] = out
# copy dynamic variables
ntime = fi.dimensions['time'].size
for tt in range(ntime):
for ivar in fi.variables.values():
if 'time' in ivar.dimensions:
ovar = fo.variables[ivar.name]
invar = ivar[tt, ...]
if 'patch' in ivar.dimensions:
# -1 because of specific timestep
idx = ivar.dimensions.index('patch') - 1
out = np.sum(invar * patchfrac, axis=idx)
else:
out = invar
ovar[tt, ...] = out
# finish
fi.close()
fo.close()
The easiest way to install is via pip:
pip install ncio
or via conda:
conda install -c conda-forge ncio
ncio
is distributed under the MIT License. See the LICENSE file
for details.
Copyright (c) 2020- Matthias Cuntz