Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Updated documentation + added calibration function to triple_collocation module #56

Merged
merged 2 commits into from
Jan 16, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
Binary file modified docs/docs_fig_gridder_obs.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
1 change: 1 addition & 0 deletions docs/tutorials.rst
Original file line number Diff line number Diff line change
Expand Up @@ -28,5 +28,6 @@ e.g.:
tutorials_collocmod
tutorials_validate
tutorials_stats
tutorials_triple_collocation
tutorials_gridder
tutorials_wavyQuick
38 changes: 26 additions & 12 deletions docs/tutorials_gridder.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,25 +10,39 @@ Retrieve satellite observations from multiple satellites:

.. code-block:: python3

>>> from wavy.satmod import satellite_class as sc
>>> from wavy.satellite_module import satellite_class as sc
>>> tmpdir = '/home/patrikb/wavy/tests/data/L3/multi/'
>>> sd = '2005-8-26'
>>> ed = '2005-8-28'
>>> region = 'global'
>>> sd = "2020-11-1"
>>> ed = "2020-11-2"
>>> sco = sc(sdate=sd,edate=ed,region='global')

>>> name = 'multi'
>>> nID = 'CCIv1_L3'
>>> sco = sc(sd=sd,ed=ed,region=region,nID=nID,name=name).populate(path=tmpdir)

Apply the gridder:

.. code-block:: python3

>>> from wavy.gridder import gridder_class as gc
>>> from wavy.gridder_module import gridder_class as gc
>>> from wavy.grid_stats import apply_metric
>>> bb = (-179,179,-80,80) # lonmin,lonmax,latmin,latmax
>>> res = (5,5) # lon/lat
>>> gco = gc(oco=sco,bb=bb,res=res)
>>> var_gridded_dict,lon_grid,lat_grid = apply_metric(gco=gco)
>>> gco.quicklook(val_grid=var_gridded_dict,lon_grid=lon_grid,lat_grid=lat_grid,metric='mor')

>>> bb = (-179, 178, -80, 80) # lonmin,lonmax,latmin,latmax
>>> res = (5, 5) # lon/lat
>>> gco = gc(lons=sco.vars.lons.squeeze().values.ravel(),
... lats=sco.vars.lats.squeeze().values.ravel(),
... values=sco.vars.Hs.squeeze().values.ravel(),
... bb=bb, res=res,
... varalias=sco.varalias,
... units=sco.units,
... sdate=sco.vars.time,
... edate=sco.vars.time)
>>> gridvar, lon_grid, lat_grid = apply_metric(gco=gco)
>>> gco.quicklook(val_grid=gridvar,
... lon_grid=lon_grid,
... lat_grid=lat_grid,
... metric='mor', land_mask_resolution='i',
... mask_metric_llim=1,
... title='')

.. image:: ./docs_fig_gridder_obs.png
:scale: 80

Expand Down
44 changes: 43 additions & 1 deletion docs/tutorials_stats.rst
Original file line number Diff line number Diff line change
Expand Up @@ -16,4 +16,46 @@ Validation metrics
* Mean of Observations
* Number of Collocated Values

A more precise description will follow...
Variable names and the computation of Bias, MAD, and RMSD follow the notation of `WMO <https://library.wmo.int/doc_num.php?explnum_id=11599>`_. The correlation coefficient and scatter index follows the convention from `Wilks (2011) <https://doi.org/10.1016/B978-0-12-385022-5.00008-7>`_.

* :math:`x_f` is the forecast values of the chosen parameter
* :math:`x_v` is the observed value to evaluate against ("ground truth")
* n is the number of values to be used for verification/the number of collocated values

Bias or mean deviation (MD) is the difference between the average forecasts and observations. Positive bias indicates that the forecast is too large on average, while negative bias means that the forecast is too small on average. Bias is also known as a systematic error but gives no information about the typical magnitude of the forecast error. It is defined with the same physical dimension as the forecast and observations.

Bias = MD = :math:`\frac{1}{n}\sum_{i=1}^{n}(x_{_f}-x_{_v})_{_i}`

Normalized bias or normalized mean deviation (NMD) is the bias normalized by the sum of the observations. This version of the bias is favorable when the data are related but not strictly comparable, for example, subject to seasonal variations. One downside is that the NMD is unitless.

Normalized bias = NMD = :math:`\frac{\sum_{i=1}^{n}(x_{_f}-x_{_v})_{_i}}{\sum_{i=1}^{n}(x_{_v})_{_i}}`

Mean absolute deviation (MAD) is the arithmetic average of the absolute difference between the members of each pair of forecast and observed quantities. The MAD is zero if the difference between each pair is zero, and increases as discrepancies between the pairs become larger. It has the same units as the forecast and observations.

MAD = :math:`\frac{1}{n}\sum_{i=1}^{n}\lvert{x_{_f}-x_{_v}}\rvert_{_i}`

Root mean squared deviation (RMSD) is the square root of the average squared difference between the forecast and observation pairs. Squaring the difference produces positive terms, increasing from zero to larger values similar to the MAD. However, the RMSD will weigh the large errors more than MAD because the errors are squared. Taking the square root of the difference gives RMSD the same physical dimension as the forecast and observations. The MAD and RMSD can be interpreted as a typical magnitude for the forecast error.

RMSD = :math:`\sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_{_f}-x_{_v})^{2}_{_i}}`

Normalized root mean squared deviation (NRMSD) is the RMSD, normalized by the sum of the observations squared. The NRMSD is introduced because the RMSE does not perform well if comparing models fits for different response variables or if the response variable is modified such as standardized or log-transformed. The disadvantage is that the NRMSD will lose the units associated with the forecast and observations.

NRMSD = :math:`\sqrt{\frac{\sum_{i=1}^{n}(x_{_f}-x_{_v})^{2}_{_i}}{\sum_{i=1}^{n}(x_{_f})^{2}_{_i}}}`

Debiased root mean squared deviation (DRMSD) is the square root of the difference between the mean squared deviation and the squared bias.

DRMSD = :math:`\sqrt{\frac{1}{n}{\sum_{i=1}^{n}(x_{_f}-x_{_v})^{2}_{_i}-\left(\frac{1}{n}\sum_{i=1}^{n}(x_{_f}-x_{_v})_{_i}\right)^2}}`

Pearson product-moment correlation coefficient (r) is a dimensionless, single-value measure of the association between forecast and observation. It is defined as the ratio of the covariance between forecast and observation to the product of the standard deviations of each. If the correlation coefficient is zero, there is no correlation between the variables. A correlation coefficient of 1 implies perfect positive correlation, while -1 implies perfect negative correlation.

:math:`r_{_{f,v}} = \frac{cov(x_{_f},x_{_v})}{std(f)\cdot std(v)}`

The scatter index (SI) measures the magnitude of deviation between the forecast and observations relative to the magnitude of the observations. The smaller SI, the better agreement between forecast and observations. SI can be expressed in terms of the standard deviation of the difference or the root mean squared deviation, normalized by the mean of the observations. The SI is multiplied by a hundred to return the value in percentage.

:math:`SI_{std} = \frac{std(x_{_f} - x_{_v})}{mean(x_{_v})}\cdot100`

:math:`SI_{rmsd} = \frac{rmsd}{mean(x_{_v})}\cdot100`

Model activity ratio (MAR) is the dimensionless ratio of the standard deviation of the observation to the standard deviation of the forecast.

MAR = :math:`\frac{std(x_{_v})}{std(x_{_f})}`
4 changes: 4 additions & 0 deletions docs/tutorials_triple_collocation.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
Triple Collocation
##################


26 changes: 26 additions & 0 deletions wavy/triple_collocation.py
Original file line number Diff line number Diff line change
Expand Up @@ -352,6 +352,32 @@ def disp_tc_validation(tc_validate, dec=3):
print("\n The reference for the SI is:", ref)


def calibration(R, A, B):
'''
Calibrate A and B relatively to R using triple collocation calibration
constant estimates, following Gruber et al., 2016 method.

R (list of floats): Reference data to use for calibration.
A, B (lists of floats): Data series to calibrate relatively to the
reference.

returns:
A_R, B_R (lists of floats): Calibrated data series
'''
c_AB = np.cov(A,B)[0,1]
c_RA = np.cov(R,A)[0,1]
c_RB = np.cov(R,B)[0,1]

a_A = c_AB/c_RB
a_B = c_AB/c_RA
a_R = 1

A_R = (a_R/a_A)*(A - np.mean(A)) + np.mean(R)
B_R = (a_R/a_B)*(B - np.mean(B)) + np.mean(R)

return A_R, B_R


def bootstrap_ci(result_dict,
conf=95.,
sample_size=None,
Expand Down
Loading