README.Rmd

---
title: "droughtR"
link-citations: true
bibliography: references.bib
output: github_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
                      collapse = TRUE,
                      comment = "#>",
                      fig.path = "man/figures/README-"
)
```

<!-- badges: start -->
[![DOI](https://zenodo.org/badge/398032827.svg)](https://zenodo.org/doi/10.5281/zenodo.10009276)

[![codecov](https://codecov.io/gh/mammask/droughtR/graph/badge.svg?token=WYHK9P2NXP)](https://codecov.io/gh/mammask/droughtR)

<img src="https://raw.githubusercontent.com/mammask/droughtR/main/man/figures/droughtR-2.png" align = "right" width = 120/>

The goal of `droughtR` is to enable meteorological drought monitoring by generating non-stationary drought indices under various distributional assumptions (Normal, Gamma, Zero Inflated Gamma and Weibull). It computes the stationary (SPI) and the non-stationary (NSPI) Standardized Precipitation Indices using General Additive Models for Location Scale and Shape (GAMLSS). 

<!-- Since drought indices are mainly used in forecasting applications, `droughtR` computes potential biases introduced during the model building process due to incorrect computation of the index. -->

```{r, echo = FALSE}
knitr::opts_chunk$set(
  fig.path = "README_figs/README-"
)
```

## Installation

```{r, eval=FALSE}
# Install the development version on Github
devtools::install_github("mammask/droughtR")
```

## Usage

### Generate SPI and NSPI

The function `computenspi` creates stationary and non-stationary meteorological drought indices at different time scales. By default, droughtR uses the gamma distribution:

```{r, eval=TRUE, fig.height=3, fig.width=9, fig.align='center'}
# Load droughtR library
library(droughtR)

# Generate synthetic monthly rainfall data using the Gamma distribution
rain = dummyrainfall(startYear = 1950, endYear = 2010)

# Compute the non-stationary standardized precipitation index (NSPI) for scale 12 using GAMLSS
nonstatdrought = computenspi(x = rain, stationaryspi = FALSE, spiScale = 12, dist = 'gamma')

# Plot NSPI
plot(nonstatdrought)
```

```{r, fig.height=3, fig.width=9, fig.align='center'}
# Compute the stationary standardized precipitation index (NSPI) for scale 12 using GAMLSS and the weibull distribution
statdrought = computenspi(x = rain, stationaryspi = TRUE, spiScale = 12, dist = 'weibull')

# Plot SPI 
plot(statdrought)
```

<!-- ## Releases -->

<!-- ## References -->

<!-- # ### Compute the Drought Events -->
<!-- #  -->
<!-- # `computeclass` returns the classification of drought events over time: -->
<!-- #  -->
<!-- # ```{r, eval=TRUE, fig.height=3, fig.width=9, fig.align='center'} -->
<!-- # # Compute drought class -->
<!-- # indexClass = computeclass(nonstatdrought) -->
<!-- #  -->
<!-- # # Plot drought events over time -->
<!-- # plot(indexClass) -->
<!-- # ``` -->
<!-- #  -->
<!-- # ### Model-Based Comparison of Drought Indices -->
<!-- #  -->
<!-- # Using droughtR, we can compute indices under various distribution assumptions and then compare their fit according to how well they describe the data. Extending the previous example, we can compare the model residuals of the fitted model-based indices:  -->
<!-- #  -->
<!-- # ```{r, eval=TRUE, fig.align='center'} -->
<!-- # # Plot the model diagnostics of the non-stationary index  -->
<!-- # plot(nonstatdrought[['model']]) -->
<!-- # ``` -->
<!-- # ```{r, eval=TRUE, fig.align='center'} -->
<!-- # # Plot the model diagnostics of the stationary index  -->
<!-- # plot(statdrought[['model']]) -->
<!-- # ``` -->
<!-- #  -->
<!-- # As presented in the diagnostic charts, the Normal Q-Q plot of the GAMLSS model residuals suggest that the non-stationary index under the gamma distribution has a better fit. -->
<!-- #  -->
<!-- # In this example, `GAIC()` is used to compare the two model-based drought indices using the AIC: -->
<!-- #  -->
<!-- # ```{r, eval = TRUE} -->
<!-- # library(gamlss) -->
<!-- #  -->
<!-- # # Compare the two model based implementations using AIC -->
<!-- # GAIC(nonstatdrought[['model']], statdrought[['model']]) -->
<!-- ``` -->

<!-- #### Data Split -->

<!-- The `oossplit` function splits the data into train, validation and test sets: -->

<!-- ```{r, eval=TRUE} -->
<!-- # Split the rainfall series into training validation and test set: -->
<!-- rain = oossplit(x = rain, trainratio = 0.6, validationratio = 0.2, testratio = 0.2) -->
<!-- print(rain) -->
<!-- ``` -->

<!-- #### Bias measurement -->

<!-- When the Standardized Precipitation Index is calculated as part of a forecasting task it introduces biases in the training data. This is mainly observed when the index is computed using the entire data, prior to model validation, and this violates some of the fundamental principles of time series forecasting theory [@mammas2021characterization]. -->

<!-- In this section, the amount of bias introduced to the training data is quantified by measuring the number of miss-classifications when two computational approaches are followed: 1) SPI is computed using the training data only; this is called a "Bias Corrected" computation and 2) SPI is computed using the entire data; this is called a "Bias Induced" computation.  -->

<!-- Bias is measured by computing the number of miss-classifications in the training data due to the incorrect computation of the index. -->

<!-- ```{r, eval=TRUE, fig.height=3, fig.width=8, fig.align='center'} -->
<!-- # Generate synthetic monthly rainfall data using the Gamma distribution -->
<!-- rain = dummyrainfall(startYear = 1950, endYear = 2010) -->

<!-- # Compute bias -->
<!-- bias = measurebias(x = rain, trainratio = 0.6, validationratio = 0.2, testratio = 0.2, stationaryspi = TRUE, spiscale = 12, dist = 'normal') -->
<!-- bias -->
<!-- ``` -->

<!-- ### References -->

<!-- #### Bias Corrected auto.arima -->

<!-- In this section, we perform out-of-sample validation using a bias corrected auto.arima to forecast the Standardized Precipitation Index (SPI). An additional parameter is introduced to forecast::auto.arima and requires fitting a S-ARIMA model: -->

<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5} -->
<!-- # out-of-sample validation using a bias corrected auto.arima -->
<!-- model = bcautoarima(x = rain, -->
<!--                     trainratio = 0.8, -->
<!--                     validationratio = 0.0, -->
<!--                     testratio = 0.2, -->
<!--                     stationaryspi = TRUE, -->
<!--                     spiscale = 12, -->
<!--                     seasonal = TRUE) -->
<!-- ``` -->

<!-- The model returns a set of diagnostics and analytical outcomes, including the model description, diagnostics plots and actual vs. predicted forecasts: -->

<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5, echo = TRUE} -->
<!-- # Return the model description -->
<!-- model[['Diagnostics']][['Model Description']] -->

<!-- # Return R2 score in the test set -->
<!-- model[['Diagnostics']][['R2 Score Test']] -->
<!-- ``` -->

<!-- Actual vs. predicted SPI in the test set: -->

<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5, echo = TRUE} -->
<!-- model[['Diagnostics']][['Actual vs Predicted Test']] -->
<!-- ``` -->

<!-- Additional models are developed and can be found here: -->

<!-- * Bias induced auto.arima -->
<!-- * Bias corrected modwt auto.arima -->