-
Notifications
You must be signed in to change notification settings - Fork 4
/
Copy pathREADME.Rmd
171 lines (130 loc) · 7.2 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
---
title: "droughtR"
link-citations: true
bibliography: references.bib
output: github_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE,
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-"
)
```
<!-- badges: start -->
[](https://zenodo.org/doi/10.5281/zenodo.10009276)
[](https://codecov.io/gh/mammask/droughtR)
<img src="https://raw.githubusercontent.com/mammask/droughtR/main/man/figures/droughtR-2.png" align = "right" width = 120/>
The goal of `droughtR` is to enable meteorological drought monitoring by generating non-stationary drought indices under various distributional assumptions (Normal, Gamma, Zero Inflated Gamma and Weibull). It computes the stationary (SPI) and the non-stationary (NSPI) Standardized Precipitation Indices using General Additive Models for Location Scale and Shape (GAMLSS).
<!-- Since drought indices are mainly used in forecasting applications, `droughtR` computes potential biases introduced during the model building process due to incorrect computation of the index. -->
```{r, echo = FALSE}
knitr::opts_chunk$set(
fig.path = "README_figs/README-"
)
```
## Installation
```{r, eval=FALSE}
# Install the development version on Github
devtools::install_github("mammask/droughtR")
```
## Usage
### Generate SPI and NSPI
The function `computenspi` creates stationary and non-stationary meteorological drought indices at different time scales. By default, droughtR uses the gamma distribution:
```{r, eval=TRUE, fig.height=3, fig.width=9, fig.align='center'}
# Load droughtR library
library(droughtR)
# Generate synthetic monthly rainfall data using the Gamma distribution
rain = dummyrainfall(startYear = 1950, endYear = 2010)
# Compute the non-stationary standardized precipitation index (NSPI) for scale 12 using GAMLSS
nonstatdrought = computenspi(x = rain, stationaryspi = FALSE, spiScale = 12, dist = 'gamma')
# Plot NSPI
plot(nonstatdrought)
```
```{r, fig.height=3, fig.width=9, fig.align='center'}
# Compute the stationary standardized precipitation index (NSPI) for scale 12 using GAMLSS and the weibull distribution
statdrought = computenspi(x = rain, stationaryspi = TRUE, spiScale = 12, dist = 'weibull')
# Plot SPI
plot(statdrought)
```
<!-- ## Releases -->
<!-- ## References -->
<!-- # ### Compute the Drought Events -->
<!-- # -->
<!-- # `computeclass` returns the classification of drought events over time: -->
<!-- # -->
<!-- # ```{r, eval=TRUE, fig.height=3, fig.width=9, fig.align='center'} -->
<!-- # # Compute drought class -->
<!-- # indexClass = computeclass(nonstatdrought) -->
<!-- # -->
<!-- # # Plot drought events over time -->
<!-- # plot(indexClass) -->
<!-- # ``` -->
<!-- # -->
<!-- # ### Model-Based Comparison of Drought Indices -->
<!-- # -->
<!-- # Using droughtR, we can compute indices under various distribution assumptions and then compare their fit according to how well they describe the data. Extending the previous example, we can compare the model residuals of the fitted model-based indices: -->
<!-- # -->
<!-- # ```{r, eval=TRUE, fig.align='center'} -->
<!-- # # Plot the model diagnostics of the non-stationary index -->
<!-- # plot(nonstatdrought[['model']]) -->
<!-- # ``` -->
<!-- # ```{r, eval=TRUE, fig.align='center'} -->
<!-- # # Plot the model diagnostics of the stationary index -->
<!-- # plot(statdrought[['model']]) -->
<!-- # ``` -->
<!-- # -->
<!-- # As presented in the diagnostic charts, the Normal Q-Q plot of the GAMLSS model residuals suggest that the non-stationary index under the gamma distribution has a better fit. -->
<!-- # -->
<!-- # In this example, `GAIC()` is used to compare the two model-based drought indices using the AIC: -->
<!-- # -->
<!-- # ```{r, eval = TRUE} -->
<!-- # library(gamlss) -->
<!-- # -->
<!-- # # Compare the two model based implementations using AIC -->
<!-- # GAIC(nonstatdrought[['model']], statdrought[['model']]) -->
<!-- ``` -->
<!-- #### Data Split -->
<!-- The `oossplit` function splits the data into train, validation and test sets: -->
<!-- ```{r, eval=TRUE} -->
<!-- # Split the rainfall series into training validation and test set: -->
<!-- rain = oossplit(x = rain, trainratio = 0.6, validationratio = 0.2, testratio = 0.2) -->
<!-- print(rain) -->
<!-- ``` -->
<!-- #### Bias measurement -->
<!-- When the Standardized Precipitation Index is calculated as part of a forecasting task it introduces biases in the training data. This is mainly observed when the index is computed using the entire data, prior to model validation, and this violates some of the fundamental principles of time series forecasting theory [@mammas2021characterization]. -->
<!-- In this section, the amount of bias introduced to the training data is quantified by measuring the number of miss-classifications when two computational approaches are followed: 1) SPI is computed using the training data only; this is called a "Bias Corrected" computation and 2) SPI is computed using the entire data; this is called a "Bias Induced" computation. -->
<!-- Bias is measured by computing the number of miss-classifications in the training data due to the incorrect computation of the index. -->
<!-- ```{r, eval=TRUE, fig.height=3, fig.width=8, fig.align='center'} -->
<!-- # Generate synthetic monthly rainfall data using the Gamma distribution -->
<!-- rain = dummyrainfall(startYear = 1950, endYear = 2010) -->
<!-- # Compute bias -->
<!-- bias = measurebias(x = rain, trainratio = 0.6, validationratio = 0.2, testratio = 0.2, stationaryspi = TRUE, spiscale = 12, dist = 'normal') -->
<!-- bias -->
<!-- ``` -->
<!-- ### References -->
<!-- #### Bias Corrected auto.arima -->
<!-- In this section, we perform out-of-sample validation using a bias corrected auto.arima to forecast the Standardized Precipitation Index (SPI). An additional parameter is introduced to forecast::auto.arima and requires fitting a S-ARIMA model: -->
<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5} -->
<!-- # out-of-sample validation using a bias corrected auto.arima -->
<!-- model = bcautoarima(x = rain, -->
<!-- trainratio = 0.8, -->
<!-- validationratio = 0.0, -->
<!-- testratio = 0.2, -->
<!-- stationaryspi = TRUE, -->
<!-- spiscale = 12, -->
<!-- seasonal = TRUE) -->
<!-- ``` -->
<!-- The model returns a set of diagnostics and analytical outcomes, including the model description, diagnostics plots and actual vs. predicted forecasts: -->
<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5, echo = TRUE} -->
<!-- # Return the model description -->
<!-- model[['Diagnostics']][['Model Description']] -->
<!-- # Return R2 score in the test set -->
<!-- model[['Diagnostics']][['R2 Score Test']] -->
<!-- ``` -->
<!-- Actual vs. predicted SPI in the test set: -->
<!-- ```{r, eval=TRUE, fig.height=3, fig.width=5, echo = TRUE} -->
<!-- model[['Diagnostics']][['Actual vs Predicted Test']] -->
<!-- ``` -->
<!-- Additional models are developed and can be found here: -->
<!-- * Bias induced auto.arima -->
<!-- * Bias corrected modwt auto.arima -->