-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy path00-tsibble.Rmd
31 lines (19 loc) · 2.65 KB
/
00-tsibble.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
# Tidy time series {#tsibble}
## ts
Time series data structures in R vary substantially, however most time series models make use of the `ts` object structure from the `stats` package. This object concisely stores the time series index using three 'time series parameters' (`tsp`): `start`, `frequency`, and `end`. For most time series tools (such as `arima`, `ets`, `stl`) this structural information is sufficient, however it lacks details that are present in modern time series datasets:
* Multiple seasonality
* Irregular observations
* Exogenous information
* Many time series (that differ in length)
In many senses this structure is limited, and inconsistent with the [tidy data principles](https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html):
* Non-rectangular index structure
* Wide-format for keyed data (`mts`)
* Unnatural index format for importing data
* Difficulty working with tidyverse tools
## tsibble
The [tsibble package](https://github.com/tidyverts/tsibble) by [Earo Wang](https://github.com/earowang) provides a tidy data structure for time series, and is well described in her [introductory vignette](https://pkg.earo.me/tsibble/articles/intro-tsibble.html).
This data structure is sufficiently flexible to support the future of time series modelling tools (such as `tbats`, `fasster` and `prophet`). Beyond the data tidying and transformation tools that the package provides, the object also includes valuable structural information (`index` and `key`) for time series modelling.
### index
The index is essential for modelling as it can be used to identify the frequency and regularity of the observations. By storing a standard `datetime` object within the dataset, it makes irregular time series modelling possible. It also allows a more flexible specification of seasonal frequency (see [seasonal period](#interface)) that is easier to specify for the end user (a very [common difficulty](https://robjhyndman.com/hyndsight/seasonal-periods/) when constructing `ts` objects).
### key
Keys are used within tsibble to uniquely identify related time series in a tidy structure. They are also useful for identifying relational structures between each time series. This is especially useful for [forecast reconciliation](https://otexts.org/fpp2/hierarchical.html), where a hierarchical or grouped structure is imposed on a set of forecasts to impose relational constraints (typically aggregation). Keys within tsibble can be either nested (hierarchical) or crossed (grouped), and can be directly used to reconcile forecasts. This structure also has purpose for univariate models, as it allows [batch forecasting](#advanced) to be applied across many time series.