diff --git a/dev/articles/scoring-rules.html b/dev/articles/scoring-rules.html index 65f28097..ecd44eb9 100644 --- a/dev/articles/scoring-rules.html +++ b/dev/articles/scoring-rules.html @@ -63,7 +63,7 @@

Scoring rules in `scoringutils`

Nikos Bosse

-

2024-10-30

+

2024-10-31

Source: vignettes/scoring-rules.Rmd
scoring-rules.Rmd
diff --git a/dev/index.html b/dev/index.html index 1072e933..7ac6642d 100644 --- a/dev/index.html +++ b/dev/index.html @@ -91,7 +91,7 @@

Forecast types

Input formats and input validation

The expected input format is generally a data.frame (or similar) with required columns observed, predicted, and model that holds the forecasts and observed values. Exact requirements depend on the forecast type. For more information, have a look at the paper, call ?as_forecast(), or have a look at the example data provided in the package (example_binary, example_point, example_quantile, example_sample_continuous, example_sample_discrete).

-

Before scoring, input data needs to be validated and transformed into a forecast object using the function as_forecast().

+

Before scoring, input data needs to be validated and transformed into a forecast object using the function as_forecast().

 forecast_quantile <- example_quantile |>
   as_forecast_quantile(
diff --git a/dev/pkgdown.yml b/dev/pkgdown.yml
index e048f8a2..42dd57ab 100644
--- a/dev/pkgdown.yml
+++ b/dev/pkgdown.yml
@@ -5,7 +5,7 @@ articles:
   Deprecated-functions: Deprecated-functions.html
   Deprecated-visualisations: Deprecated-visualisations.html
   scoring-rules: scoring-rules.html
-last_built: 2024-10-30T21:32Z
+last_built: 2024-10-31T02:01Z
 urls:
   reference: https://epiforecasts.io/scoringutils/reference
   article: https://epiforecasts.io/scoringutils/articles
diff --git a/dev/reference/apply_metrics.html b/dev/reference/apply_metrics.html
index 002b7407..21a8f990 100644
--- a/dev/reference/apply_metrics.html
+++ b/dev/reference/apply_metrics.html
@@ -69,7 +69,7 @@ 

Arguments

forecast

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values).

metrics
diff --git a/dev/reference/as_forecast_binary.html b/dev/reference/as_forecast_binary.html index c3d3be7b..67a60f43 100644 --- a/dev/reference/as_forecast_binary.html +++ b/dev/reference/as_forecast_binary.html @@ -1,7 +1,21 @@ -Create a forecast object for binary forecasts — as_forecast_binary • scoringutils +Create a forecast object for binary forecasts — as_forecast_binary • scoringutils Skip to contents @@ -42,8 +56,15 @@

Create a forecast object for binary forecasts

-

Create a forecast object for binary forecasts. See more information on -forecast types and expected input formats by calling ?as_forecast().

+

Process and validate a data.frame (or similar) with forecasts +and observations. If the input passes all input checks, it will +be converted to a forecast object. A forecast object is a data.table with +a class forecast and an additional class that depends on the forecast type.

+

The arguments observed, predicted, etc. make it possible to rename +existing columns of the input data to match the required columns for a +forecast object. Using the argument forecast_unit, you can specify +the columns that uniquely identify a single forecast (thereby removing +other, unneeded columns). See the section "Forecast unit" below for details.

@@ -62,8 +83,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

forecast_unit
@@ -85,16 +106,99 @@

Arguments

+
+

Value

+

A forecast object of class forecast_binary

+
+
+

Required input

+

The input needs to be a data.frame or similar with the following columns:

For convenience, we recommend an additional column model holding the name +of the forecaster or model that produced a prediction, but this is not +strictly necessary.

+

See the example_binary data set for an example.

+
+
+

Forecast unit

+

In order to score forecasts, scoringutils needs to know which of the rows +of the data belong together and jointly form a single forecast. This is +easy e.g. for point forecasts, where there is one row per forecast. For +quantile or sample-based forecasts, however, there are multiple rows that +belong to a single forecast.

+

The forecast unit or unit of a single forecast is then described by the +combination of columns that uniquely identify a single forecast. +For example, we could have forecasts made by different models in various +locations at different time points, each for several weeks into the future. +The forecast unit could then be described as +forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). +scoringutils automatically tries to determine the unit of a single +forecast. It uses all existing columns for this, which means that no columns +must be present that are unrelated to the forecast unit. As a very simplistic +example, if you had an additional column, "even", that is one if the row number +is even and zero otherwise, then this would mess up scoring, as scoringutils +would then treat this column as relevant in defining the forecast unit.

+

In order to avoid issues, we recommend setting the forecast unit explicitly, +using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.
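The effect of an unrelated column on the inferred forecast unit can be sketched in base R. This is an illustration only: `count_forecasts()` is a hypothetical helper, not a scoringutils function, and it simply assumes that every column other than the required ones defines the forecast unit, as described above.

```r
# Toy quantile forecasts: two models, one location, three quantile levels each.
# quantile_level is listed first so that it varies fastest within each model.
df <- expand.grid(
  quantile_level = c(0.25, 0.5, 0.75),
  model = c("model_a", "model_b"),
  location = "DE",
  stringsAsFactors = FALSE
)
df$predicted <- c(8, 10, 12, 9, 11, 13)
df$observed <- 10

# Hypothetical helper (not part of scoringutils): count distinct forecasts
# when all columns except the required ones form the forecast unit.
count_forecasts <- function(data,
                            required = c("observed", "predicted",
                                         "quantile_level")) {
  unit_cols <- setdiff(names(data), required)
  nrow(unique(data[, unit_cols, drop = FALSE]))
}

# Only model and location define the unit: one forecast per model.
n_clean <- count_forecasts(df)

# Add an unrelated column that is 1 for even row numbers and 0 otherwise.
df$even <- as.integer(seq_len(nrow(df)) %% 2 == 0)

# The unrelated column now splits each model's forecast into two pieces.
n_polluted <- count_forecasts(df)
```

In this toy data, the number of distinct forecasts changes once `even` is present, which is why setting `forecast_unit` explicitly is the safer choice.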

+

See also

Other functions to create forecast objects: -as_forecast, as_forecast_nominal(), as_forecast_point(), as_forecast_quantile(), as_forecast_sample()

+
+

Examples

+
as_forecast_binary(
+  example_binary,
+  predicted = "predicted",
+  forecast_unit = c("model", "target_type", "target_end_date",
+                    "horizon", "location")
+)
+#>  Some rows containing NA values may be removed. This is fine if not
+#>   unexpected.
+#> Forecast type: binary
+#> Forecast unit:
+#> model, target_type, target_end_date, horizon, and location
+#> 
+#>       predicted observed                 model target_type target_end_date
+#>           <num>   <fctr>                <char>      <char>          <Date>
+#>    1:        NA     <NA>                  <NA>       Cases      2021-01-02
+#>    2:        NA     <NA>                  <NA>      Deaths      2021-01-02
+#>    3:        NA     <NA>                  <NA>       Cases      2021-01-09
+#>    4:        NA     <NA>                  <NA>      Deaths      2021-01-09
+#>    5:        NA     <NA>                  <NA>       Cases      2021-01-16
+#>   ---                                                                     
+#> 1027:     0.250        0 EuroCOVIDhub-baseline      Deaths      2021-07-24
+#> 1028:     0.475        0       UMass-MechBayes      Deaths      2021-07-24
+#> 1029:     0.450        0       UMass-MechBayes      Deaths      2021-07-24
+#> 1030:     0.375        0  epiforecasts-EpiNow2      Deaths      2021-07-24
+#> 1031:     0.300        0  epiforecasts-EpiNow2      Deaths      2021-07-24
+#>       horizon location
+#>         <num>   <char>
+#>    1:      NA       DE
+#>    2:      NA       DE
+#>    3:      NA       DE
+#>    4:      NA       DE
+#>    5:      NA       DE
+#>   ---                 
+#> 1027:       2       IT
+#> 1028:       3       IT
+#> 1029:       2       IT
+#> 1030:       3       IT
+#> 1031:       2       IT
+
+
diff --git a/dev/reference/as_forecast_doc_template.html b/dev/reference/as_forecast_doc_template.html new file mode 100644 index 00000000..c46795ea --- /dev/null +++ b/dev/reference/as_forecast_doc_template.html @@ -0,0 +1,144 @@ + +General information on creating a forecast object — as_forecast_doc_template • scoringutils + Skip to contents + + +
+
+
+ +
+

Process and validate a data.frame (or similar) with forecasts +and observations. If the input passes all input checks, it will +be converted to a forecast object. A forecast object is a data.table with +a class forecast and an additional class that depends on the forecast type.

+

The arguments observed, predicted, etc. make it possible to rename +existing columns of the input data to match the required columns for a +forecast object. Using the argument forecast_unit, you can specify +the columns that uniquely identify a single forecast (thereby removing +other, unneeded columns). See the section "Forecast unit" below for details.

+
+ + +
+

Arguments

+ + +
data
+

A data.frame (or similar) with predicted and observed values. +See the details section of for additional information +on the required input format.

+ + +
forecast_unit
+

(optional) Name of the columns in data (after +any renaming of columns) that denote the unit of a +single forecast. See get_forecast_unit() for details. +If NULL (the default), all columns that are not required columns are +assumed to form the unit of a single forecast. If specified, all columns +that are not part of the forecast unit (or required columns) will be removed.

+ + +
observed
+

(optional) Name of the column in data that contains the +observed values. This column will be renamed to "observed".

+ + +
predicted
+

(optional) Name of the column in data that contains the +predicted values. This column will be renamed to "predicted".

+ +
+
+

Forecast unit

+

In order to score forecasts, scoringutils needs to know which of the rows +of the data belong together and jointly form a single forecast. This is +easy e.g. for point forecasts, where there is one row per forecast. For +quantile or sample-based forecasts, however, there are multiple rows that +belong to a single forecast.

+

The forecast unit or unit of a single forecast is then described by the +combination of columns that uniquely identify a single forecast. +For example, we could have forecasts made by different models in various +locations at different time points, each for several weeks into the future. +The forecast unit could then be described as +forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). +scoringutils automatically tries to determine the unit of a single +forecast. It uses all existing columns for this, which means that no columns +must be present that are unrelated to the forecast unit. As a very simplistic +example, if you had an additional column, "even", that is one if the row number +is even and zero otherwise, then this would mess up scoring, as scoringutils +would then treat this column as relevant in defining the forecast unit.

+

In order to avoid issues, we recommend setting the forecast unit explicitly, +using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

+
+ +
+ + +
+ + + + + + + diff --git a/dev/reference/as_forecast_generic.html b/dev/reference/as_forecast_generic.html index dd9d3ade..f4382e19 100644 --- a/dev/reference/as_forecast_generic.html +++ b/dev/reference/as_forecast_generic.html @@ -59,8 +59,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

forecast_unit
diff --git a/dev/reference/as_forecast_nominal.html b/dev/reference/as_forecast_nominal.html index d528766c..d3ec9615 100644 --- a/dev/reference/as_forecast_nominal.html +++ b/dev/reference/as_forecast_nominal.html @@ -1,9 +1,21 @@ -Create a forecast object for nominal forecasts — as_forecast_nominal • scoringutils +Create a forecast object for nominal forecasts — as_forecast_nominal • scoringutils Skip to contents @@ -44,9 +56,15 @@

Create a forecast object for nominal forecasts

-

Nominal forecasts are a form of categorical forecasts where the possible -outcomes that the observed values can assume are not ordered. In that sense, -Nominal forecasts represent a generalisation of binary forecasts.

+

Process and validate a data.frame (or similar) with forecasts +and observations. If the input passes all input checks, it will +be converted to a forecast object. A forecast object is a data.table with +a class forecast and an additional class that depends on the forecast type.

+

The arguments observed, predicted, etc. make it possible to rename +existing columns of the input data to match the required columns for a +forecast object. Using the argument forecast_unit, you can specify +the columns that uniquely identify a single forecast (thereby removing +other, unneeded columns). See the section "Forecast unit" below for details.

@@ -66,8 +84,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

forecast_unit
@@ -92,20 +110,115 @@

Arguments

predicted_label

(optional) Name of the column in data that denotes the outcome to which a predicted probability corresponds. -This column will be renamed to "predicted_label". Only applicable to -nominal forecasts.

+This column will be renamed to "predicted_label".

+
+

Value

+

A forecast object of class forecast_nominal

+
+
+

Details

+

Nominal forecasts are a form of categorical forecasts and represent a +generalisation of binary forecasts to multiple outcomes. The possible +outcomes that the observed values can assume are not ordered.
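As the validation message in the example output on this page notes, every possible outcome of a nominal forecast must be assigned a probability explicitly. A minimal base R sketch of that idea for a single forecast follows. `is_complete()` is a hypothetical helper, not the package's own validation, and the sum-to-one check is an assumption about what a valid probability assignment looks like.

```r
outcomes <- c("low", "medium", "high")

# One nominal forecast: a probability for each of the three unordered outcomes.
nominal <- data.frame(
  model = "model_a",
  observed = factor("low", levels = outcomes),
  predicted_label = factor(outcomes, levels = outcomes),
  predicted = c(0.6, 0.3, 0.1)
)

# Hypothetical completeness check for a single forecast (not scoringutils code):
# every factor level needs a row, and the probabilities should sum to one.
is_complete <- function(data) {
  labels_present <- all(levels(data$predicted_label) %in% data$predicted_label)
  sums_to_one <- isTRUE(all.equal(sum(data$predicted), 1))
  labels_present && sums_to_one
}

complete <- is_complete(nominal)

# Dropping the row for "high" leaves that outcome without a probability.
incomplete <- is_complete(nominal[nominal$predicted_label != "high", ])
```

With two outcomes instead of three, the same structure describes a binary-style setting, which is the sense in which nominal forecasts generalise binary ones.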

+
+
+

Required input

+

The input needs to be a data.frame or similar with the following columns:

For convenience, we recommend an additional column model holding the name +of the forecaster or model that produced a prediction, but this is not +strictly necessary.

+

See the example_nominal data set for an example.

+
+
+

Forecast unit

+

In order to score forecasts, scoringutils needs to know which of the rows +of the data belong together and jointly form a single forecast. This is +easy e.g. for point forecasts, where there is one row per forecast. For +quantile or sample-based forecasts, however, there are multiple rows that +belong to a single forecast.

+

The forecast unit or unit of a single forecast is then described by the +combination of columns that uniquely identify a single forecast. +For example, we could have forecasts made by different models in various +locations at different time points, each for several weeks into the future. +The forecast unit could then be described as +forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). +scoringutils automatically tries to determine the unit of a single +forecast. It uses all existing columns for this, which means that no columns +must be present that are unrelated to the forecast unit. As a very simplistic +example, if you had an additional column, "even", that is one if the row number +is even and zero otherwise, then this would mess up scoring, as scoringutils +would then treat this column as relevant in defining the forecast unit.

+

In order to avoid issues, we recommend setting the forecast unit explicitly, +using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

+

See also

Other functions to create forecast objects: -as_forecast, as_forecast_binary(), as_forecast_point(), as_forecast_quantile(), as_forecast_sample()

+
+

Examples

+
as_forecast_nominal(
+  na.omit(example_nominal),
+  predicted = "predicted",
+  forecast_unit = c("model", "target_type", "target_end_date",
+                    "horizon", "location")
+)
+#> Forecast type: nominal
+#> Forecast unit:
+#> model, target_type, target_end_date, horizon, and location
+#> 
+#> Warning: ! Error in validating forecast object: Error in assert_forecast(forecast = out,
+#>   verbose = FALSE) : ! Found incomplete forecasts  For a nominal forecast, all
+#>   possible outcomes must be assigned a probability explicitly.  Found first
+#>   missing probabilities in the forecast identified by model == NA, target_type
+#>   == NA, target_end_date == NA, horizon == NA, and location == NA
+#>       observed predicted_label predicted                 model target_type
+#>         <fctr>          <fctr>     <num>                <char>      <char>
+#>    1:      low             low     0.525 EuroCOVIDhub-ensemble       Cases
+#>    2:      low             low     0.075 EuroCOVIDhub-baseline       Cases
+#>    3:      low             low     0.150  epiforecasts-EpiNow2       Cases
+#>    4:   medium             low     0.100 EuroCOVIDhub-ensemble      Deaths
+#>    5:   medium             low     0.275 EuroCOVIDhub-baseline      Deaths
+#>   ---                                                                     
+#> 2657:      low          medium     0.300 EuroCOVIDhub-baseline      Deaths
+#> 2658:   medium          medium     0.850       UMass-MechBayes      Deaths
+#> 2659:      low          medium     0.825       UMass-MechBayes      Deaths
+#> 2660:   medium          medium     0.275  epiforecasts-EpiNow2      Deaths
+#> 2661:      low          medium     0.375  epiforecasts-EpiNow2      Deaths
+#>       target_end_date horizon location
+#>                <Date>   <num>   <char>
+#>    1:      2021-05-08       1       DE
+#>    2:      2021-05-08       1       DE
+#>    3:      2021-05-08       1       DE
+#>    4:      2021-05-08       1       DE
+#>    5:      2021-05-08       1       DE
+#>   ---                                 
+#> 2657:      2021-07-24       2       IT
+#> 2658:      2021-07-24       3       IT
+#> 2659:      2021-07-24       2       IT
+#> 2660:      2021-07-24       3       IT
+#> 2661:      2021-07-24       2       IT
+
+
diff --git a/dev/reference/as_forecast_point.html b/dev/reference/as_forecast_point.html index 4ffaeca3..42dd5490 100644 --- a/dev/reference/as_forecast_point.html +++ b/dev/reference/as_forecast_point.html @@ -1,10 +1,6 @@ -Create a forecast object for point forecasts — as_forecast_point • scoringutilsCreate a forecast object for point forecasts — as_forecast_point • scoringutils Skip to contents @@ -46,9 +42,7 @@

Create a forecast object for point forecasts

-

Create a forecast object for point forecasts. See more information on -forecast types and expected input formats by calling ?as_forecast().

-

When converting a forecast_quantile object into a forecast_point object, +

When converting a forecast_quantile object into a forecast_point object, the 0.5 quantile is extracted and returned as the point forecast.

@@ -75,8 +69,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

...
@@ -102,10 +96,22 @@

Arguments +
+

Value

+

A forecast object of class forecast_point

+
+
+

Required input

+

The input needs to be a data.frame or similar with the following columns:

  • observed: Column of type numeric with observed values.

  • +
  • predicted: Column of type numeric with predicted values.

  • +

For convenience, we recommend an additional column model holding the name +of the forecaster or model that produced a prediction, but this is not +strictly necessary.

+

See the example_point data set for an example.

+

See also

Other functions to create forecast objects: -as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_quantile(), diff --git a/dev/reference/as_forecast_quantile.html b/dev/reference/as_forecast_quantile.html index d7be578d..4c8b97bd 100644 --- a/dev/reference/as_forecast_quantile.html +++ b/dev/reference/as_forecast_quantile.html @@ -1,17 +1,21 @@ -Create a forecast object for quantile-based forecasts — as_forecast_quantile • scoringutils +Create a forecast object for quantile-based forecasts — as_forecast_quantile • scoringutils Skip to contents @@ -52,13 +56,15 @@

Create a forecast object for quantile-based forecasts

-

Create a forecast object for quantile-based forecasts. See more information -on forecast types and expected input formats by calling ?as_forecast().

-

When creating a forecast_quantile object from a forecast_sample object, -the quantiles are estimated by computing empirical quantiles from the samples -via quantile(). Note that empirical quantiles are a biased estimator for -the true quantiles in particular in the tails of the distribution and -when the number of available samples is low.

+

Process and validate a data.frame (or similar) with forecasts +and observations. If the input passes all input checks, it will +be converted to a forecast object. A forecast object is a data.table with +a class forecast and an additional class that depends on the forecast type.

+

The arguments observed, predicted, etc. make it possible to rename +existing columns of the input data to match the required columns for a +forecast object. Using the argument forecast_unit, you can specify +the columns that uniquely identify a single forecast (thereby removing +other, unneeded columns). See the section "Forecast unit" below for details.

@@ -90,8 +96,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

...
@@ -134,16 +140,107 @@

Argumentsquantile().

+
+

Value

+

A forecast object of class forecast_quantile

+
+
+

Required input

+

The input needs to be a data.frame or similar with the following columns:

  • observed: Column of type numeric with observed values.

  • +
  • predicted: Column of type numeric with predicted values. Predicted +values represent quantiles of the predictive distribution.

  • +
  • quantile_level: Column of type numeric, denoting the quantile level of +the corresponding predicted value. +Quantile levels must be between 0 and 1.

  • +

For convenience, we recommend an additional column model holding the name +of the forecaster or model that produced a prediction, but this is not +strictly necessary.

+

See the example_quantile data set for an example.
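A minimal sketch of input that satisfies the column requirements listed above, written in base R. `check_quantile_input()` is a hypothetical helper that only mirrors the three stated requirements; it is not the validation that scoringutils itself performs.

```r
# One quantile forecast from one model: three predictive quantiles.
forecast_data <- data.frame(
  model = "model_a",
  observed = 120,
  quantile_level = c(0.25, 0.5, 0.75),
  predicted = c(100, 115, 135)
)

# Hypothetical check (not scoringutils' own validation): returns TRUE or a
# message describing the first requirement that is not met.
check_quantile_input <- function(data) {
  required <- c("observed", "predicted", "quantile_level")
  missing_cols <- setdiff(required, names(data))
  if (length(missing_cols) > 0) {
    return(paste("missing columns:", paste(missing_cols, collapse = ", ")))
  }
  if (!all(vapply(data[required], is.numeric, logical(1)))) {
    return("observed, predicted and quantile_level must be numeric")
  }
  if (any(data$quantile_level < 0 | data$quantile_level > 1)) {
    return("quantile levels must be between 0 and 1")
  }
  TRUE
}

ok <- check_quantile_input(forecast_data)

# Quantile levels given as percentages (25, 50, 75) violate the 0 to 1 range.
bad <- check_quantile_input(
  transform(forecast_data, quantile_level = quantile_level * 100)
)
```

The `model` column is included for convenience only, in line with the recommendation above.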

+
+
+

Converting from forecast_sample to forecast_quantile

+

When creating a forecast_quantile object from a forecast_sample object, +the quantiles are estimated by computing empirical quantiles from the samples +via quantile(). Note that empirical quantiles are a biased estimator for +the true quantiles, in particular in the tails of the distribution and +when the number of available samples is low.
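The idea behind this conversion can be sketched with base R's `quantile()`. This is an illustration under stated assumptions, not the package's implementation: the toy data, the chosen quantile levels, and the default `quantile()` settings are all assumptions made here.

```r
set.seed(1)

# Toy sample-based forecast: 200 predictive samples from one model.
samples <- data.frame(
  model = "model_a",
  sample_id = 1:200,
  predicted = rnorm(200, mean = 100, sd = 10),
  observed = 105
)

# Quantile levels to estimate (chosen for illustration only).
levels <- c(0.05, 0.5, 0.95)

# Empirical quantiles of the samples become the predicted values of a
# quantile-based forecast, one row per quantile level.
quantile_forecast <- data.frame(
  model = "model_a",
  observed = 105,
  quantile_level = levels,
  predicted = unname(quantile(samples$predicted, probs = levels))
)
```

With few samples, the estimates at tail levels such as 0.05 and 0.95 depend on only a handful of draws, which is where the bias noted above matters most.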

+
+
+

Forecast unit

+

In order to score forecasts, scoringutils needs to know which of the rows +of the data belong together and jointly form a single forecast. This is +easy e.g. for point forecasts, where there is one row per forecast. For +quantile or sample-based forecasts, however, there are multiple rows that +belong to a single forecast.

+

The forecast unit or unit of a single forecast is then described by the +combination of columns that uniquely identify a single forecast. +For example, we could have forecasts made by different models in various +locations at different time points, each for several weeks into the future. +The forecast unit could then be described as +forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). +scoringutils automatically tries to determine the unit of a single +forecast. It uses all existing columns for this, which means that no columns +must be present that are unrelated to the forecast unit. As a very simplistic +example, if you had an additional column, "even", that is one if the row number +is even and zero otherwise, then this would mess up scoring, as scoringutils +would then treat this column as relevant in defining the forecast unit.

+

In order to avoid issues, we recommend setting the forecast unit explicitly, +using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

+

See also

Other functions to create forecast objects: -as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_point(), as_forecast_sample()

+
+

Examples

+
as_forecast_quantile(
+  example_quantile,
+  predicted = "predicted",
+  forecast_unit = c("model", "target_type", "target_end_date",
+                    "horizon", "location")
+)
+#>  Some rows containing NA values may be removed. This is fine if not
+#>   unexpected.
+#> Forecast type: quantile
+#> Forecast unit:
+#> model, target_type, target_end_date, horizon, and location
+#> 
+#> Key: <location, target_end_date, target_type>
+#>        observed quantile_level predicted                model target_type
+#>           <num>          <num>     <int>               <char>      <char>
+#>     1:   127300             NA        NA                 <NA>       Cases
+#>     2:     4534             NA        NA                 <NA>      Deaths
+#>     3:   154922             NA        NA                 <NA>       Cases
+#>     4:     6117             NA        NA                 <NA>      Deaths
+#>     5:   110183             NA        NA                 <NA>       Cases
+#>    ---                                                                   
+#> 20541:       78          0.850       352 epiforecasts-EpiNow2      Deaths
+#> 20542:       78          0.900       397 epiforecasts-EpiNow2      Deaths
+#> 20543:       78          0.950       499 epiforecasts-EpiNow2      Deaths
+#> 20544:       78          0.975       611 epiforecasts-EpiNow2      Deaths
+#> 20545:       78          0.990       719 epiforecasts-EpiNow2      Deaths
+#>        target_end_date horizon location
+#>                 <Date>   <num>   <char>
+#>     1:      2021-01-02      NA       DE
+#>     2:      2021-01-02      NA       DE
+#>     3:      2021-01-09      NA       DE
+#>     4:      2021-01-09      NA       DE
+#>     5:      2021-01-16      NA       DE
+#>    ---                                 
+#> 20541:      2021-07-24       2       IT
+#> 20542:      2021-07-24       2       IT
+#> 20543:      2021-07-24       2       IT
+#> 20544:      2021-07-24       2       IT
+#> 20545:      2021-07-24       2       IT
+
+
diff --git a/dev/reference/as_forecast_sample.html b/dev/reference/as_forecast_sample.html index 4deae786..65ed25a7 100644 --- a/dev/reference/as_forecast_sample.html +++ b/dev/reference/as_forecast_sample.html @@ -1,5 +1,21 @@ -Create a forecast object for sample-based forecasts — as_forecast_sample • scoringutils +Create a forecast object for sample-based forecasts — as_forecast_sample • scoringutils Skip to contents @@ -40,7 +56,15 @@

Create a forecast object for sample-based forecasts

-

Create a forecast object for sample-based forecasts

+

Process and validate a data.frame (or similar) with forecasts +and observations. If the input passes all input checks, it will +be converted to a forecast object. A forecast object is a data.table with +a class forecast and an additional class that depends on the forecast type.

+

The arguments observed, predicted, etc. make it possible to rename +existing columns of the input data to match the required columns for a +forecast object. Using the argument forecast_unit, you can specify +the columns that uniquely identify a single forecast (thereby removing +other, unneeded columns). See the section "Forecast unit" below for details.

@@ -60,8 +84,8 @@

Arguments

data

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

forecast_unit
@@ -85,14 +109,53 @@

Arguments

sample_id

(optional) Name of the column in data that contains the -sample id. This column will be renamed to "sample_id". Only applicable to -sample-based forecasts.

+sample id. This column will be renamed to "sample_id".

+
+

Value

+

A forecast object of class forecast_sample

+
+
+

Required input

+

The input needs to be a data.frame or similar with the following columns:

For convenience, we recommend an additional column model holding the name +of the forecaster or model that produced a prediction, but this is not +strictly necessary.

+

See the example_sample_continuous and example_sample_discrete data sets +for examples.

+
+
+

Forecast unit

+

In order to score forecasts, scoringutils needs to know which of the rows +of the data belong together and jointly form a single forecast. This is +easy e.g. for point forecasts, where there is one row per forecast. For +quantile or sample-based forecasts, however, there are multiple rows that +belong to a single forecast.

+

The forecast unit or unit of a single forecast is then described by the +combination of columns that uniquely identify a single forecast. +For example, we could have forecasts made by different models in various +locations at different time points, each for several weeks into the future. +The forecast unit could then be described as +forecast_unit = c("model", "location", "forecast_date", "forecast_horizon"). +scoringutils automatically tries to determine the unit of a single +forecast. It uses all existing columns for this, which means that no columns +must be present that are unrelated to the forecast unit. As a very simplistic +example, if you had an additional column, "even", that is one if the row number +is even and zero otherwise, then this would mess up scoring, as scoringutils +would then treat this column as relevant in defining the forecast unit.

+

In order to avoid issues, we recommend setting the forecast unit explicitly, +using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

+

See also

Other functions to create forecast objects: -as_forecast, as_forecast_binary(), as_forecast_nominal(), as_forecast_point(), diff --git a/dev/reference/assert_forecast.html b/dev/reference/assert_forecast.html index c037e7d8..c56cdba8 100644 --- a/dev/reference/assert_forecast.html +++ b/dev/reference/assert_forecast.html @@ -1,9 +1,13 @@ Assert that input is a forecast object and passes validations — assert_forecast.forecast_binary • scoringutils +forecast and an additional class forecast_&lt;type&gt; corresponding to the +forecast type). +See the corresponding assert_forecast_&lt;type&gt; functions for more details on +the required input formats."> Skip to contents @@ -45,8 +49,10 @@

Assert that input is a forecast object and passes validations

Assert that an object is a forecast object (i.e. a data.table with a class -forecast and an additional class forecast_* corresponding to the forecast -type).

+forecast and an additional class forecast_<type> corresponding to the +forecast type).

+

See the corresponding assert_forecast_<type> functions for more details on +the required input formats.

@@ -75,7 +81,7 @@

Arguments

forecast

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values).

forecast_type
@@ -101,45 +107,6 @@

ArgumentsValue

Returns NULL invisibly.

-
-

Forecast types and input formats

-

Various different forecast types / forecast formats are supported. At the -moment, those are:

  • point forecasts

  • -
  • binary forecasts ("soft binary classification")

  • -
  • nominal forecasts ("soft classification with multiple unordered classes")

  • -
  • Probabilistic forecasts in a quantile-based format (a forecast is -represented as a set of predictive quantiles)

  • -
  • Probabilistic forecasts in a sample-based format (a forecast is represented -as a set of predictive samples)

  • -

Forecast types are determined based on the columns present in the input data. -Here is an overview of the required format for each forecast type: - -

-
-

All forecast types require a data.frame or similar with columns observed, -predicted, and model.

-

Point forecasts require a column observed of type numeric and a column -predicted of type numeric.

-

Binary forecasts require a column observed of type factor with exactly -two levels and a column predicted of type numeric with probabilities, -corresponding to the probability that observed is equal to the second -factor level. See details here for more information.

-

Nominal forecasts require a column observed of type factor with N levels, -(where N is the number of possible outcomes), a column predicted of type -numeric with probabilities (which sum to one across all possible outcomes), -and a column predicted_label of type factor with N levels, denoting the -outcome for which a probability is given. Forecasts must be complete, i.e. -there must be a probability assigned to every possible outcome.

-

Quantile-based forecasts require a column observed of type numeric, -a column predicted of type numeric, and a column quantile_level of type -numeric with quantile-levels (between 0 and 1).

-

Sample-based forecasts require a column observed of type numeric, -a column predicted of type numeric, and a column sample_id of type -numeric with sample indices.

-

For more information, see the vignettes and the example data -(example_quantile, example_sample_continuous, example_sample_discrete, -example_point, example_binary, and example_nominal).

-

Examples

diff --git a/dev/reference/assert_forecast_type.html b/dev/reference/assert_forecast_type.html index 72b1734e..f2b8d578 100644 --- a/dev/reference/assert_forecast_type.html +++ b/dev/reference/assert_forecast_type.html @@ -53,7 +53,7 @@

Argumentsdata -

A forecast object (see as_forecast()).

+

A forecast object.

actual
diff --git a/dev/reference/check_duplicates.html b/dev/reference/check_duplicates.html index f6a83b59..e01530f8 100644 --- a/dev/reference/check_duplicates.html +++ b/dev/reference/check_duplicates.html @@ -57,8 +57,8 @@

Argumentsdata

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

diff --git a/dev/reference/clean_forecast.html b/dev/reference/clean_forecast.html index ae662f30..14b9f8b3 100644 --- a/dev/reference/clean_forecast.html +++ b/dev/reference/clean_forecast.html @@ -57,7 +57,7 @@

Argumentsforecast

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values).

copy
diff --git a/dev/reference/example_binary.html b/dev/reference/example_binary.html index 9e0f47e8..6420cf91 100644 --- a/dev/reference/example_binary.html +++ b/dev/reference/example_binary.html @@ -53,8 +53,8 @@

Usage

Format

-

An object of class forecast_binary (see as_forecast()) with the -following columns:

location
+

An object of class forecast_binary (see as_forecast_binary()) +with the following columns:

location

the country for which a prediction was made

location_name
diff --git a/dev/reference/example_nominal.html b/dev/reference/example_nominal.html index 684bcfa0..a396473c 100644 --- a/dev/reference/example_nominal.html +++ b/dev/reference/example_nominal.html @@ -53,8 +53,8 @@

Usage

Format

-

An object of class forecast_nominal (see as_forecast()) with the -following columns:

location
+

An object of class forecast_nominal +(see as_forecast_nominal()) with the following columns:

location

the country for which a prediction was made

target_end_date
diff --git a/dev/reference/example_point.html b/dev/reference/example_point.html index a915113c..5ccee313 100644 --- a/dev/reference/example_point.html +++ b/dev/reference/example_point.html @@ -56,8 +56,8 @@

Usage

Format

-

An object of class forecast_point (see as_forecast()) with the -following columns:

location
+

An object of class forecast_point (see as_forecast_point()) +with the following columns:

location

the country for which a prediction was made

target_end_date
diff --git a/dev/reference/example_quantile.html b/dev/reference/example_quantile.html index 29c4cffd..30a3f927 100644 --- a/dev/reference/example_quantile.html +++ b/dev/reference/example_quantile.html @@ -53,8 +53,8 @@

Usage

Format

-

An object of class forecast_quantile (see as_forecast()) with the -following columns:

location
+

An object of class forecast_quantile +(see as_forecast_quantile()) with the following columns:

location

the country for which a prediction was made

target_end_date
diff --git a/dev/reference/example_sample_continuous.html b/dev/reference/example_sample_continuous.html index b01834df..5c4f8eeb 100644 --- a/dev/reference/example_sample_continuous.html +++ b/dev/reference/example_sample_continuous.html @@ -53,8 +53,8 @@

Usage

Format

-

An object of class forecast_sample (see as_forecast()) with the -following columns:

location
+

An object of class forecast_sample (see as_forecast_sample()) +with the following columns:

location

the country for which a prediction was made

target_end_date
diff --git a/dev/reference/example_sample_discrete.html b/dev/reference/example_sample_discrete.html index 02ee70d4..c1213d05 100644 --- a/dev/reference/example_sample_discrete.html +++ b/dev/reference/example_sample_discrete.html @@ -53,8 +53,8 @@

Usage

Format

-

An object of class forecast_sample (see as_forecast()) with the -following columns:

location
+

An object of class forecast_sample (see as_forecast_sample()) +with the following columns:

location

the country for which a prediction was made

target_end_date
diff --git a/dev/reference/figures/metrics-nominal.png b/dev/reference/figures/metrics-nominal.png new file mode 100644 index 00000000..9991d6d0 Binary files /dev/null and b/dev/reference/figures/metrics-nominal.png differ diff --git a/dev/reference/forecast_types.html b/dev/reference/forecast_types.html index 5797e7be..c3edff4d 100644 --- a/dev/reference/forecast_types.html +++ b/dev/reference/forecast_types.html @@ -44,45 +44,6 @@

Documentation template for forecast types

-
-

Forecast types and input formats

-

Various different forecast types / forecast formats are supported. At the -moment, those are:

  • point forecasts

  • -
  • binary forecasts ("soft binary classification")

  • -
  • nominal forecasts ("soft classification with multiple unordered classes")

  • -
  • Probabilistic forecasts in a quantile-based format (a forecast is -represented as a set of predictive quantiles)

  • -
  • Probabilistic forecasts in a sample-based format (a forecast is represented -as a set of predictive samples)

  • -

Forecast types are determined based on the columns present in the input data. -Here is an overview of the required format for each forecast type: - -

-
-

All forecast types require a data.frame or similar with columns observed, -predicted, and model.

-

Point forecasts require a column observed of type numeric and a column -predicted of type numeric.

-

Binary forecasts require a column observed of type factor with exactly -two levels and a column predicted of type numeric with probabilities, -corresponding to the probability that observed is equal to the second -factor level. See details here for more information.

-

Nominal forecasts require a column observed of type factor with N levels, -(where N is the number of possible outcomes), a column predicted of type -numeric with probabilities (which sum to one across all possible outcomes), -and a column predicted_label of type factor with N levels, denoting the -outcome for which a probability is given. Forecasts must be complete, i.e. -there must be a probability assigned to every possible outcome.

-

Quantile-based forecasts require a column observed of type numeric, -a column predicted of type numeric, and a column quantile_level of type -numeric with quantile-levels (between 0 and 1).

-

Sample-based forecasts require a column observed of type numeric, -a column predicted of type numeric, and a column sample_id of type -numeric with sample indices.

-

For more information, see the vignettes and the example data -(example_quantile, example_sample_continuous, example_sample_discrete, -example_point, example_binary, and example_nominal).

-

Forecast unit

In order to score forecasts, scoringutils needs to know which of the rows @@ -103,13 +64,12 @@

Forecast unitas_forecast() -functions. This will drop unneeded columns, while making sure that all -necessary, 'protected columns' like "predicted" or "observed" are retained.

+using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

-

+

diff --git a/dev/reference/get_forecast_unit.html b/dev/reference/get_forecast_unit.html index c4d814db..127e6b84 100644 --- a/dev/reference/get_forecast_unit.html +++ b/dev/reference/get_forecast_unit.html @@ -69,8 +69,8 @@

Argumentsdata

A data.frame (or similar) with predicted and observed values. -See the details section of as_forecast() for additional information -on required input formats.

+See the details section of for additional information +on the required input format.

@@ -98,9 +98,9 @@

Forecast unitas_forecast() -functions. This will drop unneeded columns, while making sure that all -necessary, 'protected columns' like "predicted" or "observed" are retained.

+using the forecast_unit argument. This will simply drop unneeded columns, +while making sure that all necessary, 'protected columns' like "predicted" +or "observed" are retained.

diff --git a/dev/reference/get_metrics.forecast_nominal.html b/dev/reference/get_metrics.forecast_nominal.html index 5b9ce0e6..a538689a 100644 --- a/dev/reference/get_metrics.forecast_nominal.html +++ b/dev/reference/get_metrics.forecast_nominal.html @@ -61,7 +61,7 @@

Argumentsx

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values, see as_forecast_binary()).

select
@@ -105,7 +105,7 @@

Examples#> logs <- -log(pred_for_observed) #> return(logs) #> } -#> <bytecode: 0x55dc52980ec8> +#> <bytecode: 0x5626c9ce0070> #> <environment: namespace:scoringutils> #>

diff --git a/dev/reference/get_metrics.forecast_point.html b/dev/reference/get_metrics.forecast_point.html index 067e64eb..5c78360b 100644 --- a/dev/reference/get_metrics.forecast_point.html +++ b/dev/reference/get_metrics.forecast_point.html @@ -97,7 +97,7 @@

Argumentsx

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values, see as_forecast_binary()).

select
@@ -143,7 +143,7 @@

Examples#> { #> return(ae(actual, predicted)/abs(actual)) #> } -#> <bytecode: 0x55dc550ceb38> +#> <bytecode: 0x5626cdf8d760> #> <environment: namespace:Metrics> #> diff --git a/dev/reference/get_metrics.forecast_quantile.html b/dev/reference/get_metrics.forecast_quantile.html index f8d924f8..d63b3e64 100644 --- a/dev/reference/get_metrics.forecast_quantile.html +++ b/dev/reference/get_metrics.forecast_quantile.html @@ -109,7 +109,7 @@

Argumentsx

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values, see as_forecast_binary()).

select
@@ -191,7 +191,7 @@

Examples#> return(reformatted$wis) #> } #> } -#> <bytecode: 0x55dc56d4c188> +#> <bytecode: 0x5626cbe3e440> #> <environment: namespace:scoringutils> #>

diff --git a/dev/reference/get_metrics.forecast_sample.html b/dev/reference/get_metrics.forecast_sample.html index 6fcc6936..b7fc2424 100644 --- a/dev/reference/get_metrics.forecast_sample.html +++ b/dev/reference/get_metrics.forecast_sample.html @@ -88,7 +88,7 @@

Argumentsx

A forecast object (a validated data.table with predicted and -observed values, see as_forecast()).

+observed values, see as_forecast_binary()).

select
@@ -141,7 +141,7 @@

Examples#> return(res) #> } #> } -#> <bytecode: 0x55dc57067b00> +#> <bytecode: 0x5626c5a06138> #> <environment: namespace:scoringutils> #> #> $dss @@ -150,7 +150,7 @@

Examples#> assert_input_sample(observed, predicted) #> scoringRules::dss_sample(y = observed, dat = predicted, ...) #> } -#> <bytecode: 0x55dc55ab8df0> +#> <bytecode: 0x5626c9da6bb8> #> <environment: namespace:scoringutils> #> #> $crps @@ -182,7 +182,7 @@

Examples#> return(crps) #> } #> } -#> <bytecode: 0x55dc54553238> +#> <bytecode: 0x5626cd6a3b30> #> <environment: namespace:scoringutils> #> #> $overprediction @@ -192,7 +192,7 @@

Examples#> ...) #> return(crps$overprediction) #> } -#> <bytecode: 0x55dc57e45db8> +#> <bytecode: 0x5626cd07aed8> #> <environment: namespace:scoringutils> #> #> $underprediction @@ -202,7 +202,7 @@

Examples#> ...) #> return(crps$underprediction) #> } -#> <bytecode: 0x55dc57e490c0> +#> <bytecode: 0x5626cd07a3b0> #> <environment: namespace:scoringutils> #> #> $dispersion @@ -212,7 +212,7 @@

Examples#> ...) #> return(crps$dispersion) #> } -#> <bytecode: 0x55dc57e48528> +#> <bytecode: 0x5626cd07d6b8> #> <environment: namespace:scoringutils> #> #> $log_score @@ -222,7 +222,7 @@

Examples#> scoringRules::logs_sample(y = observed, dat = predicted, #> ...) #> } -#> <bytecode: 0x55dc57e47a00> +#> <bytecode: 0x5626cd07cb90> #> <environment: namespace:scoringutils> #> #> $ae_median @@ -234,7 +234,7 @@

Examples#> ae_median <- abs(observed - median_predictions) #> return(ae_median) #> } -#> <bytecode: 0x55dc55b6ea58> +#> <bytecode: 0x5626cd47e540> #> <environment: namespace:scoringutils> #> #> $se_mean @@ -245,7 +245,7 @@

Examples#> se_mean <- (observed - mean_predictions)^2 #> return(se_mean) #> } -#> <bytecode: 0x55dc57e49f08> +#> <bytecode: 0x5626cd07f098> #> <environment: namespace:scoringutils> #>

diff --git a/dev/reference/get_metrics.html b/dev/reference/get_metrics.html index bdb1a30a..d9fccc2a 100644 --- a/dev/reference/get_metrics.html +++ b/dev/reference/get_metrics.html @@ -1,7 +1,7 @@ -Get metrics — get_metrics • scoringutilsGet metrics — get_metrics • scoringutils -

Generic function to obtain default metrics available for scoring or metrics -that were used for scoring.