Add coverage deviation as a metric #417
Conversation
…he name is terrible
Codecov Report: All modified and coverable lines are covered by tests ✅

Additional details and impacted files:

@@ Coverage Diff @@
## rework-add_coverage() #417 +/- ##
========================================================
  Coverage        ?   78.97%
========================================================
  Files           ?       20
  Lines           ?     1774
  Branches        ?        0
========================================================
  Hits            ?     1401
  Misses          ?      373
  Partials        ?        0

☔ View full report in Codecov by Sentry.
…intervals from quantiles
… and other functions with the same arguments
… other data documentation (seems it was ignored anyway)
…ion of coverage_deviation per quantile as a metric
…or now that cannot be run anymore.
…differing quantiles
…ame of the coverage columns
…eviously failing due to `add_coverage()`
…cause we can't pass the `forecast_unit` arg to apply_metrics
…wasn't working and everything else feels more complicated.
Aside from not being totally happy with where we have landed with one vector as the output for a metric (which leads us to this place), I think this is all fine.
Isn't this interval deviation though? Don't we need both?
@@ -32,6 +32,7 @@ export(crps_sample)
 export(dispersion)
 export(dss_sample)
 export(get_duplicate_forecasts)
+export(interval_coverage_deviation_quantile)
So the reason we have to do all this stuff is that we are only supporting one metric per function call, vs. allowing the output to be a data.frame and unnesting?
I mean, we could have all functions return a data.frame and then merge things together in score(). Or a wrapper in score() that checks whether the output is a data.frame, list or a vector? I'd be open to that, I just haven't thought about it long enough to have a good vision for what it would look like.
I created an issue, #455. Could you please elaborate a bit on how you'd imagine this?
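To make the wrapper idea discussed above concrete, here is a minimal sketch of how such a dispatcher could normalise metric outputs. This is Python for illustration only (scoringutils itself is an R package), and `apply_metrics`, `forecast_unit`, and the metric names are hypothetical stand-ins, not the real API: each metric function may return either a scalar or a named collection, and the wrapper unnests the latter in one place.

```python
# Hypothetical sketch, NOT the scoringutils API: a wrapper that lets
# metric functions return either a single number or a dict of named
# values, and normalises both into one flat result row per forecast.

def apply_metrics(forecast, metrics):
    # Start the row from the identifying columns (the "forecast unit").
    row = dict(forecast["forecast_unit"])
    for name, fn in metrics.items():
        value = fn(forecast)
        if isinstance(value, dict):
            # Multi-valued metric: unnest into suffixed columns.
            for key, v in value.items():
                row[f"{name}_{key}"] = v
        else:
            # Plain scalar metric: one column.
            row[name] = value
    return row


forecast = {"forecast_unit": {"model": "m1", "date": "2024-01-01"}}
metrics = {
    "bias": lambda f: 0.1,
    "coverage": lambda f: {"50": 0.0, "90": -0.05},
}
row = apply_metrics(forecast, metrics)
```

The design choice being debated is exactly this branch: with a normalising wrapper, individual metrics stay simple, and only `score()` needs to know how to flatten multi-valued outputs.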
#' )
#' quantile <- c(0.1, 0.25, 0.5, 0.75, 0.9)
#' interval_coverage_deviation_quantile(observed, predicted, quantile)
interval_coverage_deviation_quantile <- function(observed, predicted, quantile) {
this feels like it has a lot of duplication from add_coverage
Opened an issue #454
Update functions to compute the absolute median
Rework score.scoringutils quantile()2
Move tests around
Expand tests2
Rework add coverage to work with raw forecasts
Simplify `score()`
Fix set forecast unit
Yeah, that's interval_coverage_deviation. I created an issue #453 as a reminder to create an equivalent function for quantile coverage.
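For reference, a hedged sketch of what such a quantile-coverage equivalent could compute, written in Python for illustration only: the function name and signature are hypothetical, not the implementation tracked in issue #453. Per quantile level, the deviation is the share of observations at or below the predicted quantile, minus the nominal level.

```python
# Hypothetical sketch of quantile coverage deviation (Python, not the
# eventual R implementation): for each quantile level q, compare the
# empirical share of observations <= predicted q-quantile with q itself.

def quantile_coverage_deviation(observed, predicted, quantiles):
    # observed:  one observed value per forecast
    # predicted: one row per forecast, each row holding one predicted
    #            value per quantile level (same order as `quantiles`)
    deviations = {}
    for j, q in enumerate(quantiles):
        covered = sum(obs <= row[j] for obs, row in zip(observed, predicted))
        deviations[q] = covered / len(observed) - q
    return deviations
```

A perfectly calibrated forecaster would see deviations near zero at every level; systematically positive values at low levels suggest the predictive distribution is too wide at the bottom.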
This PR adds coverage deviation as a new metric. The function is obviously named terribly. Maybe interval_coverage_deviation_quantile() is better? At the moment, just coverage_deviation or something like that would also work. But in theory you could also do quantile_coverage_deviation, which does the same thing for quantiles instead of intervals... I guess a question is whether we think it's a useful metric overall? Not great, not terrible?
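To make the metric under discussion concrete, here is a minimal sketch of interval coverage deviation computed from quantile forecasts. This is Python for illustration only; the actual implementation is the R function interval_coverage_deviation_quantile() in this PR, and its details may differ. The idea: each pair of quantile levels (α, 1−α) defines a central prediction interval with nominal coverage 1−2α, and the deviation is the coverage indicator minus that nominal level.

```python
# Illustrative sketch (NOT the scoringutils implementation): interval
# coverage deviation for one observation, derived from a set of
# predicted quantiles at symmetric levels.

def interval_coverage_deviation(observed, predicted, quantiles):
    """Return (nominal coverage, covered - nominal) per central interval."""
    q = {round(level, 10): value for level, value in zip(quantiles, predicted)}
    out = []
    for alpha in sorted(a for a in quantiles if a < 0.5):
        upper_level = round(1 - alpha, 10)
        if upper_level not in q:
            continue  # no matching upper quantile for this level
        lower, upper = q[round(alpha, 10)], q[upper_level]
        nominal = round(1 - 2 * alpha, 10)  # e.g. alpha = 0.25 -> 50% interval
        covered = 1.0 if lower <= observed <= upper else 0.0
        out.append((nominal, covered - nominal))
    return out
```

Averaged over many forecasts, a deviation near zero at each interval width indicates well-calibrated intervals; persistently positive values mean the intervals are wider than they need to be.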