-
-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Weighted rvars #331
base: master
Are you sure you want to change the base?
Weighted rvars #331
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #331 +/- ##
==========================================
+ Coverage 95.31% 95.80% +0.49%
==========================================
Files 50 51 +1
Lines 3840 3979 +139
==========================================
+ Hits 3660 3812 +152
+ Misses 180 167 -13 ☔ View full report in Codecov by Sentry. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Currently, everything else than
Pinging @n-kall , too |
For reference: Equations 6 (MCSE) and 7 (ESS) in preprint v6
Any thoughts on how the two sets of diagnostics should be presented in summarise_draws? |
I'm currently working on updating the |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
updates to weighted diagnostics
Weighted diagnostics
Summary
This PR aims to address (at least part of) #184 by implementing weighted
rvar
s.Currently,
rvar
s cannot contain weights, and weighting of them can only be done by putting them in adraws_rvars
object that itself contains a".log_weight"
rvar
containing the weights. This leads to counterintuitive behavior, like the default output of thervar
(showing mean and sd) using unweighted versions of those statistics.This PR addresses that issue in the following ways:
rvar
weights as a"log_weights"
attribute directly on thervar
, just like the"nchains"
attribute is used to store chain count.draws_rvars
no longer use a".log_weight"
variable to store weights, instead storing them directly on eachrvar
they contain, and requiring allrvar
s they contain to have the same weights (the same way it handles"nchains"
).log_weights()
function fordraws
andrvar
s is added, which is a lower-level version ofweights(x, log = TRUE, normalize = FALSE)
that just returns the log weights stored in the object without transformation. I initially did not have this, but found it greatly eased programming with weights.weight_draws(x, NULL)
is now allowed as the canonical way to remove weights from adraws
object, sinceremove_variables(x, ".log_weight")
does not work ondraws_rvars
objects anymore.rvar
s have been updated to incorporate weights (with a couple of exceptions I haven't gotten to yet, see TODOs and Questions below).rvar
internals are becoming (even more) complicated, I have added an "rvar
Internals" section to?rvar
that hopefully will help in case others need to touch the code ;).Demo
You can't combine two rvars with different weights:
The check for equality of weights is done on the internal weights using
identical()
, which should be fast, especially in cases where the two weight vectors are actually pointers to the same vector in memory (in which case the comparison is constant time). This does mean the weights vectors must be exactly the same (no tolerance for floating point error), but I suspect in most cases when weighting happens the exact same weight vector is being applied to many objects. In any case, if someone did encounter this issue they could simply assign the log weights from object to the other.If one rvar is weighted and another is not, the weights of the weighted rvar are inherited, which I believe covers the use case of (weighted draws from some model) + (unweighted draws, e.g. used to simulate predictions):
If you install the dev version of {ggdist}:
It supports weighted rvars in all functions (densities, CDFs, quantiles, all interval types and all point summaries):
Without weights:
With weights:
Weights should work basically everywhere:
TODOs and Questions
TODOs:
density(<rvar>)
,cdf(<rvar>)
, andquantile(<rvar>)
/quantile2(<rvar>)
. The first two are straightforward. For weighted quantiles, I have an implementation in {ggdist} that I can port over, but I may want to update it first; some thoughts on weighted quantiles are here and feedback is welcome. (In fact, since writing that document my thinking has changed a bit---I originally thought the way I suggested implementing weighted quantiles in that document is an improvement on ggdist's current implementation, but after further investigation I might be leaning back towards how I did it in ggdist originally...).vignette("rvar")
Questions:
R/convergence.R
should be modified for weighted rvars. @avehtari?Would love for folks to kick the tires. I think once this is in we could also start thinking about what a successor to
summarise_draws()
might look like that supports weights (and solves the various other open issues onsummarise_draws()
).Copyright and Licensing
By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses: