Data comparison mini DSL #130
Comments
I like the inline approach, feels more in keeping with current odin? |
to be written also like this
|
I have much less experience with the fitting code and haven't followed all the discussions around this, so to be taken with a pinch of salt. However (before seeing previous comments on this thread) my gut feeling would have been that the block bit is best (and I'm not a Stan user so not really influenced by that). I think that this is not nice: I agree with @MJomaba that LHS should allow data but also parameters, to allow inference of "missing data" (and hierarchical models / hyperpriors I think). Also agree with @MJomaba re the "recalculation" of parameters, that would be nice if possible. Finally, I like |
Really exciting stuff @richfitz ! Agreed re the LHS needing to be able to handle parameters (I think the hierarchical case @annecori mentions seems particularly pertinent). I think on balance I prefer the blocks version. The inline version would have been my (weakly held) preference but I think Anne's point about wanting to potentially fit the same model to different data (or with different likelihoods) seems very relevant. In that context, I think ideally you'd want to be able to use the (I am a Stan user, so that take might be biased) |
|
Thanks all for the comments so far. Some comments that might help with the above:
I can understand the desire for flexibility with multiple compares for a single model, but that will run up against a slight struggle in how we have dust set up; you might be able to switch between different compare definition files at compile time, but fundamentally you'd still end up with a finished model that has baked in a single comparison function alongside a single model.
It's possible that we could allow some sort of "include this file" statement that would allow slurping in of arbitrary odin code, which you could then use to structure complex models. Does stan etc. offer any sort of support for this? I'd imagine it's a fairly similar situation, where there are some parts that you might want to share between different things.
Ed is right that the current approach allows for a certain degree of flexibility while baking in a single compare function. What will happen is that if you have
and you don't have observed data (i.e., it is present as a column in the data that you set, but is It does sound like we should be quite flexible about what goes on the lhs; I'll try and wire it up not to constrain things there at all |
I'm team inline: it's more in keeping with current syntax, and would be just as flexible as the current system in terms of using different compares with the same model, right? If I'm understanding, the compare function lives in a separate script to the odin code for the model. So you could have multiple alternative compares and choose between them at the point of creating the pfilter? E.g. I have my model and I want to fit with both a negbinom and a betabinom compare function, so I write a fitting task in orderly with a parameter that selects the relevant compare function; the script then 'bakes' it in and runs the pmcmc. I also like the
After we have fitted our model, we can't immediately visually compare the model fit to the data as we don't have any modelled quantity that corresponds to the observed In pomp, the rmeasure would generate: |
A few points to feed in discussion, some in response to previous comments from others:
|
Thanks all - this is super useful. There are obviously (or non-obviously) some internal constraints based on the design but I'll try and pull together a proof-of-concept in the next month or so |
Progress on this up here: #131 and mrc-ide/odin#294 I think that this looks pretty good and not that disruptive to the syntax tbh, and I think with fairly minimal changes it will support most of the above. Heaps to add (especially things like validation etc) but I'll expand that and try and capture our existing models with this approach (plus do the compilation out to get GPU functions too) |
Two proposals for a data modelling/probabilistic language that we need in odin/odin.dust (initially this is only something in odin.dust).
Considerations:
`data ~ distribution` syntax that people are familiar with from things like stan and bugs, but without `x ~ dpois(lambda)` in favour of `x ~ poisson(lambda)`, as we can then generate both simulations of data as well as likelihoods from data
`update()` statement)
The actual bits of differentiation and generating compiled code are not actually very hard once we have the interface done, and dust already supports almost everything that we need here.
We already have a few densities implemented in dust and we know that this is enough to do the core bits required by sircovid once we deal with the generating process bits. This is easy enough to expand but there is a question of whether we want to make this user expandable (for now I think that we won't).
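To illustrate why a declaration like `x ~ poisson(lambda)` is more useful than a density-only `x ~ dpois(lambda)`, here is a minimal Python sketch (not odin syntax, and not how dust implements it). The `Poisson` class and its method names are hypothetical; the point is only that one distribution object can supply both a log-density for the likelihood and a sampler for simulating data.

```python
import math
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Poisson:
    """Hypothetical stand-in for what `x ~ poisson(lambda)` might expand to."""

    rate: float  # assumed to be strictly positive

    def log_density(self, x: int) -> float:
        # log P(X = x) = x * log(rate) - rate - log(x!)
        return x * math.log(self.rate) - self.rate - math.lgamma(x + 1)

    def sample(self, rng: random.Random) -> int:
        # Knuth's multiplication method; adequate for small rates only
        threshold = math.exp(-self.rate)
        count, product = 0, rng.random()
        while product > threshold:
            count += 1
            product *= rng.random()
        return count


dist = Poisson(rate=2.0)
log_lik = dist.log_density(3)               # likelihood of an observed count
simulated = dist.sample(random.Random(1))   # simulated observation
```

The same object serves a particle filter (via `log_density`) and posterior predictive checks (via `sample`), which is the flexibility the distribution-style syntax is meant to allow.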
Here are two starting points with hypothetical implementations of the interface, for discussion. Neither works at present, and both require changes to be made in odin (as well as the implementation in odin.dust).
In blocks, like stan:
This is an excerpt of the support for the sircovid "basic" model showing the compare "block" and all the bits required to support it:
It's possible that we could move the actual compare bit into its own file:
with `compare.R` containing:
This is slightly easier to implement but leads to things scattered in several files (we might do that same approach for the data too).
I don't love the `data` block here but struggle to come up with something nice. We could just infer all lhs of `~` expressions as data and add something to declare additional ones. There's no real strong reason we need to have any declaration that things are real or integer as in practice everything is real, so we just need a list. We could also support something like:
(with or without quotes). Or we could do something like
which captures the idea of the data inputs (arguments to the function) but is pretty rude about what a function is. Finally we could merge the two blocks like
Inline, like odin:
A different approach might look like new odin functions like this:
(here, we might want to rename `compare` as something else). This requires a little more surgery in odin but nothing too bad really. I think that this works quite well for the data inputs, and provides a natural way of declaring integer inputs if they become needed (`deaths <- data(integer = TRUE)` perhaps).
Data in the comparison function
One example we found with data on both the lhs and rhs is something like
where `observed` is some observed count data, `n_tests` is some observed number of tests (so both of these from the data) and `prob` is some model quantity.
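As a rough illustration of a comparison with data on both sides, the Python sketch below computes the binomial log-likelihood of `observed` positives out of `n_tests` given a modelled `prob`. The function names and the dictionary-shaped data row are assumptions made for this example only; they are not the odin.dust interface.

```python
import math


def binomial_log_density(observed: int, n_tests: int, prob: float) -> float:
    """Log-probability of `observed` successes in `n_tests` trials."""
    if not 0 <= observed <= n_tests:
        return -math.inf
    # Degenerate probabilities are handled explicitly to avoid log(0)
    if prob <= 0.0:
        return 0.0 if observed == 0 else -math.inf
    if prob >= 1.0:
        return 0.0 if observed == n_tests else -math.inf
    log_choose = (
        math.lgamma(n_tests + 1)
        - math.lgamma(observed + 1)
        - math.lgamma(n_tests - observed + 1)
    )
    return (
        log_choose
        + observed * math.log(prob)
        + (n_tests - observed) * math.log1p(-prob)
    )


def compare(data_row: dict, prob: float) -> float:
    # Both `observed` (lhs) and `n_tests` (rhs) come from the data;
    # only `prob` comes from the model state.
    return binomial_log_density(data_row["observed"], data_row["n_tests"], prob)


log_lik = compare({"observed": 1, "n_tests": 2}, prob=0.5)
```

Here the right-hand side mixes a data column with a model quantity, which is why the lhs/rhs split cannot simply be read as "data on the left, model on the right".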