
Posterior methodologies with Random Forests #319

Open
fradav opened this issue Jan 23, 2020 · 2 comments

fradav commented Jan 23, 2020

Summary:

I am currently testing a Python module wrapping https://github.com/diyabc/abcranger: posterior methodologies (model choice and parameter estimation) with Random Forests on a reference table.
(See the references.)

Description:

I would like to know the best way to integrate the posterior methodologies into the ELFI pipeline. It seems every inference method in ELFI is expected to provide an "iterate" method that consumes each new sample, but neither methodology has one: both need the whole reference table at once.
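To illustrate the mismatch, here is a minimal numpy sketch (not ELFI's or abcranger's actual API; the simulator and all names are hypothetical): the reference table can be built batch by batch, but the RF step only runs once, on the complete table.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_batch(batch_size):
    """Hypothetical simulator: draw parameters from a toy prior and
    produce two summary statistics per draw."""
    theta = rng.uniform(0.0, 10.0, size=batch_size)            # prior draws
    summaries = theta[:, None] + rng.normal(size=(batch_size, 2))
    return theta, summaries

# Accumulate the whole reference table batch by batch ...
thetas, stats = [], []
for _ in range(5):
    t, s = simulate_batch(100)
    thetas.append(t)
    stats.append(s)

reference_theta = np.concatenate(thetas)   # shape (500,)
reference_stats = np.vstack(stats)         # shape (500, 2)

# ... and only then hand the full table to the RF methodology
# (e.g. abcranger), which has no per-sample "iterate" step.
```

So batching works for *producing* the table, but the inference itself is a single one-shot call at the end.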

See the demos at:
https://github.com/diyabc/abcranger/blob/master/testpy/Model%20Choice%20Demo.ipynb
and
https://github.com/diyabc/abcranger/blob/master/testpy/Parameter%20Estimation%20Demo.ipynb

Note that the basic rejection sampler is more than enough with those methodologies (and the threshold parameter almost doesn't matter).

Regards,

hpesonen (Member) commented

Hi! Could you clarify a bit what you mean by posterior methodologies and their integration into the ELFI pipeline? E.g., would you like to implement RF-ABC within ELFI? In that case, iterate could still be used when producing the table in batches.

Note that if you don't care about the threshold for rejection ABC and only want to generate a reference table from the ELFI model, you can also set quantile = 1.0 in the sample method.
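The effect of the quantile parameter can be sketched in plain numpy (a toy stand-in, not ELFI's implementation): the quantile is applied to the simulated-vs-observed distances, and quantile = 1.0 accepts every simulation, i.e. returns the raw reference table.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy rejection step: distances between simulated and observed summaries.
distances = rng.exponential(size=1000)

def accept_by_quantile(distances, quantile):
    """Keep the fraction `quantile` of simulations with the smallest
    distance; quantile = 1.0 keeps the whole reference table."""
    threshold = np.quantile(distances, quantile)
    return distances <= threshold

mask_all = accept_by_quantile(distances, 1.0)   # every simulation accepted
mask_10 = accept_by_quantile(distances, 0.1)    # best 10 % only
```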


fradav commented Jan 24, 2020

Hi,

I'm working with J-M. Marin, and posterior RF methodologies like model choice and parameter estimation work directly on ABC reference tables, as described in:

  • ABC model choice (Pudlo et al. 2015)
  • ABC Bayesian parameter inference (Raynal et al. 2018)

By integration into ELFI, I originally meant implementing a new inference method, as documented there.

I am not sure about batch processing. RF-ABC prediction performance degrades a lot if you use only a small subset of the data. Nor do I see how to "accumulate" posterior results from successive batches other than by retraining a forest on all past batches, which of course defeats the purpose of batching. This is perhaps a use case for Mondrian forests (Lakshminarayanan, Roy, and Teh 2014); they are not classical Breiman RFs like the ones we use but a totally different beast, and there are a lot of caveats to them versus Breiman's (sensitivity to noise is one of them). Anyway, this is an interesting track for future work.
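The cost of the naive "retrain on all past batches" strategy can be made concrete with a small numpy sketch (the training stub is hypothetical; it just counts the rows it would have to process): each retraining runs over everything seen so far, so total work grows quadratically in the number of batches.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_forest(table):
    """Stand-in for RF training; returns how many rows it had to see."""
    return len(table)

batches = [rng.normal(size=(100, 3)) for _ in range(5)]

# Naive "accumulate and retrain" after every batch.
seen = np.empty((0, 3))
work = 0
for b in batches:
    seen = np.vstack([seen, b])
    work += train_forest(seen)

# work == 100 + 200 + 300 + 400 + 500 == 1500 rows processed,
# versus only 500 rows of actual data: the quadratic blow-up that
# defeats the purpose of batching.
```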

The threshold doesn't matter "much" with RF-ABC, but that doesn't mean we shouldn't have one, so I don't think quantile = 1.0 is recommended either (I'll double-check this with J-M. Marin).

References

Pudlo, Pierre, Jean-Michel Marin, Arnaud Estoup, Jean-Marie Cornuet, Mathieu Gautier, and Christian P Robert. 2015. “Reliable ABC Model Choice via Random Forests.” Bioinformatics 32 (6): 859–66.

Raynal, Louis, Jean-Michel Marin, Pierre Pudlo, Mathieu Ribatet,
Christian P Robert, and Arnaud Estoup. 2018. “ABC random forests for
Bayesian parameter inference.” Bioinformatics 35 (10): 1720–8.
https://doi.org/10.1093/bioinformatics/bty867.

Lakshminarayanan, Balaji, Daniel M Roy, and Yee Whye Teh. 2014. “Mondrian Forests: Efficient Online Random Forests.” In Advances in Neural Information Processing Systems, 3140–8.
