
Posterior methodologies with Random Forests #319

Open
fradav opened this issue Jan 23, 2020 · 2 comments

fradav commented Jan 23, 2020

Summary:

I am currently testing a Python module wrapping https://github.com/diyabc/abcranger: posterior methodologies (model choice and parameter estimation) with Random Forests on a reference table.
(See the references.)

Description:

I would like to know the best way to integrate the posterior methodologies into the ELFI pipeline. It seems every inference method in ELFI is expected to provide an "iterate" method that consumes each new sample, but neither methodology has one: both need the whole reference table at once.
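To illustrate the mismatch, here is a minimal numpy sketch (not ELFI's or abcranger's actual API; the simulator and all names are hypothetical): the reference table can be built batch by batch, but the RF step only runs once, on the complete table.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_batch(batch_size):
    """Hypothetical simulator: draw parameters from a toy prior and
    produce two summary statistics per draw."""
    theta = rng.uniform(0.0, 10.0, size=batch_size)            # prior draws
    summaries = theta[:, None] + rng.normal(size=(batch_size, 2))
    return theta, summaries

# Accumulate the whole reference table batch by batch ...
thetas, stats = [], []
for _ in range(5):
    t, s = simulate_batch(100)
    thetas.append(t)
    stats.append(s)

reference_theta = np.concatenate(thetas)   # shape (500,)
reference_stats = np.vstack(stats)         # shape (500, 2)

# ... and only then hand the full table to the RF methodology
# (e.g. abcranger), which has no per-sample "iterate" step.
```

So batching works for *producing* the table, but the inference itself is a single one-shot call at the end.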

See the demos at:
https://github.com/diyabc/abcranger/blob/master/testpy/Model%20Choice%20Demo.ipynb
and
https://github.com/diyabc/abcranger/blob/master/testpy/Parameter%20Estimation%20Demo.ipynb

Note that the basic rejection sampler is more than enough with those methodologies (and the threshold parameter almost doesn't matter).

Regards,

hpesonen (Member) commented

Hi! Could you clarify a bit what you mean by posterior methodologies and their integration into the ELFI pipeline? E.g., would you like to implement RF-ABC within ELFI? In that case, iterate could still be used when producing the table in batches.

Note that if you don't care about the threshold for rejection ABC and only want to generate a reference table from the ELFI model, you can also set quantile = 1.0 in the sample method.
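The effect of the quantile parameter can be sketched in plain numpy (a toy stand-in, not ELFI's implementation): the quantile is applied to the simulated-vs-observed distances, and quantile = 1.0 accepts every simulation, i.e. returns the raw reference table.

```python
import numpy as np

rng = np.random.default_rng(42)

# Toy rejection step: distances between simulated and observed summaries.
distances = rng.exponential(size=1000)

def accept_by_quantile(distances, quantile):
    """Keep the fraction `quantile` of simulations with the smallest
    distance; quantile = 1.0 keeps the whole reference table."""
    threshold = np.quantile(distances, quantile)
    return distances <= threshold

mask_all = accept_by_quantile(distances, 1.0)   # every simulation accepted
mask_10 = accept_by_quantile(distances, 0.1)    # best 10 % only
```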


fradav commented Jan 24, 2020

Hi,

I'm working with J-M. Marin, and posterior RF methodologies like model choice and parameter estimation work directly on ABC reference tables, as described in:

  • ABC model choice (Pudlo et al. 2015)
  • ABC Bayesian parameter inference (Raynal et al. 2018)

By integration into ELFI, I originally meant implementing a new inference method, as documented there.

I am not sure about batch processing. RF-ABC prediction performance degrades a lot if you use only a small subset of the data. Nor do I see how to "accumulate" posterior results from successive batches other than by retraining a forest on all past batches, which of course defeats the purpose of batching. This is perhaps a use case for Mondrian forests (Lakshminarayanan, Roy, and Teh 2014); they are not classical Breiman RFs like the ones we use but a totally different beast, and there are a lot of caveats to them versus Breiman's (sensitivity to noise is one of them). Anyway, this is an interesting track for future work.
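The cost of the naive "retrain on all past batches" strategy can be made concrete with a small numpy sketch (the training stub is hypothetical; it just counts the rows it would have to process): each retraining runs over everything seen so far, so total work grows quadratically in the number of batches.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_forest(table):
    """Stand-in for RF training; returns how many rows it had to see."""
    return len(table)

batches = [rng.normal(size=(100, 3)) for _ in range(5)]

# Naive "accumulate and retrain" after every batch.
seen = np.empty((0, 3))
work = 0
for b in batches:
    seen = np.vstack([seen, b])
    work += train_forest(seen)

# work == 100 + 200 + 300 + 400 + 500 == 1500 rows processed,
# versus only 500 rows of actual data: the quadratic blow-up that
# defeats the purpose of batching.
```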

The threshold doesn't matter "much" with RF-ABC, but that doesn't mean we shouldn't have one, so I don't think quantile = 1.0 is recommended either (I'll double-check this with J-M. Marin).

References

Pudlo, Pierre, Jean-Michel Marin, Arnaud Estoup, Jean-Marie Cornuet, Mathieu Gautier, and Christian P Robert. 2015. “Reliable ABC Model Choice via Random Forests.” Bioinformatics 32 (6): 859–66.

Raynal, Louis, Jean-Michel Marin, Pierre Pudlo, Mathieu Ribatet,
Christian P Robert, and Arnaud Estoup. 2018. “ABC random forests for
Bayesian parameter inference.” Bioinformatics 35 (10): 1720–8.
https://doi.org/10.1093/bioinformatics/bty867.

Lakshminarayanan, Balaji, Daniel M Roy, and Yee Whye Teh. 2014. “Mondrian Forests: Efficient Online Random Forests.” In Advances in Neural Information Processing Systems, 3140–8.
