-
Notifications
You must be signed in to change notification settings - Fork 3
/
README.Rmd
156 lines (106 loc) · 4.56 KB
/
README.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
---
output:
md_document:
variant: markdown_github
---
<!-- README.md is generated from README.Rmd. Please edit that file -->
<!-- badges: start -->
[![R-CMD-check](https://github.com/stan-dev/posteriordb-r/actions/workflows/check-release.yaml/badge.svg)](https://github.com/stan-dev/posteriordb-r/actions/workflows/check-release.yaml) [![Codecov test coverage](https://codecov.io/gh/stan-dev/posteriordb-r/branch/main/graph/badge.svg)](https://codecov.io/gh/stan-dev/posteriordb-r?branch=main)
<!-- badges: end -->
# `posteriordb`: an R package to work with `posteriordb`
This repository contains the R package to efficiently work with the [posteriordb](https://github.com/stan-dev/posteriordb) repository. The R package includes convenience functions to access data, model code and information for individual posteriors, models, data and draws.
## Installation
To install only the R package and then access the posteriors remotely, install the package from GitHub using the `remotes` package.
```{r, eval = FALSE}
remotes::install_github("stan-dev/posteriordb-r")
```
To load the package, just run.
```{r}
library(posteriordb)
```
## Connect to the posterior database
First, we create the posterior database connection to use. Here we want to use the database locally. We assume the `posteriordb` repo has been cloned and is accessible locally.
```{r, eval=FALSE}
my_pdb <- pdb_local()
```
```{r, eval=TRUE, echo=FALSE}
# Store the PDB_PATH in .env
dotenv::load_dot_env()
my_pdb <- pdb_local(Sys.getenv("PDB_PATH"))
```
The above code requires that your working directory be the cloned repository's main folder. Otherwise, we can use the `path` argument in `pdb_local()` to point to the local posterior database. We can also set the environment variable `PBD_PATH` to handle the connection. For more details, see `?pdb`.
The most straightforward approach is to use the GitHub repository directly to access the database.
```{r, eval=FALSE}
my_pdb <- pdb_github()
```
When you have a connection to the posterior database of choice, you can access the data, models etc., using the same functionality.
## Contributing content using R
If you want to contribute to a posteriordb, see the vignette [vignettes/contributing](https://htmlpreview.github.io/?https://github.com/stan-dev/posteriordb-r/blob/main/vignettes/contributing.html).
## Access content
To list the posteriors in the database, use `posterior_names()`.
```{r}
pos <- posterior_names(my_pdb)
head(pos)
```
In the same fashion, we can list data and models included in the database as
```{r}
mn <- model_names(my_pdb)
head(mn)
dn <- data_names(my_pdb)
head(dn)
```
We can also get all information on each posterior as a table with
```{r}
pos <- posteriors_tbl_df(my_pdb)
head(pos)
```
The posterior's name is made up of the data and model fitted
to the data. Together, these two uniquely define a posterior distribution.
To access a posterior object, we can use the posterior name.
```{r}
po <- posterior("eight_schools-eight_schools_centered", my_pdb)
```
From the posterior object, we can access data, model code (i.e., Stan code in this case) and other useful information.
```{r}
dat <- pdb_data(po)
dat
code <- stan_code(po)
code
```
We can also access the paths to data after they have been unzipped and copied to the cache directory set in `pdb` (the R temp directory by default).
```{r}
dfp <- data_file_path(po)
dfp
scfp <- stan_code_file_path(po)
scfp
```
We can also access information regarding the model and the data used to compute the posterior.
```{r}
data_info(po)
model_info(po)
```
Note that the references reference BibTeX items found in `content/references/references.bib`.
We can access most of the posterior information as a `tbl_df` using
```{r}
tbl <- posteriors_tbl_df(my_pdb)
head(tbl)
```
In addition, we can also access a list of posteriors with `filter_posteriors()`. The filtering function follows dplyr filter semantics.
```{r}
pos <- filter_posteriors(pdb = my_pdb, data_name == "eight_schools")
pos
```
To access reference posterior draws, we use `reference_posterior_draws()`.
```{r}
rpd <- reference_posterior_draws(po)
```
The function `reference_posterior_draws()` returns a posterior `draws_list` object that can be summarized and transformed using the `posterior` package.
```{r}
posterior::summarize_draws(rpd)
```
To access information on the reference posterior we can use `reference_posterior_draws_info()` or use `info()` on the reference posterior. The posterior reference draws return information on how the reference posterior was computed.
```{r}
rpi <- reference_posterior_draws_info(po)
rpi
info(rpd)
```