---
title: "Appendix S2"
output:
  pdf_document:
    fig_caption: false
header-includes:
  - \usepackage{caption}
  - \usepackage{hyperref}
  - \usepackage{svg}
---
\captionsetup[table]{labelformat=empty}
```{r, include=FALSE}
options(tinytex.engine_args = '-shell-escape')
```
# Matthew T. Farr, David S. Green, Kay E. Holekamp, and Elise F. Zipkin
# Integrating distance sampling and presence-only data to estimate species abundance
# Ecology
\href{https://doi.org/10.5281/zenodo.3981242}{\includesvg{DOI}}
# Simulation study
This appendix describes the parameter values used in the simulation study.
### Section S1. Parameter and covariate values:
The intercept parameter for the biological process was drawn at random: $\lambda_0 \sim \mathrm{Uniform}(0.05, 1)$. This equated to a range of 500 to 10,000 individuals in region $A$ (assuming no covariate effects). A single effect parameter on the biological process was also drawn from a uniform distribution: $\beta_1 \sim \mathrm{Uniform}(-1.25, 1.25)$. The covariate values at each pixel across region $A$ for the biological process came from a correlated multivariate normal distribution, using Euclidean distances between pixels to define the variance-covariance matrix (Kéry & Royle 2016, p. 534). We standardized the covariate values to have a mean of 0 and a standard deviation of 1. Expected density per pixel ranged between 0 and 85, depending on the values of the effect parameter and the covariate.

The intercept parameter for the observation bias of opportunistic sampling (presence-only data) was fixed in advance at $p_0 = 0.5$ for a high quantity of presence-only data or $p_0 = 0.1$ for a low quantity. The covariate effect for opportunistic sampling was drawn from a uniform distribution: $\alpha_1 \sim \mathrm{Uniform}(0, 2)$. The covariate values for the observation process of opportunistic sampling were also assumed to come from a multivariate normal distribution (Dorazio 2014). We created separate covariates for high and low quantities of presence-only data by adjusting the variance of the multivariate normal distribution, which translated to higher or lower sampling intensity.

The scale parameter for the distance sampling observation process was drawn from a uniform distribution: $\sigma \sim \mathrm{Uniform}(0.75, 1.25)$, which led to a detection probability ranging from approximately 19 to 31%.
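The draws described above can be sketched in base R. This is an illustrative sketch only, not the simulation code used for the study: the 20 x 20 grid, the exponential decay of covariance with distance, the range parameter `phi`, the log-linear form of the intensity, and the half-normal detection function with transect half-width `B = 5` are all assumptions introduced here. The half-width in particular is chosen only because it gives detection probabilities close to the quoted range; the survey geometry is not stated in this appendix.

```r
set.seed(1)

# Parameter draws, using the uniform ranges stated in Section S1
lambda0 <- runif(1, 0.05, 1)      # intercept of the biological process
beta1   <- runif(1, -1.25, 1.25)  # covariate effect on the biological process
alpha1  <- runif(1, 0, 2)         # covariate effect on opportunistic sampling
sigma   <- runif(1, 0.75, 1.25)   # scale parameter for distance sampling

# Hypothetical 20 x 20 grid of pixels (grid size is an assumption)
side   <- 20
coords <- expand.grid(x = seq_len(side), y = seq_len(side))
D      <- as.matrix(dist(coords)) # Euclidean distances between pixels

# Assumed exponential covariance with a hypothetical range parameter phi
phi   <- 5
Sigma <- exp(-D / phi)

# One multivariate normal draw via the Cholesky factor of Sigma
x_raw <- as.vector(t(chol(Sigma)) %*% rnorm(nrow(coords)))

# Standardize to mean 0 and standard deviation 1
x <- (x_raw - mean(x_raw)) / sd(x_raw)

# Expected density per pixel, assuming a log-linear intensity
lambda     <- lambda0 * exp(beta1 * x)
expected_N <- sum(lambda)

# Average detection probability for an assumed half-normal detection
# function on a line transect with hypothetical half-width B
p_det <- function(sigma, B = 5) {
  sigma * sqrt(2 * pi) * (pnorm(B / sigma) - 0.5) / B
}
p_range <- p_det(c(0.75, 1.25))
```

Under these assumptions, `p_range` is roughly 0.19 and 0.31 at the two ends of the range for $\sigma$, in line with the detection probabilities quoted above.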
```{r, include = FALSE, message = FALSE}
#Libraries
library(abind)
library(dplyr)
library(knitr)
library(kableExtra)
#List of all filenames
filenames <- list.files(path = "~/IDM/DataAnalysis/Simulations/SimulationOutput", pattern = "output", full.names = TRUE)
#Load first file
load(filenames[1])
#Initialize vector for all output
Out <- output$Out
Out2 <- output$Out2
#Time vector
Time <- output$Time
#Harvest parameters from files and remove model runs with Rhat > 1.1
for(i in 2:length(filenames)){
  load(filenames[i])
  for(j in 1:length(output$Out[,1,1])){
    if(max(output$Out[j,c(18,20,22,24,28:31),2:6], na.rm = TRUE) < 1.1){
    #if(max(output$Out[j,c(18:25,27:31),2:6], na.rm = TRUE) < 1.1){
      Out <- abind(Out, output$Out[j,,], along = 1)
      Out2 <- rbind(Out2, output$Out2[j,])
    }
  }
  Time <- c(Time, output$Time)
}
#Remove first sample if Rhat >= 1.1 (same convergence criterion as above)
if(max(Out[1,c(18,20,22,24,28:31),2:6], na.rm = TRUE) >= 1.1){
  Out <- Out[-1,,]
  Out2 <- Out2[-1,]
}
#Randomly sample 1000 simulation runs (without replacement)
set.seed(123)
iter <- sort(sample(dim(Out)[1], 1000, replace = FALSE))
Out <- Out[iter,,]
name <- c("0", "5", "10", "15", "20")
name <- as.character(name)
name <- factor(name, levels=unique(name))
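#True parameter values (first column of Out), repeated for the 10 scenarios
truth <- Out[,rep(1,10),]
#True p0 is fixed by scenario: 0.5 (high presence-only), 0.1 (low presence-only)
truth[,1:5,5] <- 0.5
truth[,6:10,5] <- 0.1
#Quartiles of bias (estimate - truth) by scenario and parameter
y75 <- apply(Out[,c(2,3,5,7,9,12:16),] - truth, MARGIN = c(2,3), FUN = quantile, probs = 0.75, na.rm = TRUE)
y50 <- apply(Out[,c(2,3,5,7,9,12:16),] - truth, MARGIN = c(2,3), FUN = quantile, probs = 0.5, na.rm = TRUE)
y25 <- apply(Out[,c(2,3,5,7,9,12:16),] - truth, MARGIN = c(2,3), FUN = quantile, probs = 0.25, na.rm = TRUE)
#Whiskers: 1.5 times the interquartile range beyond the quartiles
ymax <- ((y75 - y25) * 1.5) + y75
ymin <- y25 - ((y75 - y25) * 1.5)
#Stack into a scenario x parameter x statistic array
df <- abind(ymin,y25,y50,y75,ymax, along = 3)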
```
### Section S2. Simulation results:
Below are the bias results (estimate minus truth) for each parameter. Each table reports the median (50), the interquartile range (25 and 75), and limits extending 1.5 times the interquartile range beyond the quartiles (min and max). Each row represents a different scenario for the various data quantities. Scenarios 1-5 have high quantities of presence-only data at 0, 5, 10, 15, and 20% distance sampling coverage. Scenarios 6-10 have low quantities of presence-only data at the same levels of distance sampling coverage.
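As a minimal sketch of how one such table is built, the following uses simulated stand-in values rather than the study output. The objects `est`, `truth_toy`, and `tab`, the number of runs, and the noise level are all hypothetical and chosen only for illustration.

```r
set.seed(42)

n_runs <- 200  # hypothetical number of simulation runs
n_scen <- 10   # 10 scenarios: 2 presence-only levels x 5 coverage levels

# Toy truth (shared across scenarios) and toy estimates with added noise
truth_toy <- matrix(runif(n_runs, 0.05, 1), nrow = n_runs, ncol = n_scen)
est       <- truth_toy + matrix(rnorm(n_runs * n_scen, sd = 0.1),
                                nrow = n_runs, ncol = n_scen)

# Bias is estimate minus truth
bias <- est - truth_toy

# Quartiles of bias per scenario (rows: 25%, 50%, 75%)
q   <- apply(bias, 2, quantile, probs = c(0.25, 0.5, 0.75), na.rm = TRUE)
iqr <- q[3, ] - q[1, ]

# One row per scenario: lower limit, quartiles, upper limit
tab <- cbind(min  = q[1, ] - 1.5 * iqr,
             "25" = q[1, ],
             "50" = q[2, ],
             "75" = q[3, ],
             max  = q[3, ] + 1.5 * iqr)
rownames(tab) <- paste0(rep(c("High", "Low"), each = 5), " PO, ",
                        rep(c(0, 5, 10, 15, 20), times = 2), "% DS")

round(tab, 2)
```

The `min` and `max` columns here are boxplot-style whisker limits computed from the quartiles, not the smallest and largest observed bias.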
\pagebreak
Table S1. Abundance ($N$)
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,1,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
Table S2. Parameter $\lambda_0$
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,2,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
Table S3. Parameter $\beta_1$
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,3,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
\pagebreak
Table S4. Parameter $\sigma$
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,4,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
Table S5. Parameter $p_0$
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,5,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
Table S6. Parameter $\alpha_1$
```{r, echo = FALSE, results = 'asis'}
tmp <- df[,6,]
colnames(tmp) <- c("min", "25", "50", "75", "max")
rownames(tmp) <- c("High PO, 0% DS", "High PO, 5% DS", "High PO, 10% DS", "High PO, 15% DS", "High PO, 20% DS",
"Low PO, 0% DS", "Low PO, 5% DS", "Low PO, 10% DS", "Low PO, 15% DS", "Low PO, 20% DS")
kable(tmp, digits = 2, longtable = TRUE, booktabs = TRUE, linesep = "") %>%
kable_styling(position = "left")
```
##### Literature Cited
Dorazio, R.M. (2014) Accounting for imperfect detection and survey bias in statistical analysis of presence-only data. Global Ecology and Biogeography, 23, 1472–1484.
Kéry, M. & Royle, J.A. (2016) Applied hierarchical modeling in ecology: Analysis of distribution, abundance and species richness in R and BUGS (volume 1 – prelude and static models), Elsevier, Amsterdam.