# (PART) Lattice Data {-}
# Spatial Models on Lattices
Instead of the continuous index set considered in Part 1, this part is concerned with the situation where the index set $D$ is a countable collection of spatial sites at which data are observed. Such a collection $D$ of sites is called a lattice, which is then supplemented with neighborhood information.
## Lattices
> **Lattices** \
A lattice is a countable collection of (spatial) sites, either spatially regular or irregular.
Take the Sudden Infant Death Syndrome (SIDS) data for North Carolina, 1974-1978, as an example. The lattice form of the data is drawn in Figure \@ref(fig:lattice-example).
```{r, warning=FALSE, message=FALSE, echo=FALSE}
library(spdep)
library(sf)
# County polygons for the North Carolina SIDS data shipped with spData.
nc <- st_read(system.file("shapes/sids.shp", package="spData")[1], quiet=TRUE)
st_crs(nc) <- "+proj=longlat +datum=NAD27"
row.names(nc) <- as.character(nc$FIPSNO)
```
```{r, warning=FALSE, message=FALSE, echo=FALSE}
# Neighbor list: counties whose seats are within 30 miles are neighbors.
gal_file <- system.file("weights/ncCC89.gal", package="spData")[1]
ncCC89 <- read.gal(gal_file, region.id=nc$FIPSNO)
```
```{r, lattice-example, fig.cap='Lattice Figure of Sudden Infant Death Syndrome (SIDS) in North Carolina, 1974-1978', warning=FALSE, message=FALSE, echo=FALSE}
plot(st_geometry(nc), border="grey")
# Draw the neighbor links between county-seat centroids.
plot(ncCC89, st_centroid(st_geometry(nc), of_largest_polygon), add=TRUE, col="blue")
# Highlight one county (red) and its four neighbors (bold blue).
plot(st_centroid(st_geometry(nc), of_largest_polygon)[c(71, 85, 67, 47)], add=TRUE, col="blue", lwd=4)
plot(st_centroid(st_geometry(nc), of_largest_polygon)[[70]], add=TRUE, col="red", lwd=6)
```
> **Neighborhoods** \
A site $k$ is defined to be a neighbor of site $i$ if the conditional distribution of $Z\left(s_{i}\right)$, given all other site values, depends functionally on $z\left(s_{k}\right)$, for $k \neq i$. Also define
\begin{equation}
N_{i} \equiv\{k: k \text { is a neighbor of } i\}
(\#eq:neighborhoods)
\end{equation}
to be the neighborhood set of site $i$.
Neighborhood structure is important for lattice data. In the SIDS example, neighborhood relationships are drawn as blue lines in Figure \@ref(fig:lattice-example). Here, counties whose seats are closer than 30 miles are deemed neighbors. For instance, the four bold blue points around the bold red point are that county's neighbors.
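The neighbor relationships are stored in the `nb` object `ncCC89` created above. A minimal sketch for inspecting them (not run; the index `70` matches the red point in the figure):

```{r, eval=FALSE}
summary(ncCC89)                          # distribution of neighbor counts
card(ncCC89)[70]                         # number of neighbors of the 70th county
ncCC89[[70]]                             # indices of its neighboring counties
attr(ncCC89, "region.id")[ncCC89[[70]]]  # their FIPS identifiers
```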
## Spatial Models for Lattice Data
In the spatial domain, the conditional approach (section \@ref(conditional-spatial-models)) and the simultaneous approach (section \@ref(simultaneous-spatial-models)) result in different models.
### Simultaneous Spatial Models
#### General Form
For simplicity, consider the square lattice in the plane $\mathbb{R}^2$.
\begin{equation}
D=\{s=(u, v): u=\ldots,-2,-1,0,1,2, \ldots ; v=\ldots,-2,-1,0,1,2, \ldots\}
(\#eq:lattice)
\end{equation}
The joint approach is to assume
\begin{equation}
\operatorname{Pr}(\mathbf{z})=\prod_{(u, v) \in D} Q_{u v}(z(u, v) ; z(u-1, v), z(u+1, v), z(u, v-1), z(u, v+1)).
(\#eq:joint)
\end{equation}
Equation \@ref(eq:joint) says that the probability of the value at a specific location is determined by the values at its neighbors.
The simultaneous spatial model is defined as
\begin{equation}
\phi\left(T_{1}, T_{2}\right) Z(u, v)=\epsilon(u, v)
(\#eq:gaussian-sg)
\end{equation}
where
\begin{equation}
\left\{\begin{aligned}
& T_{1} Z(u, v)=Z(u+1, v) \\
& T_{1}^{-1} Z(u, v)=Z(u-1, v) \\
& T_{2} Z(u, v)=Z(u, v+1) \\
& T_{2}^{-1} Z(u, v)=Z(u, v-1)
\end{aligned}
\right.
(\#eq:details-gaussian-sg)
\end{equation}
and
\begin{equation}
\phi\left(T_{1}, T_{2}\right)=\sum_{i} \sum_{j} a_{i j} T_{1}^{i} T_{2}^{j}
(\#eq:details-gaussian-sg2)
\end{equation}
Notice that $\phi(c_1, c_2)\neq 0$ for all complex numbers with $|c_1| = |c_2| = 1$ must hold to ensure a stationary process.
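As an illustration, for the first-order model $\phi(T_{1}, T_{2})=1-a\left(T_{1}+T_{1}^{-1}+T_{2}+T_{2}^{-1}\right)$ the condition fails once $|a| \geq 1/4$. The sketch below (base R, with an assumed coefficient `a`) checks the condition numerically on a grid over the unit torus:

```{r, eval=FALSE}
# Check phi(c1, c2) != 0 on |c1| = |c2| = 1 for the first-order model
# phi(T1, T2) = 1 - a (T1 + T1^{-1} + T2 + T2^{-1}); `a` is an assumed value.
a <- 0.2
theta <- seq(0, 2 * pi, length.out = 200)
grid <- expand.grid(t1 = theta, t2 = theta)
# On the unit circle c + 1/c = 2 cos(theta), so phi is real here.
phi <- 1 - a * (2 * cos(grid$t1) + 2 * cos(grid$t2))
min(abs(phi)) > 0  # TRUE is consistent with stationarity; fails for |a| >= 1/4
```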
#### Simultaneously Specified Spatial Gaussian Models
For this section, assume that $\{Z(s): s \in D\}=\left\{Z\left(s_{i}\right): i=1, \ldots, n\right\}$ is defined on a finite subset of the integer lattice in the plane.
Suppose $\boldsymbol{\epsilon}=\left(\epsilon\left(s_{1}\right), \ldots, \epsilon\left(s_{n}\right)\right)^{\prime} \sim \operatorname{Gau}(\mathbf{0}, \Lambda)$, where $\Lambda=\sigma^{2} I$, and let $B=\left(b_{i j}\right)$ be a matrix of spatial-dependence parameters, where $b_{ij} = 0$ means $Z(s_i)$ does not depend directly on $Z(s_j)$. If $I - B$ is invertible, then the simultaneously specified spatial Gaussian (SG) model is written as
\begin{equation}
(I-B)(\mathbf{Z}-\mu)=\epsilon
(\#eq:sg-form-matrix)
\end{equation}
Equivalently,
\begin{equation}
Z\left(s_{i}\right)=\mu_{i}+\sum_{j=1}^{n} b_{i j}\left(Z\left(s_{j}\right)-\mu_{j}\right)+\epsilon_{i}
(\#eq:sg-form-long)
\end{equation}
- From equation \@ref(eq:sg-form-matrix), we have $\mathbf{Z} \sim \operatorname{Gau}\left(\boldsymbol{\mu},(I-B)^{-1} \Lambda\left(I-B^{\top}\right)^{-1}\right)$.
- Equation \@ref(eq:sg-form-long) is a spatial analogue of the autoregressive model from time series.
- Note that $\operatorname{Cov}(\boldsymbol{\epsilon}, \mathbf{Z})=\operatorname{Cov}\left(\boldsymbol{\epsilon},(I-B)^{-1} \boldsymbol{\epsilon}\right)=\Lambda\left(I-B^{\top}\right)^{-1}$ is not diagonal, which means the errors are not independent of the autoregressive variables; this casts doubt on the consistency of least-squares estimators.
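A minimal base-R sketch of \@ref(eq:sg-form-matrix): simulate $\mathbf{Z}$ on a small lattice by solving $(I-B)(\mathbf{Z}-\boldsymbol{\mu})=\boldsymbol{\epsilon}$ (the matrix `B` below is an assumed toy example, not part of the SIDS analysis):

```{r, eval=FALSE}
# Simulate an SG model: (I - B)(Z - mu) = eps  =>  Z = mu + (I - B)^{-1} eps.
set.seed(1)
n <- 5
B <- matrix(0, n, n)
B[abs(row(B) - col(B)) == 1] <- 0.3  # assumed dependence between adjacent sites only
mu <- rep(0, n)
eps <- rnorm(n)                      # Lambda = sigma^2 I with sigma = 1
Z <- mu + solve(diag(n) - B, eps)
# Implied covariance: (I - B)^{-1} Lambda (I - B')^{-1}
Sigma_SG <- solve(diag(n) - B) %*% solve(diag(n) - t(B))
```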
### Conditional Spatial Models
#### General Form
The spatial conditional approach assumes
\begin{equation}
\begin{aligned}
&\mathbf{P}(z(u, v) \mid\{z(k, l):(k, l) \neq(u, v)\}) \\
\quad=&\mathbf{P}(z(u, v) \mid z(u-1, v), z(u+1, v), z(u, v-1), z(u, v+1)), \text { for all }(u, v)^{\prime} \in D.
\end{aligned}
(\#eq:conditional-lattice)
\end{equation}
The conditional approach also says that the probability of the value at a specific location is determined by the values at its neighbors, but in a different way from the simultaneous approach of section \@ref(simultaneous-spatial-models).
#### Conditionally Specified Spatial Gaussian Models
For Gaussian data defined in section \@ref(simultaneously-specified-spatial-gaussian-models), the conditional model can be written as
\begin{equation}
Z\left(s_{i}\right) \mid\left\{Z\left(s_{j}\right): j \neq i\right\} \sim \operatorname{Gau}\left(\theta_{i}\left(\left\{Z\left(s_{j}\right): j \neq i\right\}\right), \tau_{i}^{2}\right)
(\#eq:cg-model)
\end{equation}
Additionally, we suppose "pairwise-only dependence" between sites. Therefore we can write $\theta_i$ as a linear function
\begin{equation}
\theta_{i}\left(\left\{Z\left(s_{j}\right): j \neq i\right\}\right)=\mu_{i}+\sum_{j=1}^{n} c_{i j}\left(z\left(s_{j}\right)-\mu_{j}\right)
(\#eq:cg-pairwise)
\end{equation}
where $c_{i j} \tau_{j}^{2}=c_{j i} \tau_{i}^{2}$ and $c_{i i}=0$; $c_{i k}=0$ means there is no direct dependence between sites $s_i$ and $s_k$. Then we have
\begin{equation}
\mathbf{Z} \sim \operatorname{Gau}\left(\mu,(I-C)^{-1} M\right)
(\#eq:cg-z)
\end{equation}
where $M=\operatorname{diag}\left(\tau_{1}^{2}, \ldots, \tau_{n}^{2}\right)$.
If we define $\boldsymbol{\nu}:=(I-C)(\mathbf{Z}-\boldsymbol{\mu})$, the conditionally specified spatial Gaussian (CG) model can also be written as
\begin{equation}
Z\left(s_{i}\right)=\mu_{i}+\sum_{j=1}^{n} c_{i j}\left(Z\left(s_{j}\right)-\mu_{j}\right)+\nu_{i}
(\#eq:cg-form-long)
\end{equation}
#### Comparison
- Equation \@ref(eq:cg-form-long) and equation \@ref(eq:sg-form-long) are directly comparable.
- The SG and CG models are equivalent if $(I-C)^{-1} M=(I-B)^{-1} \Lambda\left(I-B^{\top}\right)^{-1}$.
- Any SG model can be represented as a CG model, but not necessarily vice versa.
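The CG covariance can be constructed explicitly; the sketch below builds $(I-C)^{-1}M$ from an assumed symmetric `C` (with constant $\tau_i^2$, the condition $c_{ij}\tau_j^2 = c_{ji}\tau_i^2$ reduces to symmetry of $C$) and checks that it is a valid covariance matrix:

```{r, eval=FALSE}
# Build the CG covariance (I - C)^{-1} M for an assumed C and M = tau^2 I.
n <- 5
C <- matrix(0, n, n)
C[abs(row(C) - col(C)) == 1] <- 0.3   # assumed toy dependence, symmetric
M <- diag(n)                          # tau_i^2 = 1 for all i
Sigma_CG <- solve(diag(n) - C) %*% M
isSymmetric(Sigma_CG)                 # TRUE (up to rounding)
all(eigen(Sigma_CG, symmetric = TRUE, only.values = TRUE)$values > 0)  # positive definite
```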
## Markov Random Fields
### Preparation
Before introducing Markov random fields, some concepts and conditions need to be specified.
> **Positivity Condition** \
Suppose that at each lattice node we observe a discrete random variable $Z(s_i)$ (continuous random variables can be treated similarly). Define the domain of $Z(s_i)$ as $\zeta_{i}=\left\{z\left(s_{i}\right): \mathbf{P}\left(z\left(s_{i}\right)\right)>0\right\}$, so that $\boldsymbol{\zeta}=\left\{\mathbf{z}=\left(z\left(s_{1}\right), \ldots, z\left(s_{n}\right)\right): \mathbf{P}(\mathbf{z})>0\right\}$ is the domain of $\left(Z(s_1), \ldots, Z(s_n)\right)$. The positivity condition says that
\begin{equation}
\zeta=\zeta_{1} \times \cdots \times \zeta_{n}.
(\#eq:positivity-condition)
\end{equation}
In other words, the positivity condition requires the support of the joint distribution to be the full product of the coordinate supports.
::: {.theorem #factorization-theorem}
**[Factorization Theorem]** \
Suppose the variables $\left\{\mathrm{Z}\left(s_{i}\right): i=1, \ldots, n\right\}$ have joint probability mass function $\mathbf{P}(\mathbf{z})$, whose support $\zeta$ satisfies the positivity condition. Then,
\begin{equation}
\frac{\mathbf{P}(\mathbf{z})}{\mathbf{P}(\mathbf{y})}=\prod_{i=1}^{n} \frac{\mathbf{P}\left(z\left(s_{i}\right) \mid z\left(s_{1}\right), \cdots, z\left(s_{i-1}\right), y\left(s_{i+1}\right), \cdots, y\left(s_{n}\right)\right)}{\mathbf{P}\left(y\left(s_{i}\right) \mid z\left(s_{1}\right), \cdots, z\left(s_{i-1}\right), y\left(s_{i+1}\right), \cdots, y\left(s_{n}\right)\right)}
(\#eq:factorization-theorem)
\end{equation}
where $\mathbf{z}=\left(z\left(s_{1}\right), \cdots, z\left(s_{n}\right)\right), \mathbf{y}=\left(y\left(s_{1}\right), \ldots, y\left(s_{n}\right)\right) \in \boldsymbol{\zeta} .$
:::
There are two ways to build a model. If we model the joint distribution directly, the conditional distributions are easily derived. If, however, we start from the conditional distributions, the implied joint distribution may not be valid. For this reason, the Factorization Theorem \@ref(thm:factorization-theorem) is used to rule out invalid joint distributions.
- The positivity condition and the Factorization Theorem also work for continuous variables.
- The ordering of the variables $\left(Z\left(s_{1}\right), \cdots, Z\left(s_{n}\right)\right)$ does not affect the left-hand side of equation \@ref(eq:factorization-theorem), but does affect the right-hand side.
> **Neighbors** \
The formal definition of neighbors was already given in \@ref(eq:neighborhoods) in section \@ref(lattices).
> **Clique** \
A clique is defined to be a set of sites that consists either of a single site or of sites that
are all neighbors of each other.
A clique can be understood as a complete subgraph of the (undirected) neighborhood graph.
### Markov Random Field
With this graph terminology in hand, the Markov random field can now be introduced.
> **Markov Random Field** \
Any probability measure whose conditional distributions define a neighborhood structure $\{N_{i}: i=1, \ldots, n\}$ is defined to be a Markov random field.
Another important concept for Markov random fields is the negpotential function, which contains the same information as $\mathbf{P}(\mathbf{z})$.
> **Negpotential Function** \
The Negpotential Function $Q$ is defined as
\begin{equation}
Q(\mathbf{z})=\log \left\{\frac{\mathbf{P}(\mathbf{z})}{\mathbf{P}(\mathbf{0})}\right\}, \quad \mathbf{z} \in \boldsymbol{\zeta}
(\#eq:neg-func)
\end{equation}
which is also called a log-likelihood ratio.
Knowledge of the negpotential function is equivalent to knowledge of $\mathbf{P}(\mathbf{z})$, because
\begin{equation}
\mathbf{P}(\mathbf{z})=\frac{\exp (Q(\mathbf{z}))}{\sum_{\mathbf{y} \in \zeta} \exp (Q(\mathbf{y}))}
(\#eq:qp)
\end{equation}
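For a small discrete field, the normalization in \@ref(eq:qp) can be computed by brute force. The sketch below assumes an arbitrary toy negpotential `Q` on three binary sites:

```{r, eval=FALSE}
# Recover P(z) from the negpotential function by enumerating a small support.
Q <- function(z) sum(0.5 * z) + 0.8 * (z[1] * z[2] + z[2] * z[3])  # assumed toy Q, Q(0) = 0
zeta <- as.matrix(expand.grid(z1 = 0:1, z2 = 0:1, z3 = 0:1))  # full support (positivity)
q <- apply(zeta, 1, Q)
p <- exp(q) / sum(exp(q))  # P(z) = exp(Q(z)) / sum_y exp(Q(y))
sum(p)                     # probabilities sum to 1
```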
::: {.theorem #negpotential-function}
**[Properties of the Negpotential Function $Q$]** \
\begin{equation}
\exp \left(Q(\mathbf{z})-Q\left(\mathbf{z}_{i}\right)\right)=\frac{\mathbf{P}(\mathbf{z})}{\mathbf{P}\left(\mathbf{z}_{i}\right)}=\frac{\mathbf{P}\left(z\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\}\right)}{\mathbf{P}\left(0\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\}\right)}
(\#eq:nf1)
\end{equation}
where $\mathbf{z}_{i}=\left(z\left(s_{1}\right), \cdots, z\left(s_{i-1}\right), 0, z\left(s_{i+1}\right), \ldots, z\left(s_{n}\right)\right)^{\prime}$ and $0\left(s_{i}\right)$ denotes the event $Z\left(s_{i}\right)=0$.
$Q$ can be expanded uniquely on $\zeta$ as
\begin{equation}
\begin{aligned}
Q(\mathbf{z})=& \sum_{1 \leq i \leq n} z\left(s_{i}\right) G_{i}\left(z\left(s_{i}\right)\right)+\sum_{1 \leq i<j \leq n} z\left(s_{i}\right) z\left(s_{j}\right) G_{i j}\left(z\left(s_{i}\right), z\left(s_{j}\right)\right)+\cdots \\
&+z\left(s_{1}\right) \cdots z\left(s_{n}\right) G_{1 \cdots n}\left(z\left(s_{1}\right), \cdots, z\left(s_{n}\right)\right), \quad \mathbf{z} \in \boldsymbol{\zeta} .
\end{aligned}
(\#eq:nf2)
\end{equation}
:::
- Theorem \@ref(thm:negpotential-function) implies that the expansion of $Q(\mathbf{z})$ is actually made up of conditional probabilities.
- The pairwise interaction term in $Q(\mathbf{z})$ in \@ref(eq:nf2) is:
\begin{equation}
\begin{aligned}
z\left(s_{i}\right) z\left(s_{j}\right) & G_{i j}\left(z\left(s_{i}\right), z\left(s_{j}\right)\right) \\
=& Q\left(0, \cdots, 0, z\left(s_{i}\right), 0, \cdots, 0, z\left(s_{j}\right), 0, \cdots, 0\right)-Q\left(0, \cdots, 0, z\left(s_{j}\right), 0, \cdots, 0\right) \\
&+Q(0, \cdots, 0)-Q\left(0, \cdots, 0, z\left(s_{i}\right), 0, \cdots, 0\right) \\
=& \log \left[\frac{\mathbf{P}\left(z\left(s_{i}\right) \mid z\left(s_{j}\right),\left\{0\left(s_{k}\right): k \neq i, j\right\}\right)}{\mathbf{P}\left(0\left(s_{i}\right) \mid z\left(s_{j}\right),\left\{0\left(s_{k}\right): k \neq i, j\right\}\right)} \cdot \frac{\mathbf{P}\left(0\left(s_{i}\right) \mid\left\{0\left(s_{k}\right): k \neq i\right\}\right)}{\mathbf{P}\left(z\left(s_{i}\right) \mid\left\{0\left(s_{k}\right): k \neq i\right\}\right)}\right]
\end{aligned}
(\#eq:pair-dependence)
\end{equation}
Recall that if we start from the conditional distributions, the implied joint distribution may not be valid; under suitable consistency conditions, it is. One more concept needs to be introduced.
> **Well-Defined G Functions** \
The consistency conditions on the conditional probabilities (needed for reconstruction of a joint probability) can then be expressed as the conditions needed to yield well-defined $G$ functions.
### Hammersley-Clifford Theorem
::: {.theorem #hammersley-clifford-theorem}
**[Hammersley-Clifford Theorem]** \
Suppose that $\mathbf{Z}$ is distributed according to a Markov random field on $\boldsymbol{\zeta}$ that satisfies the positivity condition. Then, the negpotential function $Q(.)$ given by
\begin{equation}
\begin{aligned}
Q(\mathbf{z})=& \sum_{1 \leq i \leq n} z\left(\mathbf{s}_{i}\right) G_{i}\left(z\left(\mathbf{s}_{i}\right)\right)+\sum_{1 \leq i<j \leq n} z\left(\mathbf{s}_{i}\right) z\left(\mathbf{s}_{j}\right) G_{i j}\left(z\left(\mathbf{s}_{i}\right), z\left(\mathbf{s}_{j}\right)\right) \\
&+\sum_{1 \leq i<j<k \leq n} z\left(\mathbf{s}_{i}\right) z\left(\mathbf{s}_{j}\right) z\left(\mathbf{s}_{k}\right) G_{i j k}\left(z\left(\mathbf{s}_{i}\right), z\left(\mathbf{s}_{j}\right), z\left(\mathbf{s}_{k}\right)\right)+\cdots \\
&+z\left(\mathbf{s}_{1}\right) \cdots z\left(\mathbf{s}_{n}\right) G_{1 \cdots n}\left(z\left(\mathbf{s}_{1}\right), \ldots, z\left(\mathbf{s}_{n}\right)\right), \quad \mathbf{z} \in \zeta .
\end{aligned}
(\#eq:hc-theorem)
\end{equation}
must satisfy the property that if sites $i, j, \ldots, s$ do not form a clique, then $G_{i j \cdots s}(.)=0$.
:::
The Hammersley-Clifford theorem thus pins down the structure of the negpotential function $Q(\cdot)$: only the $G$ terms corresponding to cliques can be nonzero.
### Pairwise-Only Dependence
Pairwise-only dependence means that all $G$ functions involving three or more sites vanish, so sites interact only in pairs. For simplicity and generality, from now on we consider only the one-parameter exponential family; if there is no prior information, the exponential family is a good choice.
> **One-Parameter Exponential Family** \
A random variable belongs to a one-parameter exponential family if its density (or mass) function has the form $p(x \mid \eta)=h(x) \exp \{\eta t(x)-a(\eta)\}$.
Suppose the one-parameter exponential family is used to model the conditional
distribution:
\begin{equation}
\begin{aligned}
\mathbf{P}\left(z\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\}\right)=& \exp \left[A_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right) B_{i}\left(z\left(s_{i}\right)\right)\right.\\
&\left.+C_{i}\left(z\left(s_{i}\right)\right)+D_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)\right]
\end{aligned}
(\#eq:besag-condition)
\end{equation}
::: {.theorem #besag}
**[Besag’s Theorem]** \
Assume equation \@ref(eq:besag-condition)
and pairwise-only dependence between sites, i.e., all $G_{A}(.)=0$ for any $A$ whose number of distinct elements is 3 or more. Then
\begin{equation}
A_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)=\alpha_{i}+\sum_{j=1}^{n} \theta_{i j} B_{j}\left(z\left(s_{j}\right)\right), i=1, \cdots, n,
(\#eq:besag)
\end{equation}
where $\theta_{i j}=\theta_{j i}, \theta_{i i}=0$, and $\theta_{i k}=0$ for $k \notin N_{i}$.
:::
As a consequence of the theorem, $Q(\mathbf{z})$ can be expressed as
\begin{equation}
Q(\mathbf{z})=\sum_{i=1}^{n}\left\{\alpha_{i} B_{i}\left(z\left(s_{i}\right)\right)+C_{i}\left(z\left(s_{i}\right)\right)\right\}+\sum_{1 \leq i<j \leq n} \theta_{i j} B_{i}\left(z\left(s_{i}\right)\right) B_{j}\left(z\left(s_{j}\right)\right).
\end{equation}
## Conditionally Specified Spatial Models
### Models for Discrete Data
The Hammersley-Clifford theorem \@ref(thm:hammersley-clifford-theorem) is the theoretical foundation of conditionally specified spatial models for discrete data.
#### Binary Data
> **Binary Data** \
Binary data refers to data points that take only the values 0 and 1.
The logistic function is the first thing that comes to mind for 0-1 data, and one of the most widely used models is the **autologistic model**.
Assuming pairwise-only dependence between sites, setting $G_{i}(1) \equiv \alpha_{i}$ and $G_{i j}(1,1) \equiv \theta_{i j}$, and letting $z(s_i) \in \{0, 1\}$, the autologistic model can be written as \@ref(eq:autologistic):
\begin{equation}
\operatorname{Pr}\left(z\left(\mathbf{s}_{i}\right) \mid\left\{z\left(\mathbf{s}_{j}\right): j \neq i\right\}\right)=\frac{\exp \left\{\alpha_{i} z\left(\mathbf{s}_{i}\right)+\sum_{j=1}^{n} \theta_{i j} z\left(\mathbf{s}_{i}\right) z\left(\mathbf{s}_{j}\right)\right\}}{1+\exp \left\{\alpha_{i}+\sum_{j=1}^{n} \theta_{i j} z\left(\mathbf{s}_{j}\right)\right\}}
(\#eq:autologistic)
\end{equation}
The autologistic model can be fitted conveniently with the `autologistic()` function in the `ngspatial` package.
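A hedged sketch of such a fit (not run): the data frame `df` with a binary response `y` and covariate `x` is hypothetical, while the adjacency matrix is derived from the SIDS neighbor list built earlier:

```{r, eval=FALSE}
library(ngspatial)
# Binary (0/1) adjacency matrix from the nb object; zero.policy allows
# counties with no neighbors.
A <- nb2mat(ncCC89, style = "B", zero.policy = TRUE)
fit <- autologistic(y ~ x, data = df, A = A, method = "PL")  # pseudolikelihood fit
summary(fit)
```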
A homogeneous first-order autologistic model on a countable regular lattice, preferred by physicists, is the **Ising model**. It can be summarized as
\begin{equation}
\operatorname{Pr}(z(u, v) \mid\{z(k, l):(k, l) \neq(u, v)\})=\exp (z(u, v) g) /\{1+\exp (g)\}
(\#eq:ising)
\end{equation}
where
\begin{equation}
\begin{aligned}
g \equiv\, \alpha &+\gamma_{1}\{z(u-1, v)+z(u+1, v)\} \\
&+\gamma_{2}\{z(u, v-1)+z(u, v+1)\}
\end{aligned}
\end{equation}
Other modeling approaches for binary data include random media, multicolored data, and so on.
#### Count Data
Besag’s theorem \@ref(thm:besag) is the theoretical foundation for count data. When spatial data arise as counts, the natural model that comes to mind is one based on the Poisson distribution. Assuming pairwise-only dependence between sites, the auto-Poisson conditional specification is
\begin{equation}
\begin{aligned}
&\operatorname{Pr}\left(z\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\}\right) \\
&\quad=\exp \left(-\lambda_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)\right)\left(\lambda_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)\right)^{z\left(s_{i}\right)} / z\left(s_{i}\right) !
\end{aligned}
(\#eq:poisson)
\end{equation}
where $\lambda_{i}=\lambda_{i}\left(\left\{z\left(s_{j}\right): j \in N_{i}\right\}\right)$ is a function of the data observed for the regions in $N_{i}$ that neighbor region $i$ $(i = 1, \ldots, n)$. By Besag’s theorem \@ref(thm:besag), we have
\begin{equation}
\lambda_{i}\left(\left\{z\left(\mathbf{s}_{j}\right): j \in N_{i}\right\}\right)=\exp \left\{\alpha_{i}+\sum_{j=1}^{n} \theta_{i j} z\left(\mathbf{s}_{j}\right)\right\}
(\#eq:count1)
\end{equation}
Then, $Q(\mathbf{z})$ is derived as
\begin{equation}
Q(\mathbf{z})=\sum_{i=1}^{n} \alpha_{i} z\left(s_{i}\right)+\sum_{1 \leq i<j \leq n} \theta_{i j} z\left(s_{i}\right) z\left(s_{j}\right)-\sum_{i=1}^{n} \log \left(z\left(s_{i}\right) !\right)
(\#eq:count2)
\end{equation}
For the joint distribution to be summable, the auto-Poisson model requires $\theta_{ij} \leq 0$, i.e., it accommodates only negative spatial dependence. Note that the SIDS data introduced in section \@ref(lattices) are counts and are therefore candidates for the auto-Poisson model.
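Because the conditional distributions \@ref(eq:poisson) are ordinary Poissons, a Gibbs sampler is a natural way to simulate from the model. A base-R sketch under assumed parameters (with $\theta < 0$, as required):

```{r, eval=FALSE}
# Gibbs sampler sketch for an auto-Poisson field on a line of n sites.
set.seed(1)
n <- 10; alpha <- 1; theta <- -0.1        # assumed parameters; theta <= 0 required
W <- matrix(0, n, n)
W[abs(row(W) - col(W)) == 1] <- 1         # first-order neighbors on a line
z <- rpois(n, exp(alpha))                 # initial field
for (sweep in 1:1000) {
  for (i in 1:n) {
    lambda_i <- exp(alpha + theta * sum(W[i, ] * z))  # conditional mean, cf. \@ref(eq:count1)
    z[i] <- rpois(1, lambda_i)
  }
}
```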
### Models for Continuous Data
#### Auto-Gaussian (or CG) Models
Assume that the conditional density has the Gaussian form
\begin{equation}
\begin{aligned}
&f\left(z\left(\mathbf{s}_{i}\right) \mid\left\{z\left(\mathbf{s}_{j}\right): j \neq i\right\}\right) \\
&\quad=\left(2 \pi \tau_{i}^{2}\right)^{-1 / 2} \exp \left[-\left\{z\left(\mathbf{s}_{i}\right)-\theta_{i}\left(\left\{z\left(\mathbf{s}_{j}\right): j \neq i\right\}\right)\right\}^{2} / 2 \tau_{i}^{2}\right]
\end{aligned}
(\#eq:gaussian-form)
\end{equation}
Assuming pairwise-only dependence between sites, the conditional expectation $E\left(Z\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\}\right) \equiv \theta_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)$ can be written as
\begin{equation}
\theta_{i}\left(\left\{z\left(s_{j}\right): j \neq i\right\}\right)=\mu_{i}+\sum_{j=1}^{n} c_{i j}\left(z\left(s_{j}\right)-\mu_{j}\right)
(\#eq:g1)
\end{equation}
The conditional distribution is given by
\begin{equation}
Z\left(s_{i}\right) \mid\left\{z\left(s_{j}\right): j \neq i\right\} \sim \operatorname{Gau}\left(\mu_{i}+\sum_{j=1}^{n} c_{i j}\left(z\left(s_{j}\right)-\mu_{j}\right), \tau_{i}^{2}\right)
(\#eq:g2)
\end{equation}
Applying the Factorization Theorem \@ref(thm:factorization-theorem) directly, the joint distribution is
\begin{equation}
\mathbf{Z} \sim \operatorname{Gau}\left(\boldsymbol{\mu},(I-C)^{-1} M\right)
(\#eq:g3)
\end{equation}
After the necessary derivation, we have $Q(\mathbf{z})$ as
\begin{equation}
\begin{aligned}
Q(\mathbf{z})=\log (f(\mathbf{z}) / f(\mathbf{0}))=&-(1 / 2)(\mathbf{z}-\boldsymbol{\mu})^{\prime} M^{-1}(I-C)(\mathbf{z}-\boldsymbol{\mu}) \\
&+(1 / 2) \boldsymbol{\mu}^{\prime} M^{-1}(I-C) \boldsymbol{\mu}
\end{aligned}
\end{equation}
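In R, auto-Gaussian (CAR) models can be fitted with `spautolm()` from the `spatialreg` package. A hedged sketch using the SIDS data (the square-root rate below is one common variance-stabilizing choice, assumed here purely for illustration):

```{r, eval=FALSE}
library(spatialreg)
nc$rate <- sqrt(1000 * nc$SID74 / nc$BIR74)              # transformed SIDS rate
lw <- nb2listw(ncCC89, style = "B", zero.policy = TRUE)  # symmetric binary weights, as CAR requires
fit_car <- spautolm(rate ~ 1, data = nc, listw = lw,
                    family = "CAR", zero.policy = TRUE)
summary(fit_car)
```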
## Simultaneously Specified Spatial Models
The simultaneous approach is popular in econometrics and graphical modeling. Whittle's (1954) prescription for simultaneously specified stationary processes in the plane was given in equation \@ref(eq:gaussian-sg). For a (finite) data set $\mathbf{Z} \equiv\left(Z\left(s_{1}\right), \ldots, Z\left(s_{n}\right)\right)^{\prime}$ at locations $s_{1}, \ldots, s_{n}$, the analogous specification is
\begin{equation}
(I-B) \mathrm{Z}=\epsilon
(\#eq:ss1)
\end{equation}
where $\boldsymbol{\epsilon} \equiv\left(\epsilon\left(s_{1}\right), \ldots, \epsilon\left(s_{n}\right)\right)^{\prime}$ is a vector of i.i.d. zero-mean errors and $B$ is a matrix whose diagonal elements $\left\{b_{i i}\right\}$ are zero. Another way to write \@ref(eq:ss1) is
\begin{equation}
Z\left(\mathbf{s}_{i}\right)=\sum_{j=1}^{n} b_{i j} Z\left(\mathbf{s}_{j}\right)+\epsilon\left(\mathbf{s}_{i}\right),
(\#eq:ss2)
\end{equation}
A simultaneously specified spatial model is referred to as a spatial autoregressive (SAR) process, as the form of equation \@ref(eq:ss2) suggests.
A common special case occurs when $\boldsymbol{\epsilon}$ is Gaussian; then we have
\begin{equation}
\mathbf{Z} \sim \operatorname{Gau}\left(\mathbf{0},(I-B)^{-1}\left(I-B^{\prime}\right)^{-1} \sigma^{2}\right)
(\#eq:ss3)
\end{equation}
### Spatial Autoregressive Regression Model
If it is desired to interpret large-scale effects $\boldsymbol{\beta}$ through $E(\mathbf{Z})=X \boldsymbol{\beta}$ (where the columns of $X$ might be treatments, spatial trends, factors, etc.), then the model should be modified to
\begin{equation}
(I-B)(\mathbf{Z}-X \boldsymbol{\beta})=\boldsymbol{\epsilon}
(\#eq:ss4)
\end{equation}
When $\epsilon$ is Gaussian,
\begin{equation}
\mathbf{Z} \sim \operatorname{Gau}\left(X \boldsymbol{\beta},(I-B)^{-1}\left(I-B^{\prime}\right)^{-1} \sigma^{2}\right)
(\#eq:ss5)
\end{equation}
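A corresponding sketch for fitting the SAR regression model \@ref(eq:ss4) with `spautolm(family = "SAR")`, reusing `rate` and `lw` from the CAR sketch above (the covariate `log(BIR74)` is assumed purely for illustration):

```{r, eval=FALSE}
# SAR errors around a large-scale trend E(Z) = X beta, cf. (I - B)(Z - X beta) = eps.
fit_sar <- spautolm(rate ~ log(BIR74), data = nc, listw = lw,
                    family = "SAR", zero.policy = TRUE)
summary(fit_sar)
```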
## Space-Time Models
Space-time models consider the case where not only space but also time must be taken into account.
### STARMA Model
The space-time autoregressive moving average (STARMA) model is summarized as
\begin{equation}
\begin{aligned}
\mathbf{Z}(t) &=\sum_{k=0}^{p}\left(\sum_{j=1}^{\lambda_{k}} \xi_{k j} W_{k j}\right) \mathbf{Z}(t-k)-\sum_{l=0}^{q}\left(\sum_{j=1}^{\mu_{l}} \phi_{l j} V_{l j}\right) \boldsymbol{\epsilon}(t-l)+\boldsymbol{\epsilon}(t) \\
&=\sum_{k=0}^{p} B_{k} \mathbf{Z}(t-k)+\boldsymbol{\epsilon}(t)-\sum_{l=0}^{q} E_{l} \boldsymbol{\epsilon}(t-l)
\end{aligned}
(\#eq:starma)
\end{equation}
where
- $W_{k j}$ and $V_{l j}$: given weight matrices,
- $\lambda_{k}$: the extent of the spatial lagging on the autoregressive component,
- $\mu_{l}$: the extent of the spatial lagging on the moving-average component,
- $\left\{\xi_{k j}\right\}$ and $\left\{\phi_{l j}\right\}$: the STARMA parameters to be estimated (restrictions are needed on the $W$ s and $V$ s to ensure these parameters are identifiable).
- $B_{0}, E_{0}$: necessarily have zeros down their diagonals.
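To make \@ref(eq:starma) concrete, the base-R sketch below simulates a pure STAR(1) special case, $\mathbf{Z}(t)=\xi_{11} W_{11} \mathbf{Z}(t-1)+\boldsymbol{\epsilon}(t)$, with all parameter values assumed:

```{r, eval=FALSE}
# Simulate a STAR(1) process: Z(t) = xi * W %*% Z(t-1) + eps(t).
set.seed(1)
n <- 10; Tt <- 100; xi <- 0.4
W <- matrix(0, n, n)
W[abs(row(W) - col(W)) == 1] <- 1
W <- W / rowSums(W)                 # row-standardized spatial weight matrix
Z <- matrix(0, n, Tt)
for (t in 2:Tt) Z[, t] <- xi * W %*% Z[, t - 1] + rnorm(n)
```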