\section*{Results}
\subsection*{Covariance estimation}
The covariance matrix is defined as
\begin{equation}\label{eq:true-covariance}
\Sigma = \E{(x-\mu)(x-\mu)^\T},\quad \mu = \E{x}
\end{equation}
where $\E{\cdot}$ denotes expectation; the $p\times 1$ vector $x$ is a single observation of the firing rates of $p$ neurons in a time bin of some duration; and $\mu$ is the vector of expected firing rates.
Given a set of observations $\{x(t): t\in T\}$ of population activity, where $x(t)$ is a $p\times 1$ vector of firing rates in time bin $t$, and an unbiased estimate $\bar x$ of the mean activity, the \emph{sample covariance matrix} is
\begin{equation}\label{eq:sample}
C_{\sf sample} = \frac 1 n \sum\limits_{t\in T} (x(t)-\bar x)(x(t)-\bar x)^\T,
\end{equation}
where $n$ is the number of time bins in $T$.
The sample covariance matrix is an unbiased estimate of the covariance matrix ($\E{C_{\sf sample}}=\Sigma$) provided that the mean estimate $\bar x$ is either known exactly or obtained from an independent dataset.
When the mean is estimated from the same sample, $C_{\sf sample}$ becomes biased toward zero. In all cases where unbiasedness is required, we therefore estimate the mean from a separate sample.
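As a minimal illustration, Eq.~\ref{eq:sample} can be computed as in the following Python sketch, assuming a hypothetical $n\times p$ array \texttt{X} of binned firing rates and a mean estimate \texttt{mu\_hat} obtained from an independent sample:
\begin{verbatim}
import numpy as np

def sample_covariance(X, mu_hat):
    # X: n x p array of firing rates (time bins x neurons)
    # mu_hat: length-p mean estimate from an independent sample
    residuals = X - mu_hat
    return residuals.T @ residuals / X.shape[0]
\end{verbatim}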
Given any covariance matrix estimate $C$, the corresponding correlation matrix $R$ is calculated by normalizing the rows and columns of $C$ by the square roots of its diagonal elements to produce unit entries on the diagonal:
\begin{equation}\label{eq:precision}
R = (\mbox{diag}(C))^{-\frac 1 2} C (\mbox{diag}(C))^{-\frac 1 2},
\end{equation}
where $\mbox{diag}(C)$ denotes the diagonal matrix with the diagonal elements from $C$.
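In code, this normalization divides each row and column of $C$ by the square root of the corresponding diagonal element; a sketch in Python, assuming a covariance estimate \texttt{C} stored as a NumPy array:
\begin{verbatim}
import numpy as np

def correlation_matrix(C):
    # divide rows and columns by the square roots of the diagonal
    d = np.sqrt(np.diag(C))
    return C / np.outer(d, d)
\end{verbatim}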
The \emph{partial correlation} between a pair of variables is the Pearson correlation coefficient of the residuals of the linear least-squares predictor of their activity based on all the other variables, excluding the pair \citep{Anderson:2003, Whittaker:1990}. Partial correlations figure prominently in probabilistic \emph{graphical modeling} wherein the joint distribution is explained by sets of two-way interactions \citep{Whittaker:1990}. For the multivariate Gaussian distribution, zero partial correlations indicate conditional independence of the pair, implying a lack of direct interaction \citep{Dempster:1972, Whittaker:1990}. More generally, partial correlations can serve as a measure of conditional independence under the assumption that most dependencies in the system include strong linear effects \citep{Whittaker:1990,Baba:2004}. As neural recordings become increasingly dense, partial correlations may prove useful as indicators of conditional independence (lack of functional connectivity) between pairs of neurons.
Pairwise partial correlations are closely related to the elements of the \emph{precision matrix}, \emph{i.e.}\;the inverse of the covariance matrix \citep{Dempster:1972,Whittaker:1990}. Zero elements in the precision matrix signify zero partial correlation between the two variables. Given the covariance estimate $C$, the matrix of partial correlations $P$ is computed by normalizing the rows and columns of the \emph{precision matrix} $C^{-1}$ to produce negative unit entries on the diagonal:
\begin{equation}\label{eq:partial}
P = -\left(\mbox{diag}(C^{-1})\right)^{-\frac 1 2} C^{-1} \left(\mbox{diag}(C^{-1})\right)^{-\frac 1 2}
\end{equation}
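The analogous sketch for Eq.~\ref{eq:partial}, assuming an invertible covariance estimate \texttt{C}:
\begin{verbatim}
import numpy as np

def partial_correlations(C):
    K = np.linalg.inv(C)        # precision matrix
    d = np.sqrt(np.diag(K))
    return -K / np.outer(d, d)  # diagonal entries equal -1
\end{verbatim}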
Increasing the number of recorded neurons results in a higher \emph{condition number} of the covariance matrix \citep{Ledoit:2004}, making the partial correlation estimates \emph{ill-conditioned}: small errors in the covariance estimates translate into much greater errors in the estimates of partial correlations. With massively multineuronal recordings, partial correlations cannot be estimated without \emph{regularization} \citep{Ledoit:2004,Schafer:2005}.
We considered four regularized estimators based on distinct families of target estimates: $C_{\sf diag}$, $C_{\sf factor}$, $C_{\sf sparse}$, and $C_{\sf sparse+latent}$. In probabilistic models with exclusively linear dependencies, the target estimates of these estimators correspond to distinct families of graphical models (Fig.~\ref{fig:1} Row 1).
\input{fig-pcorr-1.tex} %%%%%% FIGURE 1 from the paper
The target estimate of estimator $C_{\sf diag}$ is the diagonal matrix $D$ containing estimates of neurons' variances. Regularization is achieved by linear \emph{shrinkage} of the sample covariance matrix toward $D$, controlled by the scalar \emph{shrinkage intensity} parameter $\lambda \in [0, 1]$:
\begin{equation}\label{eq:c-diag}
C_{\sf diag} = (1-\lambda) C_{\sf sample} + \lambda D
\end{equation}
The structure imposed by $C_{\sf diag}$ favors (performs better with) populations with no linear associations between the neurons (Fig.~\ref{fig:1} Row 1, A). If sample correlations are largely spurious, $C_{\sf diag}$ is expected to be more efficient than other estimators.
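A sketch of Eq.~\ref{eq:c-diag} in Python, with the shrinkage intensity \texttt{lam} treated as a free hyperparameter:
\begin{verbatim}
import numpy as np

def c_diag(C_sample, lam):
    # linear shrinkage toward the diagonal of the sample covariance
    D = np.diag(np.diag(C_sample))
    return (1.0 - lam) * C_sample + lam * D
\end{verbatim}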
Estimator $C_{\sf factor}$ approximates the covariance matrix by the factor model,
\begin{equation}\label{eq:c-factor}
C_{\sf factor} = L + D,
\end{equation}
where $L$ is a $p\times p$ symmetric positive semidefinite matrix with low rank and $D$ is a diagonal matrix.
This approximation is at the basis of \emph{factor analysis} \citep{Anderson:2003}.
In factor analysis, the matrix $L$ represents covariances arising from latent factors, with the rank of $L$ indicating the number of such latent variables.
Matrix $D$ contains the variances of the independent noise in the observed variables.
The estimator is regularized by the selection of the rank of $L$.
The structure imposed by $C_{\sf factor}$ favors conditions in which the population activity is linearly driven by a number of latent factors that affect many cells while direct interactions between the recorded cells are insignificant (Fig.~\ref{fig:1} Row 1, B).
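As an illustration, a low-rank-plus-diagonal covariance estimate of this form can be obtained with an off-the-shelf factor-analysis fit; the following sketch uses the scikit-learn implementation (shown only for illustration), assuming an $n\times p$ data matrix \texttt{X} and a chosen rank \texttt{k}:
\begin{verbatim}
from sklearn.decomposition import FactorAnalysis

def c_factor(X, k):
    # fit a factor model with k latent factors;
    # get_covariance() returns a rank-k term plus a diagonal noise term
    fa = FactorAnalysis(n_components=k).fit(X)
    return fa.get_covariance()
\end{verbatim}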
Estimator $C_{\sf sparse}$ is produced by approximating the precision matrix by a sparse matrix $S$:
\begin{equation}\label{eq:c-sparse}
C_{\sf sparse} = S^{-1},
\end{equation}
where $S$ is a sparse matrix, \emph{i.e.}\;many of its off-diagonal elements are forced to zero. The estimator is regularized by adjusting the sparsity (the fraction of zero off-diagonal elements) of $S$.
Zeros in the precision matrix coincide with zero partial correlations between the same pairs of variables.
Under the assumption of linearity of interactions in the system, zero partial correlation indicates conditional independence or lack of direct interaction between the pair of variables.
Therefore, the structure imposed by $C_{\sf sparse}$ favors conditions in which neural correlations arise from direct linear effects (`interactions') between some pairs of the observed neurons (Fig.~\ref{fig:1} Row 1, C).
The problem of finding the optimal set of non-zero elements of the precision matrix is known as \emph{covariance selection} \citep{Dempster:1972}.
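A standard approach to covariance selection is $\ell_1$-penalized maximum-likelihood estimation of the precision matrix (the graphical lasso); a sketch using scikit-learn, shown only for illustration, with the penalty \texttt{alpha} controlling the sparsity of $S$:
\begin{verbatim}
from sklearn.covariance import GraphicalLasso

def c_sparse(X, alpha):
    # l1-penalized maximum likelihood yields a sparse precision matrix
    model = GraphicalLasso(alpha=alpha).fit(X)
    return model.covariance_    # model.precision_ holds the sparse estimate S
\end{verbatim}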
Estimator $C_{\sf sparse+latent}$ is obtained by approximating the precision matrix as the sum of a sparse component and a low-rank component:
\begin{equation}\label{eq:c-sl}
C_{\sf sparse+latent} = (S - L)^{-1},
\end{equation}
where $S$ is a sparse matrix and $L$ is a low-rank symmetric positive semidefinite matrix.
The estimator is regularized by adjusting the sparsity of $S$ (the sparsity of local interactions) and the rank of $L$ (the number of latent units). See Methods for a more detailed explanation.
The structure imposed by $C_{\sf sparse+latent}$ favors conditions in which the activity of neurons is driven both by linear interactions between some pairs of the observed neurons and by several latent units (Fig.~\ref{fig:1} Row 1, D) \citep{Chandrasekaran:2010,Ma:2013}.
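A sketch of the corresponding penalized maximum-likelihood problem \citep{Chandrasekaran:2010}, written with the cvxpy modeling package and shown only for illustration, with hyperparameters \texttt{alpha} (controlling the sparsity of $S$) and \texttt{beta} (controlling the rank of $L$):
\begin{verbatim}
import numpy as np
import cvxpy as cp

def c_sparse_latent(C_sample, alpha, beta):
    p = C_sample.shape[0]
    S = cp.Variable((p, p), symmetric=True)
    L = cp.Variable((p, p), PSD=True)      # trace penalty encourages low rank
    K = S - L                              # candidate precision matrix
    obj = cp.Maximize(cp.log_det(K) - cp.trace(K @ C_sample)
                      - alpha * cp.sum(cp.abs(S)) - beta * cp.trace(L))
    cp.Problem(obj).solve()
    return np.linalg.inv(S.value - L.value)
\end{verbatim}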