% pcorr-results-empirical.tex

\subsection*{The $C_{\sf sparse+latent}$ estimator is most efficient in neural data}
\input{./fig-pcorr-3.tex}   %%  Figure 3 from paper

We recorded the calcium activity of densely sampled populations of neurons in layers 2/3 and upper layer 4 of primary visual cortex in sedated mice during visual stimulation, using fast random-access 3D scanning two-photon microscopy (Fig.~\ref{fig:3} A--B) \citep{Reddy:2005, Katona:2012, Cotton:2013}. This technique allowed fast sampling (100--150 Hz) from large numbers of cells (150--350) in a small volume of cortical tissue ($200\times200\times100$ $\mu$m$^3$) (Fig.~\ref{fig:3} C and D). Firing rates were inferred using sparse nonnegative deconvolution \citep{Vogelstein:2010} (Fig.~\ref{fig:3} C). Only cells that produced detectable calcium activity were included in the analysis (see Methods).

First, to compute the orientation tuning of the recorded cells, we presented 30 repetitions of full-field drifting gratings of 16 directions in random order (Fig.~\ref{fig:3} D). Each grating was displayed for 500 ms, without intervening blanks. Second, to estimate the noise correlation matrix, we presented either two or five distinct directions, depending on the experiment, with 100--300 repetitions of each direction. Each grating lasted 1 s and was followed by a 1 s blank. The traces were then binned into 150 ms intervals aligned to stimulus onset for the estimation of the correlation matrix.

The sample correlation coefficients were largely positive and low (Fig.~\ref{fig:3} E and F). The average correlation coefficient at each site ranged from 0.0065 to 0.051, with a mean across sites of 0.018 (Fig.~\ref{fig:6} D).

In these densely sampled populations, direct interactions between cells are likely to influence the patterns of population activity. We therefore hypothesized that covariance matrix estimators that explicitly model the partial correlations between pairs of neurons ($C_{\sf sparse}$ and $C_{\sf sparse+latent}$) would have a performance advantage. However, the observed neurons are also likely to be strongly influenced by global activity fluctuations and by unobserved common inputs, which favors estimators that explicitly model common fluctuations of the entire population: $C_{\sf factor}$ and $C_{\sf sparse+latent}$. If both types of effects are significant, then $C_{\sf sparse+latent}$ should outperform the other estimators.

\input{./fig-pcorr-4.tex}   %%  Figure 4 from paper

\input{./fig-pcorr-S1.tex}   %% Figure S1 from paper

To test this hypothesis, we computed the validation loss of the estimators $C_{\sf sample}$, $C_{\sf diag}$, $C_{\sf factor}$, and $C_{\sf sparse}$ relative to that of $C_{\sf sparse+latent}$ at $n=27$ imaged sites in 14 mice. The hyperparameters of each estimator were optimized by nested cross-validation (see Fig.~\ref{fig:S1} and Methods). Indeed, the sparse+latent estimator outperformed the other estimators (Fig.~\ref{fig:4}). The respective median differences in validation loss were 0.039, 0.0016, 0.0029, and 0.0059 nats/cell/bin, each significantly greater than zero ($p<0.01$ in each comparison, Wilcoxon signed rank test).
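
To make the sign of these differences explicit, write $\mathcal{L}(C)$ for the validation loss of an estimate $C$. The symbols $\mathcal{L}$ and $\Delta\mathcal{L}_k$ are notation assumed here for illustration only, and the definition of the loss is not restated in this section. The differences reported above can then be read as
\[
\Delta\mathcal{L}_{k} = \mathcal{L}(C_{k}) - \mathcal{L}(C_{\sf sparse+latent}),
\qquad k \in \{{\sf sample},\, {\sf diag},\, {\sf factor},\, {\sf sparse}\},
\]
so that, under this sign convention, $\Delta\mathcal{L}_{k} > 0$ indicates that $C_{\sf sparse+latent}$ achieved the lower validation loss at that site.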

\subsection*{Structure of $C_{\sf sparse+latent}$ estimates}

\input{./fig-pcorr-5.tex}   %%  Figure 5 from paper

\input{./fig-pcorr-6.tex}   %%  Figure 6 from paper


We examined the composition of the $C_{\sf sparse+latent}$ estimates at each imaged site (Fig.~\ref{fig:5} and Fig.~\ref{fig:6}). Although the regularized estimates were similar to the sample correlation matrix (Fig.~\ref{fig:5} A and B), the corresponding partial correlation matrices differed substantially (Fig.~\ref{fig:5} C and D). The estimates separated two sources of correlations: a network of linear interactions, expressed by the sparse component of the inverse, and latent units, expressed by the low-rank component of the inverse (Fig.~\ref{fig:5} E). The sparse partial correlations revealed a network that differed substantially from the network composed of the largest coefficients in the sample correlation matrix (Fig.~\ref{fig:5} F, G, H, and I).
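
As a sketch of this decomposition, let $S$ denote the sparse component and $L$ the low-rank component of the inverse. These symbols and the sign convention are assumptions made here, following the usual form of sparse-plus-low-rank models, and may differ from the notation in Methods:
\[
C_{\sf sparse+latent}^{-1} = S - L ,
\]
where the non-zero off-diagonal elements of $S$ correspond to pairwise interactions and the rank of the positive semidefinite matrix $L$ equals the number of inferred latent units.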

In the example site (Fig.~\ref{fig:5}), the sparse component had 92.8\% sparsity (or conversely, 7.2\% connectivity: $\mbox{connectivity}=1-\mbox{sparsity}$) with an average node degree of 20.9 (Fig.~\ref{fig:5} G). The average node degree, \emph{i.e.}\;the average number of interactions linking each neuron, is related to connectivity as $\mbox{degree} = \mbox{connectivity}\cdot(p-1)$, where $p$ is the number of neurons. The low-rank component had rank 72, denoting 72 inferred latent units. The number of latent units increased with population size (Fig.~\ref{fig:6} A), but the connectivity was highly variable (Fig.~\ref{fig:6} B): several sites, despite their large population sizes, were driven by latent units and had few pairwise interactions. This variability may be explained by differences in brain states and recording quality and warrants further investigation.
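
As a consistency check on the example site, the relation between degree and connectivity can be inverted using the rounded values quoted above:
\[
p - 1 = \frac{\mbox{degree}}{\mbox{connectivity}} \approx \frac{20.9}{0.072} \approx 290 .
\]
This implies a population of roughly 290 neurons at this site, which lies within the 150--350 range of recorded population sizes. Because both inputs are rounded, the figure is only approximate and is not a reported cell count.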

The average partial correlations calculated from these estimates according to Eq.~\ref{eq:partial} at all 27 sites were about five times lower than the average sample correlations (Fig.~\ref{fig:6} C). This suggests that correlations between neurons build up from multiple chains of smaller interactions. Furthermore, the average partial correlations were less variable: the coefficient of variation of the average sample correlations across sites was 0.45 whereas that of the average partial correlations was 0.29, with larger populations exhibiting greater uniformity of average partial correlations than smaller populations ($p=0.002$, Brown-Forsythe test).

While the sample correlations were mostly positive, the sparse component of the partial correlations (`interactions') had a high fraction (28.7\% in the example site) of negative values (Fig.~\ref{fig:5} F). The fraction of negative interactions increased with the inferred connectivity (Fig.~\ref{fig:6} D), suggesting that negative interactions can be inferred only after a sufficient density of positive interactions has been uncovered.

Thresholded sample correlations have been used in several studies to infer pairwise interactions \citep{Golshani:2009, Feldt:2011, Malmersjo:2013, Sadovsky:2014}.  We therefore compared the interactions in the sparse component of $C_{\sf sparse+latent}$ to those obtained from the sample correlations thresholded to the same level of connectivity. The networks revealed by the two methods differed substantially. In the example site with 7.2\% connectivity in $C_{\sf sparse+latent}$, only 27.7\% of the connections coincided with the above-threshold sample correlations (Fig.~\ref{fig:5} F, H, and I). In particular, most of the inferred negative interactions corresponded to low sample correlations (Fig.~\ref{fig:5} F) where high correlations should be expected given the rest of the correlation matrix.

\subsection*{Relationship of $C_{\sf sparse+latent}$ to orientation tuning and physical distances}

\input{./fig-pcorr-7.tex}  %%  Figure 7 from paper

We examined how the structure of the $C_{\sf sparse+latent}$ estimates related to the differences in orientation preference and to the physical distances separating pairs of cells (Fig.~\ref{fig:7}). The five sites with the highest pairwise connectivity were included in the analysis. Partial correlations were computed using Eq.~\ref{eq:partial} based on the regularized estimate, including both the sparse and the latent components. Connectivity was computed as the fraction of pairs of cells connected by non-zero elements (interactions) in the sparse component of the estimate, distinguishing between positive and negative connectivity.
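
For orientation, these two quantities can be sketched as follows. The symbols are assumptions introduced here: $\Theta = C_{\sf sparse+latent}^{-1}$ with elements $\theta_{ij}$, and $S$ with elements $s_{ij}$ for the sparse component. The first expression is the standard definition of partial correlation, which Eq.~\ref{eq:partial} is assumed to match:
\[
\rho_{ij} = -\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}}, \quad i \neq j,
\qquad
\mbox{connectivity} = \frac{\#\{(i,j) : i<j,\ s_{ij} \neq 0\}}{p(p-1)/2} ,
\]
where $p$ is the number of neurons. Positive and negative connectivity are obtained by counting only the interactions of the corresponding sign in the numerator.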

First, we analyzed how correlations and connectivity depended on the difference in preferred orientations ($\Delta \mbox{ori}$) of pairs of significantly ($\alpha=0.05$) tuned cells. The partial correlations decayed more rapidly with $\Delta\mbox{ori}$ than did the sample correlations ($p<10^{-9}$ in each of the five sites, two-sample $t$-test of the difference of the linear regression coefficients). Positive connectivity decreased with $\Delta\mbox{ori}$ ($p<0.005$ in each of the five sites, $t$-test on the logistic regression coefficient) whereas negative connectivity did not decrease (Fig.~\ref{fig:7} D): the slope in the logistic model of connectivity with respect to $\Delta\mbox{ori}$ was significantly steeper for positive than for negative interactions ($p<0.04$ in each of the five sites, two-sample $t$-test of the difference of the logistic regression coefficients).
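
The logistic model referred to here can be written, in a generic form with assumed symbols $\pi$, $\beta_0$, and $\beta_1$ (the exact parameterization used in the analysis is not given in this section), as
\[
\ln\frac{\pi}{1-\pi} = \beta_0 + \beta_1\,\Delta\mbox{ori} ,
\]
where $\pi$ is the probability that a pair of tuned cells is connected by an interaction of the given sign. In this form, $\beta_1 < 0$ corresponds to connectivity that decreases with $\Delta\mbox{ori}$, and the tests above compare the fitted $\beta_1$ with zero or between positive and negative interactions.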

Second, we compared how correlations and connectivity depended on the physical distance separating pairs of cells. We distinguished between lateral distance, $\Delta x$, in the plane parallel to the pia, and vertical distance, $\Delta z$, orthogonal to the pia.  When considering the dependence on $\Delta x$, the analysis was limited to cell pairs located at the same depth with $\Delta z < 30\,\mu\mbox{m}$; conversely, when considering the dependence on $\Delta z$, only vertically aligned cell pairs with $\Delta x < 30\,\mu\mbox{m}$ were included. Again, the partial correlations decayed more rapidly both laterally and vertically than sample correlations ($p<10^{-6}$ in each of the five sites, for both lateral and vertical distances, two-sample $t$-test of the difference of the linear regression coefficients).
Both positive and negative connectivity decayed with distance ($p<10^{-6}$ in each of the five sites for positive interactions and $p<0.05$ for negative interactions, $t$-test on the logistic regression coefficient) (Fig.~\ref{fig:7} E), so that cells separated laterally by less than 25 $\mu\mbox{m}$ were 3.2 times more likely to be connected than cells separated laterally by more than 150 $\mu\mbox{m}$. Although positive connectivity appeared to decay faster with vertical than with lateral distance, the differences in the slopes of the respective logistic regression models were not significant with the available data. Negative connectivity decayed more slowly with distance (Fig.~\ref{fig:7} E and F): the slope of the logistic model with respect to lateral distance was significantly steeper for positive than for negative connectivity ($p<0.05$ in each of the five sites, two-sample $t$-test of the difference of the logistic regression coefficients).