
\section{Discussion}
\subsection*{Functional connectivity as a network of pairwise interactions}
Functional connectivity is often represented as a graph of pairwise interactions. The goal of many studies of functional connectivity has been to estimate  anatomical connectivity from  observed multineuronal spiking activity.  For example, characteristic peaks and troughs in the pairwise cross-correlograms of recorded spike trains contain statistical signatures of directional monosynaptic connections and shared synaptic inputs \citep{Gerstein:1964, Perkel:1967, Moore:1970, Alonso:1998, Denman:2013}.  Such signatures are ambiguous as they can arise from network effects other than direct synaptic connections \citep{Aertsen:1989}.  With simultaneous recordings from more neurons, ambiguities can be resolved by inferring the conditional dependencies between pairs of neurons.  Direct causal interactions between neurons produce statistical dependency between them even after conditioning on the state of the remainder of the network and external input. Therefore, conditional independence can signify the absence of a direct causal influence.

Conditional dependencies can be inferred by fitting a probabilistic model of the joint population activity. For example, generalized linear models (GLMs) have been constructed to include biophysically plausible synaptic integration, membrane kinetics, and individual neurons' stimulus drive~\citep{Pillow:2008}.  Maximum entropy models constrained by observed pairwise correlations are another class of models with pairwise coupling between cells \citep{Schneidman:2006, Tkacik:2006, Yu:2008, Tang:2008, Shlens:2009}.  Under the assumption that the population response follows a multivariate normal distribution, the conditional dependencies between pairs of neurons are expressed by the partial correlations between them.  Each probabilistic model, fitted to the same data, may reveal a completely different network of `interactions', \emph{i.e.}\;conditional dependencies between pairs of cells.
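
As a concrete illustration of the multivariate normal case, the partial correlations can be read directly from the inverse of the covariance matrix. The symbols $\Sigma$, $\Theta$, and $\rho_{ij\cdot\mathrm{rest}}$ are introduced here only for this sketch and are assumptions rather than notation used elsewhere in this paper. If $\Sigma$ is the covariance matrix of the population response and $\Theta = \Sigma^{-1}$ is its inverse (the precision matrix) with elements $\theta_{ij}$, then the partial correlation between neurons $i$ and $j$, conditioned on all other recorded neurons, is
\[
\rho_{ij\cdot\mathrm{rest}} = -\frac{\theta_{ij}}{\sqrt{\theta_{ii}\,\theta_{jj}}}, \qquad i \neq j.
\]
Under normality, $\theta_{ij} = 0$ is equivalent to conditional independence of neurons $i$ and $j$ given the rest of the recorded population, so the zero pattern of $\Theta$ defines the graph of pairwise interactions.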

It is not yet clear which approach provides the best correspondence with anatomical connectivity. Little experimental evidence is available to answer this question.  The connectivity graphs inferred by various statistical methods are commonly reported without examining their relation to anatomy.
Topological properties of such graphs have been interpreted as principles of circuit organization (\emph{e.g.} small-world organization) \citep{Feldt:2011, Yu:2008, Malmersjo:2013, Sadovsky:2014}.  However, the topological properties of functional connectivity graphs can depend on the method of inference \citep{Zalesky:2012}. Until a physiological interpretation of functional connectivity is established, the physiological relevance of such analyses remains in question. For this reason, we did not attempt graph-theoretical analyses of the inferred sparse networks of interactions.

Inference of the conditional dependencies also depends on the completeness of the recorded population: to equate conditional dependency with direct interaction between two neurons, we must record from all neurons with which the pair interacts. Unobserved portions of the circuit may manifest as conditional dependencies between observed neurons that do not interact. For this reason, statistical models of population activity have been most successfully applied to \emph{in vitro} preparations of the retina or cell cultures where high-quality recordings from complete populations were available \citep{Pillow:2008}. In cortical tissue, electrode arrays record from a small fraction of cells in a given volume, limiting the validity of inference of the pairwise conditional dependencies. Perhaps for this reason, partial correlations have not, until now, been used to describe the functional connectivity in cortical populations.

Two-photon imaging of population calcium signals presents unique advantages for the estimation of functional connectivity.  While the temporal resolution of calcium signals is limited by calcium dye kinetics, fast imaging techniques combined with spike inference algorithms provide millisecond-scale temporal resolution of single action potentials \citep{Grewe:2010}. However, such high temporal precision comes at the cost of the accuracy of inferred spike rates.  Better accuracy is achieved when calcium signals are analyzed on scales of tens of milliseconds \citep{Cotton:2013, Theis:2014}.  The major advantage of calcium imaging is its ability to characterize the spatial arrangement and types of recorded cells.  Recently, advanced imaging techniques have allowed recording from nearly every cell in a volume of cortical tissue  \emph{in vivo} \citep{Katona:2012, Cotton:2013} and even from entire nervous systems \citep{Leung:2013, Ahrens:2013}.  These techniques may provide more incisive measurements of functional connectivity than electrophysiological recordings.

The low temporal resolution of calcium signals limits the use of functional connectivity methods that rely on millisecond-scale binning of signals (cross-correlograms, some GLMs, and binary maximum entropy models).  Hence, most studies of functional connectivity based on calcium imaging have relied on instantaneous sample correlations \citep{Greenberg:2008, Golshani:2009, Hofer:2011, Malmersjo:2013}.  Although some investigators have interpreted such correlations as indicators of (chemical or electrical) synaptic connectivity, most have used them as more general indicators of functional connectivity without relating them to underlying mechanisms.

In this study, we sought to infer pairwise functional connectivity networks  in cortical microcircuits. 
We hypothesized that partial correlations correspond more closely to underlying mechanisms than sample correlations when recordings are sufficiently dense.  
Since neurons form synaptic connections mostly locally and sparsely \citep{Perin:2011}, we \emph{a priori} favored solutions with sparse partial correlations.  
Under the assumptions that the recorded population is sufficiently complete and that the model correctly represents the nature of interactions, the network of partial correlations can be hypothesized to be a better representation of functional dependencies than correlations.

\subsection*{Functional connectivity as coactivations}
Another approach to describing the functional connectivity of a circuit is to isolate individual patterns of multineuronal coactivations \citep{Gerstein:1989, Chapin:1999, Peyrache:2010, Ch:2010, Lopes:2011, Lopes:2013}. Depending on the method of their extraction, coactivation patterns may be referred to as \emph{assemblies}, \emph{factor loadings}, \emph{principal components}, \emph{independent components}, \emph{activity modes}, \emph{eigenvectors}, or \emph{coactivation maps}. Coactivation patterns could be interpreted as signatures of Hebbian cell assemblies \citep{Gerstein:1989, Ch:2010}, \emph{i.e.}\;groups of tightly interconnected cells involved in a common computation.  Coactivation patterns could also result from shared input from unobserved parts of the circuit, or from global network fluctuations modulating the activity of the local circuit \citep{Okun:2012, Ecker:2014}.

Coactivation patterns and pairwise connectivity are not mutually exclusive since assemblies arise from patterns of synaptic connectivity.  However, an analysis of coactivation shifts the focus from detailed interactions to  collective behavior.
In our study, functional connectivity expressed solely through modes of coactivation was represented by the factor analysis-based estimator $C_{\sf factor}$.

\subsection*{Combining pairwise interactions and coactivations}
In the effort to account for the joint activity patterns that are poorly explained by pairwise interactions, investigators have augmented models of pairwise interactions with additional mechanisms such as latent variables, higher-order correlations, or global network fluctuations \citep{Ganmor:2011, Tkacik:2013, Pfau:2013, Koster:2013, Ecker:2014}.

In our study, we combined pairwise interactions with collective coactivations by applying the recently developed numerical techniques for the inference of the partial correlation structure in systems with latent variables \citep{Chandrasekaran:2010, Ma:2013}.  The resulting estimator, $C_{\sf sparse+latent}$, effectively decomposed the functional connectivity into a sparse network of pairwise interactions and coactivation mode vectors.
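
The decomposition can be sketched as follows. The notation ($K$, $S$, $L$, $\hat\Sigma$, $\alpha$, $\beta$) is assumed here for illustration and may differ from the notation used in the definition of the estimator. Following \citet{Chandrasekaran:2010}, suppose the observed and latent variables are jointly normal, with a joint precision matrix partitioned into blocks $K_O$ (observed), $K_H$ (latent), and $K_{OH} = K_{HO}^{\top}$ (observed--latent coupling). The precision matrix of the observed neurons alone is then the Schur complement
\[
\Theta = K_O - K_{OH} K_H^{-1} K_{HO} = S - L ,
\]
where $S = K_O$ is sparse when direct interactions are sparse, and $L = K_{OH} K_H^{-1} K_{HO}$ is positive semidefinite with rank no greater than the number of latent units. One commonly used convex formulation \citep{Chandrasekaran:2010, Ma:2013} estimates the pair $(S, L)$ from a sample covariance matrix $\hat\Sigma$ as
\[
\min_{S,\,L} \;\; \mathrm{tr}\bigl((S-L)\hat\Sigma\bigr) - \ln\det(S-L) + \alpha\,\|S\|_1 + \beta\,\mathrm{tr}(L)
\quad \mbox{subject to} \quad S - L \succ 0, \;\; L \succeq 0 ,
\]
where $\|S\|_1$ denotes the sum of absolute values of the elements of $S$ and promotes sparsity, and $\mathrm{tr}(L)$ promotes low rank. In this sketch, the nonzero off-diagonal elements of $S$ correspond to the pairwise interactions and the leading eigenvectors of $L$ correspond to the coactivation modes.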

\subsection*{Addressing ill-posedness}
Inferring the conditional dependencies between variables in a probabilistic model is an ill-posed problem: small variations in the data produce large errors in the inferred network of dependencies. The problem becomes worse as the number of recorded neurons increases until such models lose their statistical validity \citep{Roudi:2009}.  As techniques have improved to allow recording from larger neuronal populations, experimental neuroscientists have addressed this problem by extending the recording durations to keep sampling noise in check and by verifying that existing models are not overfitted \citep{Tkacik:2013}. However, ambitious projects, such as the BRAIN initiative \citep{Alivisatos:2013}, aim to record from significantly larger populations. Simply increasing recording duration will be neither practical nor sufficient, and the problem must be addressed by using regularized estimators. Regularization biases the solution toward a small subspace in order to counteract the effect of sampling noise in the empirical data. However, biasing the solution toward an inappropriate subspace yields little improvement in estimation and hinders interpretation.

Several strategies have been developed to limit the model space in order to improve the quality of the estimate. For example, \citet{Ganmor:2011} developed a heuristic rule to identify the most significant features that must be fitted by a maximum entropy model for improved performance in the retina. As another example of regularization, generalized linear models typically employ $L_1$ penalty terms to constrain the solution space and to effectively reduce the dimensionality of the solution \citep{Pillow:2008}.

In our study, regularization was accomplished by dimensionality reduction (feature selection) schemes that produce sparse, constrained solutions. Only the most efficient scheme was considered in the analysis of functional connectivity.

\subsection*{Model selection}
Various model selection criteria have been devised to select between families of models and the optimal subsets of variables in a given model family based on observed data. Despite its computational requirements, cross-validation is among the most popular model selection methods due to its minimal assumptions about the data generating process \citep{Arlot:2010}.

We evaluated the covariance matrix estimators using a loss function derived from the normal distribution.  However, this choice does not limit the applicability of the conclusions to normal distributions. Other probabilistic models, fitted to the same data, could also serve as estimators of the covariance matrix.  If a different model yields a better estimate of the covariance matrix than the estimator proposed here, we believe that its structure deserves consideration as the better representation of the functional connectivity.
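
A minimal sketch of a loss of this kind is given below, under notation that is assumed here for illustration: $C$ is a covariance estimate obtained from training data, $\hat\Sigma_{\mathrm{test}}$ is the sample covariance matrix of held-out data after mean subtraction, and $p$ is the number of neurons. The negative log-likelihood per sample of a zero-mean multivariate normal distribution with covariance $C$ is
\[
\mathcal{L}\bigl(C, \hat\Sigma_{\mathrm{test}}\bigr) = \frac{1}{2}\Bigl[\ln\det C + \mathrm{tr}\bigl(C^{-1}\hat\Sigma_{\mathrm{test}}\bigr) + p\ln 2\pi\Bigr].
\]
The normalization used in the actual evaluation may differ from this form by additive or multiplicative constants. The loss depends on the held-out data only through its second moments, which is why any model that produces a covariance estimate can be scored with it. When $\hat\Sigma_{\mathrm{test}}$ is positive definite, the loss is minimized at $C = \hat\Sigma_{\mathrm{test}}$.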

The results of model selection must be interpreted with caution.  As we demonstrated by simulation, even models with incorrect forms of dependencies can substantially improve estimates (Fig.~\ref{fig:1}). Therefore, showing that a more constrained model has better cross-validated performance than a more complex model does not necessarily support the conclusion that it reveals a better representation of dependencies in the data.  This caveat is related to \emph{Stein's Paradox} \citep{Efron:1977}: The biasing of an estimate toward an arbitrary low-dimensional target can consistently outperform a less constrained estimate.
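
The classical form of this result can be stated as a short worked example, with symbols ($x$, $\theta$, $\nu$, $\sigma^2$, $p$) that are assumed here for illustration. Let $x \sim N(\theta, \sigma^2 I_p)$ be a single observation of $p \geq 3$ unknown means with known variance $\sigma^2$. The James--Stein estimator shrinks $x$ toward an arbitrary fixed target $\nu$,
\[
\hat\theta_{\mathrm{JS}} = \nu + \left(1 - \frac{(p-2)\,\sigma^2}{\|x-\nu\|^2}\right)(x - \nu),
\]
and has lower expected squared error $E\,\|\hat\theta - \theta\|^2$ than the unbiased estimate $\hat\theta = x$ for every value of $\theta$, regardless of the choice of $\nu$. The improvement is largest when $\theta$ lies close to $\nu$ but is present everywhere, so improved performance alone does not show that the target reflects true structure in the data.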

\subsection*{Physiological interpretation and future directions}

We showed that, among several models, a sparse network of linear interactions with several latent inputs yielded the best estimates of the noise covariance matrix for cortical microcircuits.  This finding is valuable in itself: improved estimates of the noise covariance matrix for large datasets are important for understanding the role of noise correlations in population coding \citep{Abbott:1999, Sompolinsky:2001, Averbeck:2006, Josic:2009, Ecker:2011}.

Moreover, this estimation approach provides a graphical representation of the dependencies in the data that can be used to formulate and test hypotheses about the structure of connectivity in the microcircuit. Importantly, the inferred functional interactions differed substantially from the network of the most significant correlations.  
For example, the $C_{\sf sparse+latent}$ estimator revealed a large number of negative interactions that were not present in the sample correlation matrix (Fig.~\ref{fig:5} F) and that may reflect inhibitory circuitry.

Distances between cells in physical space and in sensory feature space had a stronger effect on the partial correlations estimated by the $C_{\sf sparse+latent}$ estimator than on sample correlations (Fig.~\ref{fig:7} A--C).
These differences support the idea that correlations are built up from partial correlations in chains of intermediate cells positioned closer and tuned more similarly to one another, with potentially closer correspondence to anatomical connectivity.  These differences may also be at least partially explained by a trivial effect of regularization: the $L_1$ penalty applied by the estimator (Eq.~\ref{eq:ma}) suppresses small partial correlations to a greater extent than large partial correlations, enhancing the apparent effect of distance and tuning.  
Still, the distinct positive and negative connectivity patterns (Fig.~\ref{fig:7} D--F) may reflect geometric and graphical features of local excitatory and inhibitory networks. 
Indeed, the relationships between patterns of positive and negative connectivities inferred by the estimator resembled the properties of excitatory and inhibitory synaptic connectivities with respect to distance, cortical layers, and feature tuning \citep{Song:2005, Oswald:2008, Adesnik:2010, Perin:2011, Fino:2011, Hofer:2011, Isaacson:2011, Levy:2012}. For example, while excitatory neurons form synapses within highly specific local cliques \citep{Perin:2011}, inhibitory interneurons form synapses with nearly all excitatory cells within local microcircuits \citep{Fino:2011, Hofer:2011, Packer:2011}.  To further investigate the link between synaptic connectivity and inferred functional connectivity, in future experiments, we will use molecular markers for various cell types with follow-up multiple whole-cell \emph{in vitro} recordings \citep{Hofer:2011, Ko:2013} to directly compare the inferred functional connectivity graphs to the underlying anatomical circuitry. Finally, the latent units inferred by the estimator can be analyzed for their physiological functions. For example, these latent units may be modulated under different brain states (\emph{e.g.} slow-wave sleep, attention) and stimulus conditions (\emph{e.g.} certain types of stimuli may engage feedback connections) \citep{Reimer:2014,Fu:2014}.