Experimental!!!
The R package OverICA performs overcomplete independent component analysis (ICA). Given $n$ observations of $p$ variables stored in a matrix, OverICA estimates the $p \times k$ mixing matrix $\mathbf{A}$ in the generative model

$$\mathbf{y} = \mathbf{A}\mathbf{x},$$

where $\mathbf{y}$ is an observed $p$-vector, $\mathbf{x}$ is a vector of $k$ mutually independent, non-Gaussian latent variables, and $k$ may exceed $p$ (more latent variables than observed variables, hence "overcomplete").
We opt for a distributional approach: we look for a model whose model-implied moments match the empirical moments of the observed data. Our implementation of ICA estimation involves:
- Matching the moments (or distributions) of model-implied data with observed data
- Ensuring the latent variables remain independent
- Learning the shape of non-Gaussian latent distributions
To avoid parametric assumptions about latent distributions, we use neural networks as flexible transformations:
- Sample latent variables $z$ from standard normal distributions
- Transform each $z_i$ through a neural network to create a non-Gaussian latent variable $x_i$
- Mix the transformed variables using the matrix $A$ to create the observed variables $y$
- Compare distributional properties between generated and observed data
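As an illustration of these steps, here is a minimal sketch of the generative idea using the torch R package. This is not OverICA's internal code; the dimensions, network architecture, and variable names are arbitrary assumptions.

```r
library(torch)

# Sketch: k independent non-Gaussian latents mixed into p observed variables
n <- 1000; p <- 3; k <- 5            # k > p: overcomplete

# One small neural net per latent variable (1 input -> 1 output)
nets <- lapply(seq_len(k), function(i)
  nn_sequential(nn_linear(1, 10), nn_relu(), nn_linear(10, 1)))

A <- torch_randn(p, k)               # mixing matrix (p x k)

z <- torch_randn(n, k)               # standard normal inputs, one column per latent
x <- torch_cat(
  lapply(seq_len(k), function(i) nets[[i]](z[, i]$unsqueeze(2))),  # x_i = f_i(z_i)
  dim = 2
)
y_gen <- torch_matmul(x, A$t())      # generated observed variables (n x p)
```

In OverICA the networks and the mixing matrix are the quantities being optimized, so that the generated data match the observed data distributionally.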
We implement two approaches for distribution matching:

- Empirical Cumulant Generating Function (ECGF):
  - Captures distributional information through generalized covariance matrices
  - Computationally efficient: uses only 2nd-order statistics
  - Avoids explicit computation of higher-order moments
  - Based on Podosinnikova et al. (2019)
- Higher-Order Moments:
  - Explicitly computes and matches moments up to 4th order
  - Includes covariance (2nd), skewness (3rd), and kurtosis (4th)
  - More computationally intensive but potentially more precise
```r
install.packages(c("torch", "clue", "MASS", "devtools"))
devtools::install_github("MichelNivard/OverICA")
```
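The estimation calls below assume `data` is an n x p numeric matrix and `k` is the chosen number of latent components. Purely as a hypothetical illustration (not a package requirement), a toy overcomplete dataset could be simulated like this:

```r
# Toy data: p = 3 observed variables generated from k = 5 independent,
# skewed (centered gamma) latent variables (an overcomplete setting)
set.seed(1)
n <- 5000; p <- 3; k <- 5
x    <- matrix(rgamma(n * k, shape = 2) - 2, n, k)   # zero-mean, non-Gaussian latents
A    <- matrix(rnorm(p * k), p, k)                   # true mixing matrix (p x k)
data <- x %*% t(A)                                   # observed n x p data matrix
```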
```r
library(OverICA)

# ECGF-based estimation
result <- overica(
  data = data,          # n x p data matrix
  k = k,                # Number of latent components
  n_batch = 4096,       # Batch size of generated samples
  num_t_vals = 12,      # Number of t-values for ECGF
  tbound = 0.2,         # Bounds for t-values
  lambda = 0,           # L1 regularization
  sigma = 3,            # Covariance penalty
  hidden_size = 10,     # Hidden-layer size of the latent neural networks
  use_adam = TRUE,      # Optimize with Adam
  adam_epochs = 8000,   # Number of Adam iterations
  adam_lr = 0.1,        # Adam learning rate
  use_lbfgs = FALSE,    # Optionally refine with L-BFGS
  lbfgs_epochs = 45,    # Number of L-BFGS iterations
  lr_decay = 0.999      # Learning-rate decay
)
```
```r
# Moment-based estimation
result <- overica_sem_full(
  data = data,
  k = k,
  moment_func = compute_central_moments,  # Moment computation function
  third = TRUE,                            # Include 3rd order moments
  error_cov = NULL,                        # Known error covariance
  maskB = NULL,                            # Structure constraints on B
  maskA = NULL,                            # Structure constraints on A
  lambdaA = 0.01,                          # L1 penalty on A
  lambdaB = 0.00,                          # L1 penalty on B
  sigma = 0.01                             # Covariance penalty
)
```
For zero-mean variables, the central moments are defined as:

Second Order (Covariance): $\sigma_{ij} = E[y_i\, y_j]$

Third Order (Skewness): $\sigma_{ijk} = E[y_i\, y_j\, y_k]$

Fourth Order (Kurtosis): $\sigma_{ijkl} = E[y_i\, y_j\, y_k\, y_l]$
The model-implied moments and the empirical moments are used to compute the loss, and the model parameters are optimized to minimize that loss.
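As a small illustration of these quantities (plain R; this is not the package's `compute_central_moments` implementation), the empirical central moments of centered data are simple averages of products of columns:

```r
# Empirical central moments of zero-mean (centered) data, up to 4th order
yc <- scale(data, scale = FALSE)                # center each column
n  <- nrow(yc)

cov_hat <- crossprod(yc) / n                    # 2nd order: E[y_i y_j] for all pairs
m3_112  <- mean(yc[, 1] * yc[, 1] * yc[, 2])    # a 3rd-order moment, E[y_1 y_1 y_2]
m4_1122 <- mean(yc[, 1]^2 * yc[, 2]^2)          # a 4th-order moment, E[y_1^2 y_2^2]
```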
For a random vector $y$, the cumulant generating function (CGF) is

$$K(t) = \log E\left[e^{t^\top y}\right].$$

The derivatives of $K(t)$ evaluated at a point $t$ give:
- First derivative: Generalized mean
- Second derivative: Generalized covariance
In OverICA, we:

- Evaluate the empirical CGF at multiple points $t$
- Match the generalized covariances between model and data
- Use stochastic optimization to avoid overfitting to specific $t$ values
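As a concrete sketch (illustrative only; `gen_cov` below is not an OverICA function), the empirical generalized mean and covariance at an evaluation point $t$ are weighted moments, with weights proportional to $e^{t^\top y}$:

```r
# Empirical generalized covariance of data y (n x p) at point t (length p):
# the Hessian of the empirical CGF, i.e. a covariance weighted by exp(t'y)
gen_cov <- function(y, t) {
  w  <- exp(as.vector(y %*% t))   # unnormalized weights exp(t'y_i)
  w  <- w / sum(w)                # normalize
  mu <- colSums(y * w)            # generalized mean (gradient of the CGF)
  yc <- sweep(y, 2, mu)           # center at the generalized mean
  t(yc * w) %*% yc                # generalized covariance (Hessian of the CGF)
}

# Evaluate at a non-zero t; at t = 0 this reduces to the ordinary covariance
gen_cov(scale(data, scale = FALSE), t = rep(0.1, ncol(data)))
```

Matrices like this, evaluated at several $t$ values, provide the 2nd-order statistics that the ECGF loss matches between generated and observed data.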
Additional implementation details:

- Efficient computation using unique moment combinations
- Batch processing with torch tensors
- Optional structural constraints via mask matrices
- L1 penalties for sparse solutions
- Multiple optimization runs for stability
Podosinnikova, A., Perry, A., Wein, A. S., Bach, F., d'Aspremont, A., & Sontag, D. (2019). Overcomplete independent component analysis via SDP. In The 22nd international conference on artificial intelligence and statistics (pp. 2583-2592). PMLR.
Ding, C., Gong, M., Zhang, K., & Tao, D. (2019). Likelihood-free overcomplete ICA and applications in causal discovery. Advances in neural information processing systems, 32.
- Likelihood-Free Overcomplete ICA
- [https://github.com/gilgarmish/oica] Overcomplete ICA through convex optimisation