coroICA {coroICA}R Documentation

coroICA

Description

Estimates the unmixing matrix V=A^-1 of a confounded ICA model of the form X=AS+H, where H is confounding noise which is group-wise stationary and S are non-stationary signal sources. The function can also be used without a group-structure (i.e., using a single group) in which it corresponds to a noisy ICA that allows for arbitrary stationary noise H.

Usage

coroICA(X, group_index = NA, partition_index = NA, n_components = NA,
  n_components_uwedge = NA, rank_components = FALSE,
  pairing = "complement", max_matrices = 1, groupsize = 1,
  partitionsize = NA, timelags = NA, instantcov = TRUE,
  max_iter = 1000, tol = 1e-12, minimize_loss = FALSE,
  condition_threshold = NA, silent = TRUE)

Arguments

X

data matrix. Each column corresponds to one predictor variable.

group_index

vector coding to which group each sample belongs, with length(group_index)=nrow(X). If no group index is provided a rigid grid with groupsize samples per group is used (which defaults to all samples if groupsize was not set).

partition_index

vector coding to which partition each sample belongs, with length(partition_index)=nrow(X). If no partition index is provided a rigid grid with partitionsize samples per partition is used.

n_components

number of components to extract. If NA is passed, the same number of components as the input has dimensions is used.

n_components_uwedge

number of components to extract during uwedge approximate joint diagonalization of the matrices. If NA is passed, the same number of components as the input has dimensions is used.

rank_components

boolean, optional. When TRUE, the components will be ordered in decreasing stability.

pairing

either 'complement', 'neighbouring' or 'allpairs'. If 'allpairs' the difference matrices are computed for all pairs of partition covariance matrices, if 'complement' a one-vs-complement scheme is used and if 'neighbouring' differences with the right neighbour parition are used.

max_matrices

float or 'no_partitions', optional (default=1). The fraction of (lagged) covariance matrices to use during training or, if 'no_partitions', at most as many covariance matrices are used as there are partitions.

groupsize

int, optional. Approximate number of samples in each group when using a rigid grid as groups. If NA is passed, all samples will be in one group unless group_index is passed during fitting in which case the provided group index is used (the latter is the advised and preferred way).

partitionsize

int or vector of ints, optional. Approximate number of samples in each partition when using a rigid grid as partition. If NA is passed, a (hopefully sane) default is used, again, unless partition_index is passed during fitting in which case the provided partition index is used. If a vector is passed, each element is used to construct a grid and all resulting partitions are used.

timelags

vector of ints, optional. Specifies which timelags should be included. 0 correpsonds to covariance matrix.

instantcov

boolean, default TRUE. Specifies whether to include covariance matrix when timelags are used.

max_iter

int, optional. Maximum number of iterations for the uwedge approximate joint diagonalisation during fitting.

tol

float, optional. Tolerance for terminating the uwedge approximate joint diagonalisation during fitting.

minimize_loss

boolean, optional. Parameter is passed to uwedge and specifies whether to compute loss function in each iteration step of uwedge.

condition_threshold

float, optional. Parameter is passed to uwedge and specifies whether and at which threshold to terminate uwedge iteration depending on the condition number of the unmixing matrix.

silent

boolean whether to supress status outputs.

Details

For further details see the references.

Value

object of class 'CoroICA' consisting of the following elements

V

the unmixing matrix.

coverged

boolean indicating whether the approximate joint diagonalisation converged due to tol.

n_iter

number of iterations of the approximate joint diagonalisation.

meanoffdiag

mean absolute value of the off-diagonal values of the to be jointly diagonalised matrices, i.e., a proxy of the approximate joint diagonalisation objective function.

Author(s)

Niklas Pfister and Sebastian Weichwald

References

Pfister, N., S. Weichwald, P. Bühlmann and B. Schölkopf (2018). Robustifying Independent Component Analysis by Adjusting for Group-Wise Stationary Noise ArXiv e-prints (arXiv:1806.01094).

Project website (https://sweichwald.de/coroICA/)

See Also

The function uwedge allows to perform to perform an approximate joint matrix diagonalization.

Examples

## Example
set.seed(1)

# Generate data from a block-wise variance model
d <- 2
m <- 10
n <- 5000
group_index <- rep(c(1,2), each=n)
partition_index <- rep(rep(1:m, each=n/m), 2)
S <- matrix(NA, 2*n, d)
H <- matrix(NA, 2*n, d)
for(i in unique(group_index)){
  varH <- abs(rnorm(d))/4
  H[group_index==i, ] <- matrix(rnorm(d*n)*rep(varH, each=n), n, d)
  for(j in unique(partition_index[group_index==i])){
    varS <- abs(rnorm(d))
    index <- partition_index==j & group_index==i
    S[index,] <- matrix(rnorm(d*n/m)*rep(varS, each=n/m),
                                                     n/m, d)
  }
}
A <- matrix(rnorm(d^2), d, d)
A <- A%*%t(A)
X <- t(A%*%t(S+H))

# Apply coroICA
res <- coroICA(X, group_index, partition_index, pairing="allpairs", rank_components=TRUE)

# Compare results
par(mfrow=c(2,2))
plot((S+H)[,1], type="l", main="true source 1", ylab="S+H")
plot(res$Shat[,1], type="l", main="estimated source 1", ylab="Shat")
plot((S+H)[,2], type="l", main="true source 2", ylab="S+H")
plot(res$Shat[,2], type="l", main="estimated source 2", ylab="Shat")
cor(res$Shat, S+H)

[Package coroICA version 1.0.2 Index]