gofar_p {gofar} | R Documentation |
Generalized exclusive factor extraction via co-sparse unit-rank estimation (GOFAR(P)) using k-fold cross-validation
Description
Divide-and-conquer approach for low-rank and sparse coefficient matrix estimation: exclusive extraction
Usage
gofar_p(
Yt,
X,
nrank = 3,
nlambda = 40,
family,
familygroup = NULL,
cIndex = NULL,
ofset = NULL,
control = list(),
nfold = 5,
PATH = FALSE
)
Arguments
Yt |
response matrix |
X |
covariate matrix; when X = NULL, the function performs unsupervised learning |
nrank |
an integer specifying the desired rank/number of factors |
nlambda |
number of lambda values to be used along each path |
family |
set of families: Gaussian, Bernoulli, Poisson |
familygroup |
index set of the type of multivariate outcomes: "1" for Gaussian, "2" for Bernoulli, "3" for Poisson outcomes |
cIndex |
control index, specifying index of control variable in the design matrix X |
ofset |
offset matrix specified |
control |
a list of internal parameters controlling the model fitting |
nfold |
number of folds for cross-validation |
PATH |
TRUE/FALSE for generating the solution path of the sequential estimates after the cross-validation step |
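As a minimal sketch of how family and familygroup fit together for mixed outcomes, the snippet below builds both arguments for a hypothetical response matrix with 15 continuous and 15 binary columns. The column counts are assumptions for illustration, and the snippet assumes, as the Examples suggest, that each familygroup entry indexes into the family list.

```r
# Hypothetical layout: 30 outcome columns, the first 15 continuous,
# the last 15 binary (these counts are assumptions for illustration).
q_gaussian <- 15
q_binary <- 15

# One family object per outcome type, in the order Gaussian, Bernoulli, Poisson.
family <- list(gaussian(), binomial(), poisson())

# One entry per column of Yt: 1 = Gaussian, 2 = Bernoulli, 3 = Poisson.
familygroup <- c(rep(1, q_gaussian), rep(2, q_binary))

# Look up the family name that applies to each outcome column.
outcome_family <- vapply(family[familygroup], function(f) f$family, character(1))
table(outcome_family)
```

Both objects can then be passed as the family and familygroup arguments of gofar_p.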
Value
C |
estimated coefficient matrix; based on GIC |
Z |
estimated control variable coefficient matrix |
Phi |
estimated dispersion parameters |
U |
estimated U matrix (generalized latent factor weights) |
D |
estimated singular values |
V |
estimated V matrix (factor loadings) |
lam |
selected lambda values based on the chosen information criterion |
lampath |
sequences of lambda values used in model fitting. In each sequential unit-rank estimation step, a sequence of length nlambda is first generated between (lamMax*lamMaxFac, lamMax*lamMaxFac*lamMinFac), equally spaced on the log scale, in which lamMax is estimated and the other parameters are specified in gofar_control. The model fitting starts from the largest lambda and stops when the maximum proportion of nonzero elements is reached in either u or v, as specified by spU and spV in gofar_control. |
IC |
values of information criteria |
Upath |
solution path of U |
Dpath |
solution path of D |
Vpath |
solution path of V |
ObjDec |
Boolean matrix indicating whether the objective function is monotonically decreasing. |
familygroup |
specified familygroup of outcome variables. |
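The lampath entry above describes a grid of lambda values equally spaced on the log scale. A minimal sketch of such a grid in base R follows. The values of lamMax, lamMaxFac and lamMinFac are hypothetical placeholders, not the package defaults, and this is not the package's internal code.

```r
# Hypothetical inputs: in the package, lamMax is estimated from the data and
# the two factors come from gofar_control(); the numbers here are placeholders.
lamMax <- 10
lamMaxFac <- 1
lamMinFac <- 1e-4
nlambda <- 40

# End points of the grid, following the interval given in the lampath entry.
lam_hi <- lamMax * lamMaxFac
lam_lo <- lamMax * lamMaxFac * lamMinFac

# Equal spacing on the log scale, ordered from the largest lambda down,
# since fitting is described as starting from the largest value.
lam_grid <- exp(seq(log(lam_hi), log(lam_lo), length.out = nlambda))

range(lam_grid)
```

Consecutive values in lam_grid differ by a constant ratio, which is what equal spacing on the log scale means.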
References
Mishra, Aditya, Dipak K. Dey, Yong Chen, and Kun Chen. Generalized co-sparse factor regression. Computational Statistics & Data Analysis 157 (2021): 107127
Examples
family <- list(gaussian(), binomial(), poisson())
control <- gofar_control()
nlam <- 40 # number of tuning parameter values
SD <- 123
# Simulated data for testing
data('simulate_gofar')
attach(simulate_gofar)
n <- nrow(Y) # number of observations
q <- ncol(Y) # number of outcomes
p <- ncol(X) # number of predictors
# Simulate data with 20% missing entries
miss <- 0.20 # Proportion of entries missing
t.ind <- sample.int(n * q, size = miss * n * q)
y <- as.vector(Y)
y[t.ind] <- NA
Ym <- matrix(y, n, q)
naind <- (!is.na(Ym)) + 0 # indicator matrix: 1 = observed, 0 = missing
misind <- any(naind == 0) + 0 # 1 if any entry is missing, 0 otherwise
#
# Model fitting begins:
control$epsilon <- 1e-7
control$spU <- 50 / p
control$spV <- 25 / q
control$maxit <- 1000
# Model fitting: GOFAR(P) (full data)
set.seed(SD)
rank.est <- 5
fit.eea <- gofar_p(Y, X,
nrank = rank.est, nlambda = nlam,
family = family, familygroup = familygroup,
control = control, nfold = 5
)
# Model fitting: GOFAR(P) (missing data)
set.seed(SD)
rank.est <- 5
fit.eea.m <- gofar_p(Ym, X,
nrank = rank.est, nlambda = nlam,
family = family, familygroup = familygroup,
control = control, nfold = 5
)
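As a sketch of how the returned U, D and V components might be read, the snippet below uses a small hand-built stand-in for a fitted object rather than a real gofar_p result. It assumes the low-rank part of the coefficient matrix takes the form U diag(D) V', which this page does not state explicitly. All matrices and the name mock_fit are hypothetical.

```r
# Hypothetical stand-in for a fitted object: 6 predictors, 4 outcomes, rank 2.
mock_fit <- list(
  U = cbind(c(1, 1, 0, 0, 0, 0) / sqrt(2), c(0, 0, 1, -1, 0, 0) / sqrt(2)),
  D = c(3, 1.5),
  V = cbind(c(1, 0, 0, 0), c(0, 1, 0, 0))
)

# Assumed reconstruction of the low-rank coefficient matrix: U diag(D) V'.
C_hat <- mock_fit$U %*% diag(mock_fit$D, nrow = length(mock_fit$D)) %*% t(mock_fit$V)

# Number of factors with a nonzero singular value.
est_rank <- sum(mock_fit$D > 0)

# Rows of U and V with any nonzero entry: predictors and outcomes that
# take part in at least one factor under the co-sparse structure.
active_predictors <- which(rowSums(mock_fit$U != 0) > 0)
active_outcomes <- which(rowSums(mock_fit$V != 0) > 0)
```

With a real fit, fit.eea$U, fit.eea$D and fit.eea$V would take the place of the mock_fit components.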