gofar_p {gofar}R Documentation

Generalize Exclusive factor extraction via co-sparse unit-rank estimation (GOFAR(P)) using k-fold crossvalidation

Description

Divide and conquer approach for low-rank and sparse coefficent matrix estimation: Exclusive extraction

Usage

gofar_p(
  Yt,
  X,
  nrank = 3,
  nlambda = 40,
  family,
  familygroup = NULL,
  cIndex = NULL,
  ofset = NULL,
  control = list(),
  nfold = 5,
  PATH = FALSE
)

Arguments

Yt

response matrix

X

covariate matrix; when X = NULL, the fucntion performs unsupervised learning

nrank

an integer specifying the desired rank/number of factors

nlambda

number of lambda values to be used along each path

family

set of family gaussian, bernoulli, possion

familygroup

index set of the type of multivariate outcomes: "1" for Gaussian, "2" for Bernoulli, "3" for Poisson outcomes

cIndex

control index, specifying index of control variable in the design matrix X

ofset

offset matrix specified

control

a list of internal parameters controlling the model fitting

nfold

number of fold for cross-validation

PATH

TRUE/FALSE for generating solution path of sequential estimate after cross-validation step

Value

C

estimated coefficient matrix; based on GIC

Z

estimated control variable coefficient matrix

Phi

estimted dispersion parameters

U

estimated U matrix (generalize latent factor weights)

D

estimated singular values

V

estimated V matrix (factor loadings)

lam

selected lambda values based on the chosen information criterion

lampath

sequences of lambda values used in model fitting. In each sequential unit-rank estimation step, a sequence of length nlambda is first generated between (lamMaxlamMaxFac, lamMaxlamMaxFac*lamMinFac) equally spaced on the log scale, in which lamMax is estimated and the other parameters are specified in gofar_control. The model fitting starts from the largest lambda and stops when the maximum proportion of nonzero elements is reached in either u or v, as specified by spU and spV in gofar_control.

IC

values of information criteria

Upath

solution path of U

Dpath

solution path of D

Vpath

solution path of D

ObjDec

boolian type matrix outcome showing if objective function is monotone decreasing or not.

familygroup

spcified familygroup of outcome variables.

References

Mishra, Aditya, Dipak K. Dey, Yong Chen, and Kun Chen. Generalized co-sparse factor regression. Computational Statistics & Data Analysis 157 (2021): 107127

Examples


family <- list(gaussian(), binomial(), poisson())
control <- gofar_control()
nlam <- 40 # number of tuning parameter
SD <- 123

# Simulated data for testing

data('simulate_gofar')
attach(simulate_gofar)
q <- ncol(Y)
p <- ncol(X)
# Simulate data with 20% missing entries
miss <- 0.20 # Proportion of entries missing
t.ind <- sample.int(n * q, size = miss * n * q)
y <- as.vector(Y)
y[t.ind] <- NA
Ym <- matrix(y, n, q)
naind <- (!is.na(Ym)) + 0 # matrix(1,n,q)
misind <- any(naind == 0) + 0
#
# Model fitting begins:
control$epsilon <- 1e-7
control$spU <- 50 / p
control$spV <- 25 / q
control$maxit <- 1000
# Model fitting: GOFAR(P) (full data)
set.seed(SD)
rank.est <- 5

fit.eea <- gofar_p(Y, X,
  nrank = rank.est, nlambda = nlam,
  family = family, familygroup = familygroup,
  control = control, nfold = 5
)

# Model fitting: GOFAR(P) (missing data)
set.seed(SD)
rank.est <- 5
fit.eea.m <- gofar_p(Ym, X,
  nrank = rank.est, nlambda = nlam,
  family = family, familygroup = familygroup,
  control = control, nfold = 5
)


[Package gofar version 0.1 Index]