glamlassoRR {glamlasso}    R Documentation
Penalized reduced rank regression in a GLAM
Description
Efficient design matrix free procedure for fitting large scale penalized reduced rank
regressions in a 3-dimensional generalized linear array model. To obtain a factorization of the parameter array,
the glamlassoRR
function performs a block relaxation scheme within the gdpg algorithm; see Lund and Hansen (2019).
Usage
glamlassoRR(X,
Y,
Z = NULL,
family = "gaussian",
penalty = "lasso",
intercept = FALSE,
weights = NULL,
betainit = NULL,
alphainit = NULL,
nlambda = 100,
lambdaminratio = 1e-04,
lambda = NULL,
penaltyfactor = NULL,
penaltyfactoralpha = NULL,
reltolinner = 1e-07,
reltolouter = 1e-04,
reltolalt = 1e-04,
maxiter = 15000,
steps = 1,
maxiterinner = 3000,
maxiterouter = 25,
maxalt = 10,
btinnermax = 100,
btoutermax = 100,
iwls = "exact",
nu = 1)
Arguments
X: A list containing the 3 tensor components of the tensor design matrix. These are matrices of sizes n_1 x p_1, n_2 x p_2 and n_3 x p_3.
Y: The response values, an array of size n_1 x n_2 x n_3.
Z: The non-tensor structured part of the design matrix, a matrix of size n_1 n_2 n_3 x q. Default is NULL.
family: A string specifying the model family (essentially the response distribution). Possible values are "gaussian", "binomial", "poisson" and "gamma".
penalty: A string specifying the penalty. Possible values are "lasso" and "scad".
intercept: Logical variable indicating whether the model includes an intercept. When intercept = TRUE the first column of the non-tensor design component Z consists of 1s. Default is FALSE.
weights: Observation weights, an array of size n_1 x n_2 x n_3. Default is NULL.
betainit: A list (length 2) containing the initial parameter values for each of the two parameter factors. Default is NULL, in which case all parameters are initialized at 0.01.
alphainit: A q x 1 vector containing the initial parameter values for the non-tensor structured parameters. Default is NULL, in which case all parameters are initialized at zero.
nlambda: The number of lambda values used when lambda is not specified. Default is 100.
lambdaminratio: The smallest value for lambda, given as a fraction of lambda_max, the (data derived) smallest value for which all penalized coefficients are zero.
lambda: The sequence of penalty parameters for the regularization path.
penaltyfactor: A list of length two containing a p_1 x p_2 array and a p_3 x 1 vector of penalty weights. These are multiplied with each element in lambda to allow differential penalization of the two coefficient blocks.
penaltyfactoralpha: A q x 1 vector of penalty weights multiplied with each element in lambda to allow differential penalization of the non-tensor structured coefficients.
reltolinner: The convergence tolerance for the inner loop.
reltolouter: The convergence tolerance for the outer loop.
reltolalt: The convergence tolerance for the alternation loop over the two parameter blocks.
maxiter: The maximum number of inner iterations allowed for each lambda value, summed over all outer iterations for that lambda value.
steps: The number of steps used in the multi-step adaptive lasso algorithm for non-convex penalties. Automatically set to 1 when penalty = "lasso".
maxiterinner: The maximum number of inner iterations allowed for each outer iteration.
maxiterouter: The maximum number of outer iterations allowed for each lambda value.
maxalt: The maximum number of alternations over the two parameter blocks.
btinnermax: The maximum number of backtracking steps allowed in each inner iteration. Default is btinnermax = 100.
btoutermax: The maximum number of backtracking steps allowed in each outer iteration. Default is btoutermax = 100.
iwls: A string indicating whether to use the exact IWLS weight matrix or a tensor structured approximation to it.
nu: A number between 0 and 1 that controls the step size in the proximal algorithm (inner loop) by scaling the upper bound on the Lipschitz constant of the objective. For nu = 1 no backtracking is performed, while values closer to 0 give larger initial step sizes and potentially more backtracking. Default is nu = 1.
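As a minimal sketch of how some of these arguments combine, assuming X and Y have been constructed as in the Examples below (the particular values are only illustrative):
## lasso fit on a shorter lambda sequence with a coarser minimum ratio and a tighter outer tolerance
fit <- glamlassoRR(X, Y, penalty = "lasso", nlambda = 50, lambdaminratio = 1e-03, reltolouter = 1e-05)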
Details
Given the setting from glamlasso
we place a reduced rank restriction on the p_1 \times p_2 \times p_3
parameter array B
given by
B = (B_{i,j,k})_{i,j,k} = (\gamma_{k}\kappa_{i,j})_{i,j,k}, \quad \gamma_k, \kappa_{i,j} \in \mathbb{R}.
The glamlassoRR
function solves the PMLE problem by combining a
block relaxation scheme with the gdpg algorithm. This scheme alternates
between optimizing over the first parameter block \kappa = (\kappa_{i,j})_{i,j}
with \gamma fixed and optimizing over the second block \gamma = (\gamma_k)_k
with \kappa fixed.
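For intuition, this restriction means that B is the outer product of a matrix and a vector, which in R can be formed with outer(); a small illustration with arbitrary dimensions and values (the same construction is used in the Examples below):
kappa <- matrix(rnorm(12 * 6), 12, 6)  ## first factor, p1 x p2
gamma <- rnorm(4)                      ## second factor, length p3
B <- outer(kappa, gamma)               ## p1 x p2 x p3 array with B[i, j, k] = kappa[i, j] * gamma[k]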
Note that the individual parameter blocks are only identified up to a
multiplicative constant. Also note that the algorithm is sensitive to the
initial values betainit,
and poor initial values can prevent convergence.
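If the default start values give convergence problems, alternative initial values can be supplied via betainit. The sketch below assumes the list holds one entry per parameter factor (a p_1 x p_2 matrix and a p_3 x 1 vector); the value 0.1 is only illustrative:
betainit <- list(matrix(0.1, 12, 6), matrix(0.1, 4, 1))  ## illustrative start values for kappa and gamma
fit <- glamlassoRR(X, Y, betainit = betainit)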
Value
An object with S3 Class "glamlasso".
spec: A string indicating the model family and the penalty.
coef12: A p_1 p_2 x nlambda matrix containing the estimates of the first model coefficient factor (kappa) for each lambda value.
coef3: A p_3 x nlambda matrix containing the estimates of the second model coefficient factor (gamma) for each lambda value.
alpha: A q x nlambda matrix containing the estimates of the non-tensor structured parameters for each lambda value.
lambda: A vector containing the sequence of penalty values used in the estimation procedure.
df: The number of nonzero coefficients for each value of lambda.
dimcoef: A vector giving the dimension of the model coefficient array B.
dimobs: A vector giving the dimension of the observation (response) array Y.
Iter: A list with 4 items reporting iteration counts: the total number of backtracking steps in the inner loop, the number of times backtracking was initiated in the inner loop, the total number of backtracking steps in the outer loop, and a matrix containing the number of inner iterations for each lambda value and each outer iteration.
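The two coefficient factors can be recombined into an estimate of the full coefficient array, as also done in the Examples below; a sketch assuming a fitted object fit and the marginal dimensions p1, p2 from the Examples:
modelno <- length(fit$lambda)  ## last model in the lambda sequence
Betahat <- outer(matrix(fit$coef12[, modelno], p1, p2), c(fit$coef3[, modelno]))  ## p1 x p2 x p3 array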
Author(s)
Adam Lund
Maintainer: Adam Lund, adam.lund@math.ku.dk
References
Lund, A., M. Vincent, and N. R. Hansen (2017). Penalized estimation in large-scale generalized linear array models. Journal of Computational and Graphical Statistics, 26(3), 709-724. https://doi.org/10.1080/10618600.2017.1279548.
Lund, A. and N. R. Hansen (2019). Sparse Network Estimation for Dynamical Spatio-temporal Array Models. Journal of Multivariate Analysis, 174. https://doi.org/10.1016/j.jmva.2019.104532.
Examples
##size of example
n1 <- 65; n2 <- 26; n3 <- 13; p1 <- 12; p2 <- 6; p3 <- 4
##marginal design matrices (tensor components)
X1 <- matrix(rnorm(n1 * p1), n1, p1)
X2 <- matrix(rnorm(n2 * p2), n2, p2)
X3 <- matrix(rnorm(n3 * p3), n3, p3)
X <- list(X1, X2, X3)
Beta12 <- matrix(rnorm(p1 * p2), p1, p2) * matrix(rbinom(p1 * p2, 1, 0.5), p1, p2)
Beta3 <- matrix(rnorm(p3) * rbinom(p3, 1, 0.5), p3, 1)
Beta <- outer(Beta12, c(Beta3))
Mu <- RH(X3, RH(X2, RH(X1, Beta)))
Y <- array(rnorm(n1 * n2 * n3, Mu), dim = c(n1, n2, n3))
system.time(fit <- glamlassoRR(X, Y))
modelno <- length(fit$lambda)
oldmfrow <- par()$mfrow
par(mfrow = c(1, 3))
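## compare true coefficients (black) with the estimates for the last model in the lambda sequence (red)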
plot(c(Beta), type = "h")
points(c(Beta))
lines(c(outer(fit$coef12[, modelno], c(fit$coef3[, modelno]))), col = "red", type = "h")
plot(c(Beta12), ylim = range(Beta12, fit$coef12[, modelno]), type = "h")
points(c(Beta12))
lines(fit$coef12[, modelno], col = "red", type = "h")
plot(c(Beta3), ylim = range(Beta3, fit$coef3[, modelno]), type = "h")
points(c(Beta3))
lines(fit$coef3[, modelno], col = "red", type = "h")
par(mfrow = oldmfrow)
### with non-tensor design component Z
q <- 5
alpha <- matrix(rnorm(q)) * rbinom(q, 1, 0.5)
Z <- matrix(rnorm(n1 * n2 * n3 * q), n1 * n2 * n3, q)
Y <- array(rnorm(n1 * n2 * n3, Mu + array(Z %*% alpha, c(n1, n2, n3))), c(n1, n2, n3))
system.time(fit <- glamlassoRR(X, Y, Z))
modelno <- length(fit$lambda)
oldmfrow <- par()$mfrow
par(mfrow = c(2, 2))
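## compare true coefficients and alpha (black) with the estimates for the last model (red)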
plot(c(Beta), type = "h")
points(c(Beta))
lines(c(outer(fit$coef12[, modelno], c(fit$coef3[, modelno]))), col = "red", type = "h")
plot(c(Beta12), ylim = range(Beta12, fit$coef12[, modelno]), type = "h")
points(c(Beta12))
lines(fit$coef12[, modelno], col = "red", type = "h")
plot(c(Beta3), ylim = range(Beta3, fit$coef3[, modelno]), type = "h")
points(c(Beta3))
lines(fit$coef3[, modelno], col = "red", type = "h")
plot(c(alpha), ylim = range(alpha, fit$alpha[, modelno]), type = "h")
points(c(alpha))
lines(fit$alpha[, modelno], col = "red", type = "h")
par(mfrow = oldmfrow)
################ poisson example
set.seed(7954) ## for this seed the algorithm fails to converge for the default initial values!
set.seed(42)   ## use this seed instead
##size of example
n1 <- 65; n2 <- 26; n3 <- 13; p1 <- 12; p2 <- 6; p3 <- 4
##marginal design matrices (tensor components)
X1 <- matrix(rnorm(n1 * p1), n1, p1)
X2 <- matrix(rnorm(n2 * p2), n2, p2)
X3 <- matrix(rnorm(n3 * p3), n3, p3)
X <- list(X1, X2, X3)
Beta12 <- matrix(rnorm(p1 * p2, 0, 0.5) * rbinom(p1 * p2, 1, 0.1), p1, p2)
Beta3 <- matrix(rnorm(p3, 0, 0.5) * rbinom(p3, 1, 0.5), p3, 1)
Beta <- outer(Beta12, c(Beta3))
Mu <- RH(X3, RH(X2, RH(X1, Beta)))
Y <- array(rpois(n1 * n2 * n3, exp(Mu)), dim = c(n1, n2, n3))
system.time(fit <- glamlassoRR(X, Y, family = "poisson"))
modelno <- length(fit$lambda)
oldmfrow <- par()$mfrow
par(mfrow = c(1, 3))
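## compare true coefficients (black) with the estimates for the last model (red)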
plot(c(Beta), type = "h")
points(c(Beta))
lines(c(outer(fit$coef12[, modelno], c(fit$coef3[, modelno]))), col = "red", type = "h")
plot(c(Beta12), ylim = range(Beta12, fit$coef12[, modelno]), type = "h")
points(c(Beta12))
lines(fit$coef12[, modelno], col = "red", type = "h")
plot(c(Beta3), ylim = range(Beta3, fit$coef3[, modelno]), type = "h")
points(c(Beta3))
lines(fit$coef3[, modelno], col = "red", type = "h")
par(mfrow = oldmfrow)