| MGLMsparsereg {MGLM} | R Documentation | 
Fit multivariate GLM sparse regression
Description
Fit sparse regression in multivariate generalized linear models.
Usage
MGLMsparsereg(
  formula,
  data,
  dist,
  lambda,
  penalty,
  weight,
  init,
  penidx,
  maxiters = 150,
  ridgedelta,
  epsilon = 1e-05,
  regBeta = FALSE,
  overdisp
)
MGLMsparsereg.fit(
  Y,
  X,
  dist,
  lambda,
  penalty,
  weight,
  init,
  penidx,
  maxiters = 150,
  ridgedelta,
  epsilon = 1e-05,
  regBeta = FALSE,
  overdisp
)
Arguments
| formula | an object of class formula. | 
| data | an optional data frame, list or environment (or object coercible by 
 | 
| dist | a description of the error distribution to fit. See  | 
| lambda | penalty parameter. | 
| penalty | penalty type for the regularization term. Can be chosen from "sweep", "group", or "nuclear". See Details for the description of each penalty type. | 
| weight | an optional vector of weights assigned to each row of the data. 
Should be  | 
| init | an optional matrix of initial values for the parameter estimates.
Should have dimensions compatible with the data. See  | 
| penidx | a logical vector indicating the variables to be penalized. The default value is  | 
| maxiters | an optional numeric controlling the maximum number of iterations. The default value is maxiters=150. | 
| ridgedelta | an optional numeric controlling the behavior of Nesterov's accelerated proximal gradient method. The default value is  | 
| epsilon | an optional numeric controlling the stopping criterion. The algorithm terminates when the relative change in the objective values of two successive iterates is less than epsilon. The default value is epsilon=1e-05. | 
| regBeta | an optional logical variable used when running negative multinomial regression ( | 
| overdisp | an optional numerical variable used only when fitting sparse negative multinomial 
model  | 
| Y | a matrix containing the multivariate categorical response data. 
Rows of the matrix represent observations, while columns are the different
categories.  Rows and columns of all zeros are automatically removed when
 | 
| X | design matrix (including intercept).
Number of rows of the matrix should match that of  | 
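The stopping rule described for epsilon can be sketched in base R as follows. This is only an illustration: the helper name has_converged and the "+ 1" guard in the denominator are assumptions made for this example, and the exact form used inside the package may differ.

```r
# Hypothetical helper (not part of MGLM): relative change between the
# objective values of two successive iterates, compared against epsilon.
has_converged <- function(obj_old, obj_new, epsilon = 1e-05) {
  # The "+ 1" keeps the ratio well defined when obj_old is near zero;
  # this guard is an assumption of the sketch.
  rel_change <- abs(obj_new - obj_old) / (abs(obj_old) + 1)
  rel_change < epsilon
}

has_converged(100, 100.0001)  # tiny relative change
has_converged(100, 101)       # large relative change
```

A loop built on this check would also stop once maxiters iterations have been used.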
Details
In general, we consider the regularization problem
\min_B h(B) = -l(B) + J(B),
where l(B) is the loglikelihood function and J(B) is the 
regularization function.  
Sparsity in the individual elements of the parameter matrix B is achieved 
by the lasso penalty (penalty="sweep")
J(B) = \lambda \sum_{k \in penidx} \sum_{j=1}^d |B_{kj}|.
Sparsity in the rows of the regression parameter matrix B is achieved
by the group penalty (penalty="group")
J(B) = \lambda \sum_{k \in penidx} \|B_{k \cdot}\|_2,
where \|v\|_2 is the l_2 norm of a vector v. In other words, 
\|B_{k\cdot}\|_2 is the l_2 norm of the k-th row of the 
parameter matrix B.
Sparsity in the rank of the parameter matrix B is achieved by the nuclear norm penalty (penalty="nuclear")
J(B) = \lambda \|B\|_* = \lambda \sum_{i=1}^{\min(p, d)} \sigma_i(B),
where \sigma_i(B) are the singular values of the parameter matrix B. 
The nuclear norm \|B\|_* is a convex relaxation of rank(B)=\|\sigma(B)\|_0.
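To make the three penalty formulas concrete, the following base R sketch evaluates J(B) for a small made-up matrix. The function name penalty_value and the example values are assumptions for illustration only; they are not part of the MGLM package.

```r
# Hypothetical helper (not part of MGLM): evaluate J(B) for each penalty type.
penalty_value <- function(B, lambda,
                          penalty = c("sweep", "group", "nuclear"),
                          penidx = rep(TRUE, nrow(B))) {
  penalty <- match.arg(penalty)
  Bpen <- B[penidx, , drop = FALSE]  # rows (variables) that are penalized
  switch(penalty,
    sweep   = lambda * sum(abs(Bpen)),            # sum of |B_kj|
    group   = lambda * sum(sqrt(rowSums(Bpen^2))),  # sum of row l_2 norms
    nuclear = lambda * sum(svd(B)$d)              # sum of singular values of B
  )
}

# Made-up 3 x 2 matrix with rows (3, 4), (0, 0), (0, 0).
B <- matrix(c(3, 0, 0,
              4, 0, 0), nrow = 3, ncol = 2)

penalty_value(B, lambda = 2, penalty = "sweep")    # 2 * (3 + 4)
penalty_value(B, lambda = 2, penalty = "group")    # 2 * sqrt(3^2 + 4^2)
penalty_value(B, lambda = 2, penalty = "nuclear")  # 2 * (only nonzero singular value)
```

Because this B has a single nonzero row, it has rank one, so the group and nuclear penalties coincide here while the lasso penalty is larger.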
See dist for details about distributions.
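A proximal gradient method typically handles penalties like the three above through their proximal (thresholding) operators. The base R sketch below shows the standard operators for the lasso, group, and nuclear norm penalties. The function names are hypothetical, the threshold t stands for the step size times lambda, and restricting the thresholding to the rows selected by penidx is omitted for brevity; how the package implements these steps internally is not stated here.

```r
# Elementwise soft-thresholding: proximal operator of t * sum |B_kj|.
soft_threshold <- function(B, t) {
  sign(B) * pmax(abs(B) - t, 0)
}

# Row-wise (group) soft-thresholding: proximal operator of t * sum ||B_k.||_2.
group_threshold <- function(B, t) {
  row_norm <- sqrt(rowSums(B^2))
  # Guard against division by zero for all-zero rows.
  shrink <- pmax(1 - t / pmax(row_norm, .Machine$double.eps), 0)
  B * shrink  # the vector recycles down columns, so row k is scaled by shrink[k]
}

# Singular value thresholding: proximal operator of t * ||B||_*.
nuclear_threshold <- function(B, t) {
  s <- svd(B)
  d <- pmax(s$d - t, 0)
  s$u %*% (d * t(s$v))  # same as u %*% diag(d) %*% t(v)
}

# Made-up 2 x 2 matrix with rows (3, 4) and (0, 0).
B <- matrix(c(3, 0, 4, 0), nrow = 2, ncol = 2)
soft_threshold(B, 1)
group_threshold(B, 1)
nuclear_threshold(B, 1)
```

Each operator shrinks toward a different kind of sparsity: individual entries, whole rows, or singular values (and hence rank).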
Value
Returns an object of class "MGLMsparsereg". An object of class "MGLMsparsereg" is a list containing at least the following components:  
- coefficients: the estimated matrix of regression coefficients.
- logL: the final loglikelihood value.
- AIC: Akaike information criterion.
- BIC: Bayesian information criterion.
- Dof: degrees of freedom of the estimated parameter.
- iter: number of iterations used.
- maxlambda: the maximum tuning parameter such that the estimated coefficients are not all zero. This value is returned only when the tuning parameter lambda given to the function is large enough that all the parameter estimates are zero; otherwise, maxlambda is not computed.
- call: a matched call.
- data: the data used to fit the model: a list of the predictor matrix and the response matrix.
- penalty: the penalty chosen when running the penalized regression.
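As a reminder of how logL, Dof, AIC, and BIC usually relate, the base R sketch below applies the standard textbook definitions. The helper name info_criteria and the input values are made up for illustration, and it is an assumption that the returned AIC and BIC components follow exactly these formulas.

```r
# Hypothetical helper (not part of MGLM): standard AIC and BIC definitions.
info_criteria <- function(logL, Dof, n) {
  c(AIC = -2 * logL + 2 * Dof,        # penalty of 2 per degree of freedom
    BIC = -2 * logL + log(n) * Dof)   # penalty of log(n) per degree of freedom
}

# Made-up values: loglikelihood -500, 10 degrees of freedom, 100 observations.
ic <- info_criteria(logL = -500, Dof = 10, n = 100)
ic
```

When comparing fits across a grid of lambda values, the fit with the smallest criterion value is the usual choice.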
Author(s)
Yiwen Zhang and Hua Zhou
Examples
## Generate Dirichlet Multinomial data
library(MGLM)
set.seed(134)  # set the seed before any random draws so the data are reproducible
dist <- "DM"
n <- 100
p <- 15
d <- 5
m <- runif(n, min=0, max=25) + 25
X <- matrix(rnorm(n*p),n, p)
alpha <- matrix(0, p, d)
alpha[c(1,3, 5), ] <- 1
Alpha <- exp(X%*%alpha)
Y <- rdirmn(size=m, alpha=Alpha)
## Tuning
ngridpt <- 10
p <- ncol(X)
d <- ncol(Y)
pen <- 'nuclear'
spfit <- MGLMsparsereg(formula=Y~0+X, dist=dist, lambda=Inf, penalty=pen)
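Fitting with lambda=Inf, as in the last call, forces all estimates to zero so that maxlambda is returned. A tuning grid can then be built below that value. The base R sketch here uses a placeholder for maxlambda so that it runs without the package; the placeholder value and the choice of maxlambda/100 as the lower end of the grid are assumptions made for illustration.

```r
# Placeholder value; in practice this would be read from the maxlambda
# component of the fitted object.
maxlambda <- 50
ngridpt <- 10

# Log-spaced grid from maxlambda/100 up to maxlambda (lower bound is an assumption).
lambdas <- exp(seq(from = log(maxlambda / 100),
                   to = log(maxlambda),
                   length.out = ngridpt))
lambdas
```

Each value in lambdas could then be passed as the lambda argument in a separate fit, and the fits compared by a criterion such as BIC.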