compCL {Compack}  R Documentation 
Fit a regression with compositional predictors via the penalized log-contrast model proposed by Lin et al. (2014) <doi:10.1093/biomet/asu031>.
The model is estimated by minimizing a linearly constrained lasso criterion. The regularization path is
computed over a grid of values of the tuning parameter lambda.
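A minimal sketch of what a lambda grid of this kind can look like, in base R. The log-spaced layout and the value of `lam_max` are assumptions made for illustration (a common convention for lasso paths), not a statement of how compCL builds its grid internally; `lambda_factor` and `nlam` mirror the `lambda.factor` and `nlam` arguments.

```r
# Hypothetical values for illustration only.
lam_max <- 2.0          # assumed largest lambda on the path
lambda_factor <- 0.001  # assumed ratio of the smallest to the largest lambda
nlam <- 100             # number of grid points

# Decreasing, log-spaced sequence from lam_max down to lambda_factor * lam_max.
lam <- exp(seq(log(lam_max), log(lambda_factor * lam_max), length.out = nlam))

range(lam)
```

A sequence built this way could be supplied through the `lam` argument in place of the automatically generated grid.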
compCL(y, Z, Zc = NULL, intercept = TRUE,
       lam = NULL, nlam = 100, lambda.factor = ifelse(n < p, 0.05, 0.001),
       pf = rep(1, times = p), dfmax = p, pfmax = min(dfmax * 1.5, p),
       u = 1, mu_ratio = 1.01, tol = 1e-10,
       inner_maxiter = 1e+4, inner_eps = 1e-6,
       outer_maxiter = 1e+08, outer_eps = 1e-8)
y 
a response vector with length n. 
Z 
an n*p design matrix of compositional data or categorical data.
If 
Zc 
an n*p_c design matrix of control variables (not penalized). Default is NULL.
intercept 
Boolean, specifying whether to include an intercept.
Default is TRUE.
lam 
a user supplied lambda sequence.
If 
nlam 
the length of the lam sequence. Default is 100.
lambda.factor 
the factor for getting the minimal lambda in the lam sequence. Default is 0.05 if n < p and 0.001 otherwise.
pf 
penalty factor, a vector of length p. Zero implies no shrinkage. Default value for each entry is 1. 
dfmax 
limit the maximum number of groups in the model. Useful for handling very large p, if a partial path is desired. Default is p. 
pfmax 
limit the maximum number of groups ever to be nonzero. For example once a group enters the model along the path,
no matter how many times it reenters the model through the path, it will be counted only once.
Default is min(dfmax * 1.5, p).
u 
the initial value of the penalty parameter of the augmented Lagrangian method adopted in the outer loop. Default value is 1.
mu_ratio 
the increasing ratio, with value at least 1, for u. Default value is 1.01.
tol 
tolerance for the estimated coefficients to be considered as nonzero, i.e., if abs(β_j) < tol, β_j is treated as zero. Default value is 1e-10.
inner_maxiter, inner_eps 

outer_maxiter, outer_eps 

The logcontrast regression model with compositional predictors is expressed as
y = Zβ + e, s.t. ∑_{j=1}^{p}β_j=0,
where Z is the n-by-p design matrix of log-transformed compositional data,
β is the p-vector of regression coefficients,
and e is an n-vector of random errors.
If the original compositional data contain zeros, the user should preprocess them, since the log transform is undefined at zero.
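One simple way to preprocess zeros is to replace them with a small positive value and rescale each row to sum to 1. The sketch below uses base R only; the toy matrix `X` and the pseudo-value `delta` are assumptions chosen for illustration, and this is one of several possible strategies rather than a method prescribed by the package.

```r
# Toy composition matrix: rows are samples, columns are components.
X <- rbind(c(0.50, 0.30, 0.20, 0.00),
           c(0.10, 0.00, 0.60, 0.30),
           c(0.25, 0.25, 0.25, 0.25))

# Assumed pseudo-value: half of the smallest observed positive proportion.
delta <- 0.5 * min(X[X > 0])

X_pos <- X
X_pos[X_pos == 0] <- delta        # replace zeros with the pseudo-value
X_pos <- X_pos / rowSums(X_pos)   # rescale so that each row sums to 1 again

# After replacement every entry is positive, so the log transform is finite.
log_X <- log(X_pos)
all(is.finite(log_X))
```

The choice of `delta` affects the log-transformed values of the replaced entries, so it is worth checking how sensitive the fitted model is to it.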
To enable variable selection, we conduct model estimation via the linearly constrained lasso
argmin_{β} ( \frac{1}{2n} \|y - Zβ\|_2^2 + λ\|β\|_1 ), s.t. ∑_{j=1}^{p} β_j = 0.
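The criterion and the zero-sum constraint can be made concrete with a short base-R sketch. The simulated data and the helper `lasso_objective` are hypothetical and exist only for illustration; the helper evaluates the criterion at a given β and does not reproduce the solver used by compCL.

```r
set.seed(1)
n <- 20
p <- 5

# Simulated compositions (each row sums to 1) and their log transform.
W <- matrix(rexp(n * p), n, p)
X <- W / rowSums(W)
Z <- log(X)

beta <- c(1, -0.5, -0.5, 0, 0)   # satisfies sum(beta) == 0
y <- drop(Z %*% beta) + rnorm(n, sd = 0.1)

# Hypothetical helper: value of the penalized criterion at a feasible beta.
lasso_objective <- function(beta, y, Z, lambda) {
  stopifnot(abs(sum(beta)) < 1e-8)   # enforce the zero-sum constraint
  resid <- y - drop(Z %*% beta)
  sum(resid^2) / (2 * length(y)) + lambda * sum(abs(beta))
}

lasso_objective(beta, y, Z, lambda = 0.1)

# Because sum(beta) == 0, rescaling the compositions leaves Z %*% beta unchanged.
Z_scaled <- log(10 * X)
all.equal(drop(Z %*% beta), drop(Z_scaled %*% beta))
```

The last check illustrates why the constraint matters: multiplying the compositions by a constant adds the same value to every log-transformed entry in a row, and that shift cancels when the coefficients sum to zero.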
An object of S3 class "compCL",
which is a list containing:
beta 
a matrix of coefficients with p+p_c+1 rows.
If 
lam 
the sequence of lam values.
df 
the number of nonzero β_j's in the estimated coefficients for each value of lam.
npass 
total iterations. 
error 
error messages. If 0, no error occurs. 
call 
the call that produces this object. 
dim 
dimension of the coefficient matrix 
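To illustrate how a count such as df relates to a coefficient matrix such as beta, here is a base-R sketch on a made-up matrix. The numbers are hypothetical, and the row layout (p composition coefficients first, then the control coefficient, then the intercept) is an assumption made for this example.

```r
# Hypothetical coefficient matrix: p = 4 composition coefficients,
# p_c = 1 control coefficient and an intercept, for 3 lambda values.
p <- 4
beta <- cbind(c(0.0,  0.0,  0.0, 0, 0.3, 1.2),
              c(0.5, -0.5,  0.0, 0, 0.2, 1.1),
              c(0.8, -0.6, -0.2, 0, 0.1, 1.0))

tol <- 1e-10

# Number of nonzero composition coefficients (first p rows) in each column.
df <- colSums(abs(beta[seq_len(p), , drop = FALSE]) > tol)
df

# Each column of composition coefficients satisfies the zero-sum constraint.
colSums(beta[seq_len(p), , drop = FALSE])
```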
Zhe Sun and Kun Chen
Lin, W., Shi, P., Peng, R. and Li, H. (2014) Variable selection in regression with compositional covariates, https://academic.oup.com/biomet/article/101/4/785/1775476. Biometrika 101(4), 785-797.
coef, predict, print and plot methods for "compCL" objects, and cv.compCL and GIC.compCL.
library(Compack)

p = 30
n = 50
# True coefficients; the nonzero entries sum to zero
beta = c(1, -0.8, 0.6, 0, 0, -1.5, -0.5, 1.2)
beta = c(beta, rep(0, times = p - length(beta)))
Comp_data = comp_Model(n = n, p = p, beta = beta, intercept = FALSE)
m1 <- compCL(y = Comp_data$y, Z = Comp_data$X.comp, Zc = Comp_data$Zc,
             intercept = Comp_data$intercept)
print(m1)
plot(m1)
beta = coef(m1)
# Predict on an independently simulated test set
Test_data = comp_Model(n = 30, p = p, beta = Comp_data$beta, intercept = FALSE)
predmat = predict(m1, Znew = Test_data$X.comp, Zcnew = Test_data$Zc)