cvm_prioritylasso {prioritylasso}

R Documentation

prioritylasso with several block specifications

Description

Runs prioritylasso for each entry in a list of block specifications and returns the result of the specification with the best cross-validation (CV) error.

Usage

cvm_prioritylasso(
  X,
  Y,
  weights,
  family,
  type.measure,
  blocks.list,
  max.coef.list = NULL,
  block1.penalization = TRUE,
  lambda.type = "lambda.min",
  standardize = TRUE,
  nfolds = 10,
  foldid,
  cvoffset = FALSE,
  cvoffsetnfolds = 10,
  ...
)

Arguments

`X`	a (nxp) matrix of predictors with observations in rows and predictors in columns.
`Y`	n-vector giving the value of the response (either continuous, numeric-binary 0/1, or `Surv` object).
`weights`	observation weights. Default is 1 for each observation.
`family`	should be "gaussian" for continuous `Y`, "binomial" for binary `Y`, "cox" for `Y` of type `Surv`.
`type.measure`	accuracy/error measure computed in cross-validation. It should be "class" (classification error) or "auc" (area under the ROC curve) if `family="binomial"`, "mse" (mean squared error) if `family="gaussian"`, and "deviance" (based on the partial likelihood) if `family="cox"`.
`blocks.list`	list of the format `list(list(bp1=..., bp2=...), list(bp1=..., bp2=...), ...)`, with one inner list per block specification. For the specification of the entries, see `prioritylasso`.
`max.coef.list`	list of `max.coef` vectors. The first entries are omitted if `block1.penalization = FALSE`. Default is `NULL`.
`block1.penalization`	whether the first block should be penalized. Default is TRUE.
`lambda.type`	specifies the value of lambda used for the predictions. `lambda.min` gives the lambda with the minimum cross-validated error. `lambda.1se` gives the largest value of lambda such that the error is within 1 standard error of the minimum. Note that `lambda.1se` can only be chosen if no `max.coef` restrictions are set.
`standardize`	logical, whether the predictors should be standardized or not. Default is TRUE.
`nfolds`	the number of folds in the CV procedure. Default is 10.
`foldid`	an optional vector of values between 1 and `nfolds` identifying which fold each observation is in.
`cvoffset`	logical, whether CV should be used to estimate the offsets. Default is FALSE.
`cvoffsetnfolds`	the number of folds in the CV procedure that is performed to estimate the offsets. Default is 10. Only relevant if `cvoffset=TRUE`.
`...`	other arguments that can be passed to the function `prioritylasso`.
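
The list-valued arguments above can be assembled in base R before calling the function. The following is a minimal sketch, assuming a hypothetical predictor matrix with 50 rows and 500 columns split into three blocks; the names `n`, `p`, `blocks_a`, `blocks_b` and `covers_all` are illustrative and not part of the package. The coverage check reflects the assumption that each predictor belongs to exactly one block.

```r
set.seed(1)
n <- 50        # number of observations (hypothetical)
p <- 500       # number of predictors (hypothetical)
nfolds <- 5

# Two candidate priority orders over the same three column blocks
blocks_a <- list(bp1 = 1:75, bp2 = 76:200, bp3 = 201:500)
blocks_b <- list(bp1 = 1:75, bp2 = 201:500, bp3 = 76:200)
blocks.list <- list(blocks_a, blocks_b)

# Sanity check (assumption): every column appears in exactly one block
covers_all <- vapply(
  blocks.list,
  function(b) identical(sort(unlist(b, use.names = FALSE)), seq_len(p)),
  logical(1)
)

# One max.coef vector per block specification, one entry per block;
# restrictions like these rule out lambda.type = "lambda.1se"
max.coef.list <- list(c(Inf, 10, 5), c(Inf, 5, 10))

# Balanced fold assignment with values between 1 and nfolds
foldid <- sample(rep(seq_len(nfolds), length.out = n))
```

Passing an explicit `foldid` built this way keeps the fold assignment reproducible across runs.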

Value

object of class cvm_prioritylasso with the following elements. If these elements are lists, they contain the results for each penalized block of the best result.

lambda.ind: list with indices of lambda for lambda.type.
lambda.type: type of lambda which is used for the predictions.
lambda.min: list with values of lambda for lambda.type.
min.cvm: list with the mean cross-validated errors for lambda.type.
nzero: list with numbers of non-zero coefficients for lambda.type.
glmnet.fit: list of fitted glmnet objects.
name: a text string indicating type of measure.
block1unpen: if block1.penalization = FALSE, the results of either the fitted glm or coxph object.
best.blocks: character vector with the indices of the best block specification.
best.blocks.indices: list with the indices of the best block specification ordered by best to worst.
best.max.coef: vector with the number of maximal coefficients corresponding to best.blocks.
best.model: complete prioritylasso model of the best solution.
coefficients: coefficients according to the results obtained with best.blocks.
call: the function call.
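
As an illustration of how the returned elements might be inspected, the sketch below fits the simulated Gaussian setting and reads a few of the elements listed above. It assumes the prioritylasso package is installed; the object name `fit` and the seed are arbitrary choices, and no particular output values are implied.

```r
library(prioritylasso)

set.seed(1234)
X <- matrix(rnorm(50 * 500), 50, 500)
Y <- rnorm(50)

fit <- cvm_prioritylasso(
  X = X, Y = Y, family = "gaussian", type.measure = "mse",
  lambda.type = "lambda.min", nfolds = 5,
  blocks.list = list(list(bp1 = 1:75, bp2 = 76:200, bp3 = 201:500),
                     list(bp1 = 1:75, bp2 = 201:500, bp3 = 76:200))
)

class(fit)               # "cvm_prioritylasso"
fit$best.blocks          # the best block specification
unlist(fit$min.cvm)      # mean CV error per penalized block of the best result
unlist(fit$nzero)        # number of non-zero coefficients per block
head(fit$coefficients)   # coefficients of the best model
```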

Note

The function description and the first example are based on the R package ipflasso.

Author(s)

Simon Klau
Maintainer: Roman Hornung (hornung@ibe.med.uni-muenchen.de)

References

Klau, S., Jurinovic, V., Hornung, R., Herold, T., Boulesteix, A.-L. (2018). Priority-Lasso: a simple hierarchical approach to the prediction of clinical outcome using multi-omics data. BMC Bioinformatics 19, 322

Examples

# Gaussian response on simulated data; the two block specifications
# differ in the priority order of the last two blocks
cvm_prioritylasso(X = matrix(rnorm(50*500),50,500), Y = rnorm(50), family = "gaussian",
                  type.measure = "mse", lambda.type = "lambda.min", nfolds = 5,
                  blocks.list = list(list(bp1=1:75, bp2=76:200, bp3=201:500),
                                     list(bp1=1:75, bp2=201:500, bp3=76:200)))
## Not run: 
# Binary response with an unpenalized first block; in both specifications
# the block 29:1028 is limited to at most 10 coefficients
cvm_prioritylasso(X = pl_data[,1:1028], Y = pl_data[,1029], family = "binomial",
                  type.measure = "auc", standardize = FALSE, block1.penalization = FALSE,
                  blocks.list = list(list(1:4, 5:9, 10:28, 29:1028),
                                     list(1:4, 5:9, 29:1028, 10:28)),
                  max.coef.list = list(c(Inf, Inf, Inf, 10), c(Inf, Inf, 10, Inf)))
## End(Not run)

[Package prioritylasso version 0.3.1 Index]