cv.DMR {DMRnet}    R Documentation

Cross-validation for DMR

Description

Runs k-fold cross-validation for DMR and returns the df (number of parameters) of the model with the smallest cross-validated prediction error.

Usage

cv.DMR(
  X,
  y,
  family = "gaussian",
  clust.method = "complete",
  lam = 10^(-7),
  nfolds = 10,
  indexation.mode = "GIC"
)

Arguments

X

Input data frame of dimension n x p, where each row is an observation vector. DMR works only if p < n; for p >= n, see DMRnet. Columns can be numeric or integer for continuous predictors, or factors for categorical predictors.
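
As a small illustration of the input format described above, the following base R sketch builds a data frame with continuous, integer and categorical columns and checks the p < n condition. The data and column names are made up for illustration and are not part of DMRnet.

```r
# Hypothetical predictors: one continuous, one integer, one categorical.
X <- data.frame(
  area    = c(45.5, 60.0, 72.3, 38.0, 90.1, 55.2),
  rooms   = c(2L, 3L, 3L, 1L, 4L, 2L),
  heating = factor(c("gas", "electric", "gas", "district", "gas", "electric"))
)

# Every column should be numeric/integer (continuous) or a factor (categorical).
col_ok <- vapply(X, function(col) is.numeric(col) || is.factor(col), logical(1))

# n observations (rows) and p predictors (columns); DMR needs p < n.
n <- nrow(X)
p <- ncol(X)
print(col_ok)
print(c(n = n, p = p))
```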

y

Response variable: numeric for family="gaussian", or a factor with two levels for family="binomial". For family="binomial", the last level in alphabetical order is the target class.
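
To see which class would be treated as the target, the level order of a factor can be inspected with base R. This is a minimal sketch that assumes the factor keeps R's default (sorted) level order; the labels are made-up data.

```r
# Hypothetical two-class response; the labels are illustrative only.
labels <- c("healthy", "sick", "sick", "healthy", "sick")
y <- factor(labels)

# factor() sorts levels alphabetically by default, so the last level
# is the one the documentation refers to as the target class.
lev <- levels(y)
target <- lev[length(lev)]
print(lev)
print(target)
```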

family

Response type; one of: "gaussian", "binomial".

clust.method

Clustering method used for partitioning levels of factors; see function hclust in package stats for details. clust.method="complete" is the default.
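
The sketch below shows how hclust from package stats groups items under complete linkage. The coefficient values and the use of plain absolute differences as distances are assumptions made for illustration; they are not the dissimilarity that DMR itself computes.

```r
# Hypothetical coefficient estimates for five levels of one factor.
beta <- c(a = 0.1, b = 0.2, c = 1.5, d = 1.6, e = 4.0)

# Pairwise distances between levels, then complete-linkage clustering.
d <- dist(beta)
hc <- hclust(d, method = "complete")

# Cutting the tree into 3 groups merges levels with similar estimates.
groups <- cutree(hc, k = 3)
print(groups)
```

Other linkage names accepted by hclust (for example "single" or "average") can be passed in the same way.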

lam

The amount of penalization in ridge regression (used for logistic regression in order to allow for parameter estimation in linearly separable setups) or the amount of matrix regularization in case of linear regression. Used only for numerical reasons. The default is 1e-7.
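
The numerical role of a tiny regularization term can be sketched in base R. The example assumes the regularization is a small multiple of the identity added to X'X, which is a common approach and not necessarily how DMRnet applies lam internally; the data are made up.

```r
# Hypothetical design matrix with two perfectly collinear columns.
X <- cbind(1, c(1, 2, 3, 4), c(2, 4, 6, 8))
y <- c(1.1, 1.9, 3.2, 3.9)
lam <- 1e-7

# X'X is singular here because column 3 equals 2 * column 2.
G <- crossprod(X)

# Assumed form of the regularization: a small ridge term on the diagonal.
G_reg <- G + lam * diag(ncol(X))

# The regularized system can be solved, whereas solve(G, ...) would
# typically stop with a singularity error.
beta <- solve(G_reg, crossprod(X, y))
fitted <- drop(X %*% beta)
print(round(fitted, 3))
```

Because lam is so small, the fitted values stay very close to those of an ordinary least-squares fit on the non-redundant columns.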

nfolds

Number of folds in cross-validation. The default value is 10.

indexation.mode

How the cross-validation algorithm should index the models for internal quality comparisons; one of: "GIC" (the default) for GIC-indexed cross-validation, or "dimension" for model dimension-indexed cross-validation.

Details

The cv.DMR algorithm performs cross-validation for DMR with nfolds folds and returns the df of the model with the minimal estimated prediction error.
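
The general k-fold procedure can be sketched in base R. In this sketch, lm() fits of increasing polynomial degree stand in for the sequence of DMR models, and the simulated data and the names max_df, err and df_min are hypothetical; it illustrates the idea only and is not the cv.DMR implementation.

```r
set.seed(1)

# Simulated data; lm() models of growing size stand in for the DMR path.
n <- 100
x <- runif(n)
y <- 1 + 2 * x + rnorm(n, sd = 0.3)
nfolds <- 10
max_df <- 4   # candidate models: polynomial degree 1..3, i.e. df = 2..4

# Randomly assign each observation to one of nfolds equally sized folds.
foldid <- sample(rep(seq_len(nfolds), length.out = n))

# err[k, j]: mean squared prediction error of model j on held-out fold k.
err <- matrix(NA_real_, nfolds, max_df - 1)
for (k in seq_len(nfolds)) {
  train <- data.frame(x = x[foldid != k], y = y[foldid != k])
  test  <- data.frame(x = x[foldid == k], y = y[foldid == k])
  for (deg in seq_len(max_df - 1)) {
    fit <- lm(y ~ poly(x, deg, raw = TRUE), data = train)
    err[k, deg] <- mean((test$y - predict(fit, newdata = test))^2)
  }
}

# Mean cross-validated error per model, and the df of the best one.
cvm <- colMeans(err)
df_min <- which.min(cvm) + 1   # df = polynomial degree + intercept
print(cvm)
print(df_min)
```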

Value

An object with S3 class "cv.DMR" is returned, which is a list with the components of the cross-validation fit.

df.min

df (number of parameters) of the model with minimal cross-validated error.

df.1se

df (number of parameters) of the smallest model whose cross-validated prediction error falls under the upper curve, that is, the minimal prediction error plus one standard deviation.
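
The relation between df.min and df.1se can be illustrated with a short base R sketch. The error values are made up, and taking the standard deviation at the minimizing model follows the common one-standard-error convention; this is an assumption and may differ in detail from the package's computation.

```r
# Hypothetical cross-validation summary for models with df = 1..6.
df   <- 1:6
cvm  <- c(2.10, 1.40, 1.05, 1.00, 1.02, 1.08)  # mean CV error
cvsd <- c(0.20, 0.15, 0.10, 0.10, 0.11, 0.12)  # its standard deviation

# Model with the minimal cross-validated error.
i_min <- which.min(cvm)
df_min <- df[i_min]

# Threshold: minimal error plus one standard deviation at the minimum.
threshold <- cvm[i_min] + cvsd[i_min]

# Smallest model whose error does not exceed the threshold.
df_1se <- min(df[cvm <= threshold])
print(c(df.min = df_min, df.1se = df_1se))
```

Here the minimum is at df = 4, while the more parsimonious choice is df = 3, since its error of 1.05 lies below the threshold of 1.10.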

dmr.fit

Fitted DMR object for the full data.

cvm

The mean cross-validated error for the entire sequence of models.

foldid

The fold assignments used.

See Also

plot.cv.DMR for plotting, coef.cv.DMR for extracting coefficients, and predict.cv.DMR for prediction.

Examples

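## cv.DMR for linear regression
library(DMRnet)
set.seed(13)                      # fold assignment is random
data(miete)
ytr <- miete$rent[1:1500]         # training response
Xtr <- miete$area[1:1500]         # training predictor (area only)
Xte <- miete$area[1501:2053]      # held-out predictor values
cv <- cv.DMR(Xtr, ytr)            # 10-fold CV with default settings
print(cv)
plot(cv)                          # cross-validated error curve
coef(cv)                          # coefficients of the selected model
ypr <- predict(cv, newx = Xte)    # predictions for the held-out rows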


[Package DMRnet version 0.4.0 Index]