kfold.vld {LGDtoolkit}R Documentation

K-fold model cross-validation

Description

kfold.vld performs k-fold model cross-validation. The main goal of this procedure is to generate main model performance metrics such as absolute mean square error, root mean square error or R-squared based on resampling method. Note that functions' argument model accepts "lm" and "glm" class but for "glm" only "quasibinomial("logit")" family will be considered.

Usage

kfold.vld(model, k = 10, seed = 1984)

Arguments

model

Model in use, an object of class inheriting from "lm"

k

Number of folds. If k is equal or greater than the number of observations of modeling data frame, then validation procedure is equivalent to leave one out cross-validation (LOOCV) method. For LOOCV, R-squared is not calculated. Default is set to 10.

seed

Random seed needed for ensuring the result reproducibility. Default is 1984.

Value

The command kfold.vld returns a list of two objects.
The first object (iter), returns iteration performance metrics.
The second object (summary), is the data frame of iterations averages of performance metrics.

Examples

library(monobin)
library(LGDtoolkit)
data(lgd.ds.c)
#discretized some risk factors
num.rf <- c("rf_01", "rf_02", "rf_03", "rf_09", "rf_16")
for	(i in 1:length(num.rf)) {
num.rf.l <- num.rf[i]
lgd.ds.c[, num.rf.l] <- sts.bin(x = lgd.ds.c[, num.rf.l], y = lgd.ds.c[, "lgd"])[[2]]	
}
str(lgd.ds.c)
#run linear regression model
reg.mod.1 <- lm(lgd ~ ., data = lgd.ds.c[, c(num.rf, "lgd")])
summary(reg.mod.1)$coefficients
#perform k-fold validation
LGDtoolkit::kfold.vld(model = reg.mod.1 , k = 10, seed = 1984)
#run fractional logistic regression model
lgd.ds.c$lgd[lgd.ds.c$lgd > 1] <- 1
reg.mod.2 <- glm(lgd ~ ., family = quasibinomial("logit"), data = lgd.ds.c[, c(num.rf, "lgd")])
summary(reg.mod.2)$coefficients
LGDtoolkit::kfold.vld(model = reg.mod.2 , k = 10, seed = 1984)

[Package LGDtoolkit version 0.2.0 Index]