kfold.vld {LGDtoolkit} | R Documentation |
K-fold model cross-validation
Description
kfold.vld performs k-fold model cross-validation.
The main goal of this procedure is to generate the main model performance metrics, such as absolute mean square error, root mean square error, or R-squared, based on a resampling method. Note that the function's argument model accepts objects of class "lm" and "glm", but for "glm" only the quasibinomial("logit") family is considered.
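The resampling idea described above can be sketched with base R alone. This is only an illustration of the general k-fold technique, not the internal code of kfold.vld: the built-in mtcars data, the model formula, the object names (db, fold, cv.iter) and the exact metric definitions (mean absolute error, RMSE, out-of-fold R-squared) are all assumptions made for the example and may differ from what the package computes.

```r
# Illustrative k-fold cross-validation for a linear model (base R only).
# Assumption: this mirrors the general technique, not kfold.vld's internals.
set.seed(1984)
k  <- 5
db <- mtcars[, c("mpg", "wt", "hp")]

# randomly assign each row to one of k folds of (nearly) equal size
fold <- sample(rep(seq_len(k), length.out = nrow(db)))

cv.iter <- lapply(seq_len(k), function(i) {
  train <- db[fold != i, ]                   # k - 1 folds used for fitting
  test  <- db[fold == i, ]                   # held-out fold used for scoring
  fit   <- lm(mpg ~ wt + hp, data = train)
  err   <- test$mpg - predict(fit, newdata = test)
  data.frame(fold      = i,
             no        = nrow(test),
             mae       = mean(abs(err)),
             rmse      = sqrt(mean(err^2)),
             r.squared = 1 - sum(err^2) / sum((test$mpg - mean(test$mpg))^2))
})
cv.iter <- do.call(rbind, cv.iter)           # one row of metrics per fold
cv.iter
```

Each row is scored only on observations the model did not see during fitting, which is why these metrics are usually less optimistic than the in-sample ones reported by summary().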
Usage
kfold.vld(model, k = 10, seed = 1984)
Arguments
model |
Model in use, an object of class inheriting from |
k |
Number of folds. If |
seed |
Random seed used to ensure reproducibility of the results. Default is 1984. |
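To see why a fixed seed matters, the sketch below assigns observations to folds twice with the same seed and once with a different one. The helper assign.folds is hypothetical and not part of LGDtoolkit; it only assumes that folds are drawn by random sampling, which is the usual approach.

```r
# Hypothetical helper (not part of LGDtoolkit): assign n rows to k folds.
assign.folds <- function(n, k = 10, seed = 1984) {
  set.seed(seed)                             # fix the random number stream
  sample(rep(seq_len(k), length.out = n))    # shuffled fold labels 1..k
}

f1 <- assign.folds(n = 100)
f2 <- assign.folds(n = 100)
f3 <- assign.folds(n = 100, seed = 2024)     # a different seed reshuffles the labels

identical(f1, f2)   # TRUE: same seed, same fold assignment
table(f1)           # each of the 10 folds receives 10 rows
```

With the same seed, repeated runs split the data identically, so the resulting performance metrics can be reproduced and compared across models.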
Value
The command kfold.vld returns a list of two objects.
The first object (iter) returns iteration performance metrics.
The second object (summary) is the data frame of iteration averages of the performance metrics.
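The relationship between the two list elements can be illustrated with a small mock-up. Only the element names iter and summary come from the description above; the column names, the values, and the use of a simple unweighted mean are assumptions made purely for illustration.

```r
# Hypothetical per-iteration metrics; column names and values are made up.
iter <- data.frame(k         = 1:4,
                   rmse      = c(0.20, 0.24, 0.22, 0.18),
                   r.squared = c(0.30, 0.26, 0.28, 0.32))

# average each metric across iterations to obtain a one-row summary
smry <- data.frame(t(colMeans(iter[, c("rmse", "r.squared")])))

res <- list(iter = iter, summary = smry)
res$iter      # one row per fold
res$summary   # one row of averages
```

Inspecting the per-iteration table alongside the averages helps to spot folds where performance deviates strongly from the mean.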
Examples
library(monobin)
library(LGDtoolkit)
data(lgd.ds.c)
#discretize some risk factors
num.rf <- c("rf_01", "rf_02", "rf_03", "rf_09", "rf_16")
for (i in seq_along(num.rf)) {
  num.rf.l <- num.rf[i]
  #replace the numeric risk factor with its binned version
  lgd.ds.c[, num.rf.l] <- sts.bin(x = lgd.ds.c[, num.rf.l], y = lgd.ds.c[, "lgd"])[[2]]
}
str(lgd.ds.c)
#run linear regression model
reg.mod.1 <- lm(lgd ~ ., data = lgd.ds.c[, c(num.rf, "lgd")])
summary(reg.mod.1)$coefficients
#perform k-fold validation
LGDtoolkit::kfold.vld(model = reg.mod.1, k = 10, seed = 1984)
#run fractional logistic regression model
#cap lgd values above 1 at 1
lgd.ds.c$lgd[lgd.ds.c$lgd > 1] <- 1
reg.mod.2 <- glm(lgd ~ ., family = quasibinomial("logit"), data = lgd.ds.c[, c(num.rf, "lgd")])
summary(reg.mod.2)$coefficients
#perform k-fold validation
LGDtoolkit::kfold.vld(model = reg.mod.2, k = 10, seed = 1984)