learningcurve_data {planningML} | R Documentation |
Generate learning curve data for a binary classifier
Description
Generate the data for a learning curve: training sample sizes and the corresponding performance measurements (MCC or AUC) for the chosen training method, estimated with repeated cross-validation.
Usage
learningcurve_data(
  x,
  y,
  method = "log",
  metric = "MCC",
  batchsize = 60,
  class.prob,
  pct.train = 0.8,
  nfold = 5,
  nrepeat = 10
)
Arguments
x |
a matrix of predictor variables |
y |
a vector of binary outcomes, encoded as a factor, with 1 denoting events and 0 denoting non-events |
method |
training method used to obtain the performance measurements. Available options are "log" (logistic regression, default), "regul.log" (regularized logistic regression), "svm" (support vector machine), "rf" (random forest) and "lda" (linear discriminant analysis) |
metric |
the performance metric to optimize. Available options are "MCC" (Matthews correlation coefficient, default) and "AUC" (area under the ROC curve) |
batchsize |
sample size for each training batch. Default is 60 |
class.prob |
probability of the event |
pct.train |
the proportion of the data used for training. Default is 0.8 |
nfold |
number of folds in cross-validation. Default is 5 |
nrepeat |
number of repeats of the cross-validation. Default is 10 |
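For readers unfamiliar with the default metric, the sketch below shows how MCC can be computed from observed and predicted 0/1 labels in base R. The helper name mcc() and the toy vectors are hypothetical and used only for illustration; this is not necessarily how planningML computes the metric internally.

```r
# Matthews correlation coefficient from observed and predicted 0/1 labels.
# mcc() is an illustrative helper, not a function exported by planningML.
mcc <- function(actual, predicted) {
  # Accept factors or numeric vectors coded as 0/1
  actual <- as.integer(as.character(actual))
  predicted <- as.integer(as.character(predicted))

  tp <- sum(actual == 1 & predicted == 1)  # true positives
  tn <- sum(actual == 0 & predicted == 0)  # true negatives
  fp <- sum(actual == 0 & predicted == 1)  # false positives
  fn <- sum(actual == 1 & predicted == 0)  # false negatives

  # as.numeric() avoids integer overflow in the product for large counts
  denom <- sqrt(as.numeric(tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
  if (denom == 0) {
    return(0)  # common convention when a row or column of the table is empty
  }
  (as.numeric(tp) * tn - as.numeric(fp) * fn) / denom
}

# Toy data: 3 true positives, 3 true negatives, 1 false positive, 1 false negative
actual    <- factor(c(1, 1, 1, 0, 0, 0, 0, 1), levels = c(0, 1))
predicted <- factor(c(1, 1, 0, 0, 0, 1, 0, 1), levels = c(0, 1))
mcc_value <- mcc(actual, predicted)
print(mcc_value)  # 0.5
```

MCC ranges from -1 (total disagreement) to 1 (perfect prediction), with 0 corresponding to chance-level prediction, which makes it informative when events are rare.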
Value
learningcurve_data() returns a data frame of sample sizes and the corresponding performance measurements.
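As a rough guide to preparing inputs in the shape the Arguments section describes, the sketch below builds a predictor matrix and a 0/1 factor outcome from simulated data. The data frame dat and its columns are hypothetical. The commented call at the end mirrors the Usage block, but the value passed as class.prob (the empirical event rate) is an assumption, since the page only describes that argument as the probability of the event.

```r
set.seed(1)

# Hypothetical pilot data: two numeric predictors and a binary event indicator
n <- 200
dat <- data.frame(
  age   = rnorm(n, mean = 50, sd = 10),
  bmi   = rnorm(n, mean = 27, sd = 4),
  event = rbinom(n, size = 1, prob = 0.3)
)

# x: a matrix of predictor variables
x <- as.matrix(dat[, c("age", "bmi")])

# y: a factor with levels 0 (non-event) and 1 (event)
y <- factor(dat$event, levels = c(0, 1))

# Empirical event rate; using it as class.prob is an assumption
event_rate <- mean(y == "1")

# With planningML installed, a call following the Usage block might look like
# this (not run here):
# lc <- planningML::learningcurve_data(
#   x, y,
#   method = "log", metric = "MCC", batchsize = 60,
#   class.prob = event_rate,
#   pct.train = 0.8, nfold = 5, nrepeat = 10
# )
```

Setting the factor levels explicitly keeps the 0/1 coding stable even if a subsample happens to contain only one class.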