t.test {MachineShop}    R Documentation
Paired t-Tests for Model Comparisons
Description
Paired t-test comparisons of resampled performance metrics from different models.
Usage
## S3 method for class 'PerformanceDiff'
t.test(x, adjust = "holm", ...)
Arguments
x
performance difference result.
adjust
p-value adjustment for multiple statistical comparisons as
implemented by p.adjust.
...
arguments passed to other methods.
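As a brief illustration of what the adjust argument controls, the sketch below applies several adjustment methods to a made-up vector of p-values with stats::p.adjust. It assumes that adjust is forwarded as the method of p.adjust; the p-values are hypothetical and do not come from any model comparison.

```r
# Hypothetical raw p-values from three pairwise model comparisons
p_raw <- c(0.010, 0.020, 0.040)

# Holm adjustment, matching the default adjust = "holm"
p_holm <- p.adjust(p_raw, method = "holm")

# Two other methods accepted by p.adjust(), shown for comparison
p_bonf <- p.adjust(p_raw, method = "bonferroni")
p_none <- p.adjust(p_raw, method = "none")

print(rbind(raw = p_raw, holm = p_holm, bonferroni = p_bonf, none = p_none))
```

The Holm method is never more conservative than Bonferroni, so its adjusted p-values are never larger.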
Details
The t-test statistic for pairwise model differences of R resampled
performance metric values is calculated as

    t = \bar{x}_R / \sqrt{F s^2_R / R},

where \bar{x}_R and s^2_R are the sample mean and variance.
Statistical testing for a mean difference is then performed by comparing
t to a t_{R-1} null distribution. The sample variance in the
t statistic is known to underestimate the true variances of cross-validation
mean estimators. Underestimation of these variances will lead to increased
probabilities of false-positive statistical conclusions. Thus, an additional
factor F is included in the t statistic to allow for variance
corrections. A correction of F = 1 + K / (K - 1) was found by
Nadeau and Bengio (2003) to be a good choice for cross-validation with
K folds and is thus used for that resampling method. The extension of
this correction by Bouckaert and Frank (2004) to F = 1 + T K / (K - 1)
is used for cross-validation with K folds repeated T times. For
other resampling methods F = 1.
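The calculation described above can be sketched in base R. This is an illustrative reimplementation, not the package's internal code. The function name corrected_t_test and its arguments folds and repeats are hypothetical, and the correction factor is assumed to take the form 1 + repeats * folds / (folds - 1) for cross-validation and 1 otherwise.

```r
# Illustrative variance-corrected paired t-test (not MachineShop internals).
# x:       differences in a performance metric between two models,
#          one value per resample
# folds:   number of cross-validation folds, or NULL for other resampling
# repeats: number of times the cross-validation is repeated
corrected_t_test <- function(x, folds = NULL, repeats = 1) {
  n <- length(x)
  # Variance correction factor; 1 means no correction
  vf <- if (is.null(folds)) 1 else 1 + repeats * folds / (folds - 1)
  t_stat <- mean(x) / sqrt(vf * var(x) / n)
  # Two-sided p-value from a t distribution with n - 1 degrees of freedom
  p_value <- 2 * pt(abs(t_stat), df = n - 1, lower.tail = FALSE)
  list(statistic = t_stat, df = n - 1, p.value = p_value, factor = vf)
}

set.seed(123)
x <- rnorm(10, mean = 0.5)  # simulated differences, as if from 10 folds

uncorrected <- corrected_t_test(x)             # ordinary one-sample t-test
corrected <- corrected_t_test(x, folds = 10)   # factor 1 + 10 / 9

print(c(uncorrected = uncorrected$p.value, corrected = corrected$p.value))
```

With a factor of 1 the result matches an ordinary one-sample t-test on the differences. A factor greater than 1 shrinks the statistic toward zero and so yields a larger p-value.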
Value
PerformanceDiffTest class object that inherits from array. p-values and
mean differences are contained in the lower and upper triangular portions,
respectively, of the first two dimensions. Model pairs are contained in the
third dimension.
References
Nadeau, C., & Bengio, Y. (2003). Inference for the generalization error. Machine Learning, 52, 239–81.
Bouckaert, R. R., & Frank, E. (2004). Evaluating the replicability of significance tests for comparing learning algorithms. In H. Dai, R. Srikant, & C. Zhang (Eds.), Advances in knowledge discovery and data mining (pp. 3–12). Springer.
Examples
## Requires prior installation of suggested package gbm to run
library(MachineShop)

## Numeric response example
fo <- sale_amount ~ .
control <- CVControl()

## Resample three GBM models that differ only in the number of trees
gbm_res1 <- resample(fo, ICHomes, GBMModel(n.trees = 25), control)
gbm_res2 <- resample(fo, ICHomes, GBMModel(n.trees = 50), control)
gbm_res3 <- resample(fo, ICHomes, GBMModel(n.trees = 100), control)

## Combine the resample results, compute pairwise differences, and test them
res <- c(GBM1 = gbm_res1, GBM2 = gbm_res2, GBM3 = gbm_res3)
res_diff <- diff(res)
t.test(res_diff)