t.test {MachineShop} | R Documentation |
Paired t-Tests for Model Comparisons
Description
Paired t-test comparisons of resampled performance metrics from different models.
Usage
## S3 method for class 'PerformanceDiff'
t.test(x, adjust = "holm", ...)
Arguments
x | performance difference result.
adjust | p-value adjustment for multiple statistical comparisons as implemented by p.adjust.
... | arguments passed to other methods.
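As a rough illustration of what the adjust argument controls, the sketch below applies the Holm adjustment to three p-values with p.adjust from the base R stats package. The p-values are made up for illustration and are not output from MachineShop.

```r
# Hypothetical raw p-values from three pairwise model comparisons
p_raw <- c(0.01, 0.04, 0.03)

# Holm step-down adjustment, matching the default adjust = "holm"
p_holm <- p.adjust(p_raw, method = "holm")

# No adjustment, for comparison
p_none <- p.adjust(p_raw, method = "none")

print(p_holm)  # 0.03 0.06 0.06
```

The adjusted values are never smaller than the raw ones, so a comparison that looks significant on its own may not remain so once all pairs are considered together.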
Details
The t-test statistic for pairwise model differences of R resampled performance
metric values is calculated as
t = \frac{\bar{x}_R}{\sqrt{F s^2_R / R}},
where \bar{x}_R and s^2_R are the sample mean and variance.
Statistical testing for a mean difference is then performed by comparing t to
a t_{R-1} null distribution. The sample variance in the t statistic is known
to underestimate the true variances of cross-validation mean estimators.
Underestimation of these variances will lead to increased probabilities of
false-positive statistical conclusions. Thus, an additional factor F is
included in the t statistic to allow for variance corrections. A correction of
F = 1 + K / (K - 1) was found by Nadeau and Bengio (2003) to be a good choice
for cross-validation with K folds and is thus used for that resampling method.
The extension of this correction by Bouckaert and Frank (2004) to
F = 1 + T K / (K - 1) is used for cross-validation with K folds repeated T
times. For other resampling methods, F = 1.
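The calculation described above can be sketched in base R as follows. The function corrected_t, its arguments folds and repeats (standing in for K and T), and the difference values are hypothetical and are not part of MachineShop. The two-sided p-value is also an assumption, since the text states only that t is compared to a t distribution with R - 1 degrees of freedom.

```r
# Hypothetical helper (not a MachineShop function): corrected paired t-test
# for a vector d of per-resample differences in a metric between two models.
corrected_t <- function(d, folds = NULL, repeats = 1) {
  R <- length(d)
  if (is.null(folds)) {
    correction <- 1                      # F = 1 for other resampling methods
  } else {
    stopifnot(R == folds * repeats)      # one difference per fold and repeat
    correction <- 1 + repeats * folds / (folds - 1)  # F = 1 + T K / (K - 1)
  }
  t_stat <- mean(d) / sqrt(correction * var(d) / R)
  # Assumed two-sided p-value from the t distribution with R - 1 df
  p_value <- 2 * pt(-abs(t_stat), df = R - 1)
  list(correction = correction, statistic = t_stat, p.value = p_value)
}

# Made-up differences from a single run of 5-fold cross-validation
d <- c(0.5, 1.5, 1.0, 2.0, 0.0)
plain <- corrected_t(d)                  # F = 1
cv <- corrected_t(d, folds = 5)          # F = 1 + 5 / 4 = 2.25
cv$statistic / plain$statistic           # 1 / sqrt(2.25) = 0.6666667
```

Because F is greater than 1 for cross-validation, the corrected statistic is smaller in magnitude and its p-value larger than in the uncorrected test, which is the intended guard against false positives.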
Value
PerformanceDiffTest class object that inherits from array. p-values and mean
differences are contained in the lower and upper triangular portions,
respectively, of the first two dimensions. Model pairs are contained in the
third dimension.
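To make the triangular layout concrete, the sketch below builds a plain matrix that stands in for the first two dimensions of such an object and pulls the two triangles apart with base R's lower.tri and upper.tri. The model names and numbers are invented, and the matrix is only a stand-in, not an actual PerformanceDiffTest object.

```r
# Hypothetical stand-in for the first two dimensions of the result
models <- c("GBM1", "GBM2", "GBM3")
mat <- matrix(NA_real_, 3, 3, dimnames = list(models, models))

# Upper triangle: made-up mean differences, filled in column-major order
mat[upper.tri(mat)] <- c(-0.8, -1.5, -0.7)
# Lower triangle: made-up p-values, filled in column-major order
mat[lower.tri(mat)] <- c(0.04, 0.01, 0.20)

# Extract each triangle separately
mean_diffs <- mat[upper.tri(mat)]
p_values <- mat[lower.tri(mat)]

# Row and column indices of the pairs with p-values below 0.05
sig_pairs <- which(lower.tri(mat) & mat < 0.05, arr.ind = TRUE)
```

The diagonal is left as NA here because a model is not compared with itself.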
References
Nadeau, C., & Bengio, Y. (2003). Inference for the generalization error. Machine Learning, 52, 239–81.
Bouckaert, R. R., & Frank, E. (2004). Evaluating the replicability of significance tests for comparing learning algorithms. In H. Dai, R. Srikant, & C. Zhang (Eds.), Advances in knowledge discovery and data mining (pp. 3–12). Springer.
Examples
## Requires prior installation of suggested package gbm to run
library(MachineShop)
## Numeric response example
fo <- sale_amount ~ .
control <- CVControl()
## Resample three GBM models of increasing size with the same control settings
gbm_res1 <- resample(fo, ICHomes, GBMModel(n.trees = 25), control)
gbm_res2 <- resample(fo, ICHomes, GBMModel(n.trees = 50), control)
gbm_res3 <- resample(fo, ICHomes, GBMModel(n.trees = 100), control)
## Combine the results, compute pairwise differences, and test them
res <- c(GBM1 = gbm_res1, GBM2 = gbm_res2, GBM3 = gbm_res3)
res_diff <- diff(res)
t.test(res_diff)