pt_oob {dsos}  R Documentation 
Permutation Test With Out-Of-Bag Scores
Description
Test for no adverse shift with outlier scores. Like goodness-of-fit testing,
this two-sample comparison takes the training set, x_train, as the reference.
The method checks whether the test set, x_test, is worse off relative to this
reference set. The function scorer assigns an outlier score to each
instance/observation in both the training and test sets.
Usage
pt_oob(x_train, x_test, scorer, n_pt = 2000)
Arguments
x_train 
Training (reference/validation) sample. 
x_test 
Test sample. 
scorer 
Function which returns a named list with outlier scores from
the training and test sample. The first argument to scorer is the training
sample and the second is the test sample, as in the Examples section.
n_pt 
The number of permutations. 
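As an illustration of the interface expected of scorer, the sketch below returns a named list with train and test elements. The function maha_scorer is hypothetical and not part of dsos; it scores observations by squared Mahalanobis distance from the training centre, and its training scores are in-sample rather than out-of-bag.

```r
set.seed(12345)
data(iris)

# Hypothetical scorer (an assumption, not part of dsos): squared Mahalanobis
# distance from the training centre. Larger values are more outlying.
maha_scorer <- function(x_train, x_test) {
  # Keep numeric columns only; iris has one factor column (Species)
  num_cols <- vapply(x_train, is.numeric, logical(1))
  tr <- as.matrix(x_train[, num_cols, drop = FALSE])
  te <- as.matrix(x_test[, num_cols, drop = FALSE])
  center <- colMeans(tr)
  covariance <- stats::cov(tr)
  # Named list with one score vector per sample
  list(
    train = stats::mahalanobis(tr, center, covariance),
    test = stats::mahalanobis(te, center, covariance)
  )
}

idx <- sample(nrow(iris), 100)
scores <- maha_scorer(iris[idx, ], iris[-idx, ])
str(scores)
```

A function of this shape could be passed as the scorer argument, although a scorer that predicts out-of-bag scores is closer to what the Notes section describes.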
Details
The null distribution of the test statistic is based on n_pt
permutations. For speed, this is implemented as a sequential Monte Carlo test
with the simctest package. See Gandy (2009) for details. The prefix
pt refers to permutation test. This approach does not use the
asymptotic null distribution for the test statistic. This is the recommended
approach for small samples. The test statistic is the weighted AUC (WAUC).
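The permutation logic can be sketched in base R. This is a simplified illustration under stated assumptions, not the package's implementation: it runs a fixed number of permutations rather than a sequential Monte Carlo test, and it uses a difference in mean scores as a stand-in for the WAUC. The names perm_p_value and stat_fn are hypothetical.

```r
set.seed(1)

# Hypothetical outlier scores (higher = more outlying); not produced by dsos
os_train <- runif(100)
os_test <- runif(50)

# Stand-in statistic (an assumption): difference in mean scores.
# The package itself uses the weighted AUC (WAUC) instead.
stat_fn <- function(train, test) mean(test) - mean(train)

perm_p_value <- function(os_train, os_test, stat_fn, n_pt = 2000) {
  observed <- stat_fn(os_train, os_test)
  pooled <- c(os_train, os_test)
  train_pos <- seq_len(length(os_train))
  # Shuffle the pooled scores and recompute the statistic n_pt times
  null_stats <- replicate(n_pt, {
    shuffled <- sample(pooled)
    stat_fn(shuffled[train_pos], shuffled[-train_pos])
  })
  # Add-one correction keeps the estimated p-value strictly positive
  (1 + sum(null_stats >= observed)) / (1 + n_pt)
}

p <- perm_p_value(os_train, os_test, stat_fn, n_pt = 500)
p
```

Only the scores are reshuffled here; no model is refit, which is the same economy the out-of-bag variant relies on.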
Value
A named list of class outlier.test containing:

statistic: observed WAUC statistic
seq_mct: sequential Monte Carlo test, when applicable
p_value: p-value
outlier_scores: outlier scores from training and test set
Notes
The scoring function, scorer, predicts out-of-bag scores to mimic
out-of-sample behaviour. The suffix oob stands for out-of-bag to
highlight this point. This out-of-bag variant avoids refitting the
underlying algorithm from scorer at every permutation. It can, as a
result, be computationally appealing.
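To make the out-of-bag idea concrete, here is a generic bagging sketch in base R. It is an assumption-laden illustration, not how dsos or any particular scorer computes its scores: the hypothetical oob_scores function fits a simple Mahalanobis model on each bootstrap resample and scores only the observations left out of that resample, so every score comes from fits that never saw the observation.

```r
set.seed(42)
data(iris)

# Hypothetical helper (not part of dsos): average out-of-bag Mahalanobis score
oob_scores <- function(x, n_bags = 50) {
  x <- as.matrix(x)
  n <- nrow(x)
  total <- numeric(n)
  count <- integer(n)
  for (b in seq_len(n_bags)) {
    in_bag <- sample(n, replace = TRUE)
    out_bag <- setdiff(seq_len(n), in_bag)
    # Fit centre and covariance on the in-bag rows only
    fit_center <- colMeans(x[in_bag, , drop = FALSE])
    fit_cov <- stats::cov(x[in_bag, , drop = FALSE])
    # Score only the rows this fit has not seen
    total[out_bag] <- total[out_bag] +
      stats::mahalanobis(x[out_bag, , drop = FALSE], fit_center, fit_cov)
    count[out_bag] <- count[out_bag] + 1L
  }
  # An observation that is never out-of-bag would yield NaN here
  total / count
}

s <- oob_scores(iris[, 1:4])
summary(s)
```

Because each score already behaves like an out-of-sample prediction, the scores can be computed once and then permuted, instead of refitting the model for every permutation.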
References
Kamulete, V. M. (2022). Test for non-negligible adverse shifts. In The 38th Conference on Uncertainty in Artificial Intelligence. PMLR.
Gandy, A. (2009). Sequential implementation of Monte Carlo tests with uniformly bounded resampling risk. Journal of the American Statistical Association, 104(488), 1504-1511.
See Also
[pt_refit()] for (slower) p-value approximation via refitting. [at_oob()] for p-value approximation from asymptotic null distribution.
Other permutation-test: pt_from_os(), pt_refit()
Examples
library(dsos)
set.seed(12345)
data(iris)
idx <- sample(nrow(iris), 2 / 3 * nrow(iris))
iris_train <- iris[idx, ]
iris_test <- iris[-idx, ]
# Use a synthetic (fake) scoring function for illustration
scorer <- function(tr, te) list(train = runif(nrow(tr)), test = runif(nrow(te)))
pt_test <- pt_oob(iris_train, iris_test, scorer = scorer)
pt_test