rejSamp {sambia}    R Documentation

Rejection sampling is a method used in sambia's function 'costing' (Krautenbacher et al., 2017).

Description

Rejection sampling is a method used in sambia's costing function. It is a sampling scheme that allows us to draw examples independently from a distribution X, given examples drawn independently from a distribution Y.
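
To illustrate the idea, the following base-R sketch shows one common form of rejection sampling: row i is kept with probability weights[i] / max(weights). This is an assumption about the general technique and not the code of rejSamp itself; the function name rej_samp_sketch and the toy data are hypothetical.

```r
# Hypothetical helper (NOT sambia's rejSamp): keeps row i with
# probability weights[i] / max(weights).
rej_samp_sketch <- function(data, weights) {
  stopifnot(is.data.frame(data),
            length(weights) == nrow(data),
            all(weights > 0))
  accept_prob <- weights / max(weights)       # scale weights into (0, 1]
  keep <- runif(nrow(data)) <= accept_prob    # one independent draw per row
  data[keep, , drop = FALSE]
}

# Toy data (assumed): 160 rows with result 0 and 40 rows with result 1.
set.seed(1)
toy <- data.frame(x = rnorm(200),
                  result = rep(c(0, 1), times = c(160, 40)))
w <- ifelse(toy$result == 1, 1, 4)            # same weighting as in the Examples
accepted <- rej_samp_sketch(toy, w)
table(accepted$result)
```

Rows carrying the largest weight are always accepted, while rows with smaller weights are thinned in proportion to their weight, so the accepted rows reflect the weighted distribution without replicating any observation.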

Usage

rejSamp(data, weights)

Arguments

data

a data frame containing the observations row-wise, along with their corresponding categorical strata feature

weights

a numeric vector whose length must equal the number of rows of data. The i-th value contains the inverse-probability weight of the i-th observation, i.e. it determines how often the i-th observation of data shall be replicated.
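
As a small illustration of how such a vector might be built, the sketch below derives inverse-probability weights from per-stratum selection probabilities. The strata labels and probabilities are assumed values chosen for illustration and do not come from the package.

```r
# Hypothetical strata labels, one per observation (assumed for illustration).
strata <- c("a", "a", "b", "c", "b")

# Hypothetical probability of being selected into the sample, per stratum.
sel_prob <- c(a = 0.5, b = 0.25, c = 1)

# Inverse-probability weight of each observation: 1 / P(selected | stratum).
weights <- unname(1 / sel_prob[strata])
weights
```

An observation from a stratum sampled with probability 0.25 stands in for four observations of the underlying population, hence its weight of 4.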

Author(s)

Norbert Krautenbacher, Kevin Strauss, Maximilian Mandl, Christiane Fuchs

References

Krautenbacher, N., Theis, F. J., & Fuchs, C. (2017). Correcting Classifiers for Sample Selection Bias in Two-Phase Case-Control Studies. Computational and Mathematical Methods in Medicine, 2017.

Examples

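library(smotefamily)  # provides sample_generator()
library(sambia)
# generate an example data set with a 'result' column coded 'n'/'p'
data.example <- sample_generator(100, ratio = 0.80)
# recode the outcome: 'n' -> 0, 'p' -> 1
result <- gsub('n', '0', data.example[, 'result'])
result <- gsub('p', '1', result)
data.example[, 'result'] <- as.numeric(result)
# weights: 1 for observations with result == 1, 4 otherwise
weights <- ifelse(data.example[, 'result'] == 1, 1, 4)
rej.sample <- sambia:::rejSamp(data = data.example, weights = weights)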

[Package sambia version 0.1.0 Index]