cont_conv {cctools}    R Documentation

Continuous convolution

Description

Applies the continuous convolution trick, i.e., adds continuous noise to all discrete variables. To have a variable treated as discrete, declare it as ordered(); the input is passed to expand_as_numeric().

Usage

cont_conv(x, theta = 0, nu = 5, quasi = TRUE)

Arguments

x

data; numeric matrix or data frame.

theta

scale parameter of the USB distribution (see dusb()).

nu

smoothness parameter of the USB distribution (see dusb()). The estimator uses the Epanechnikov kernel for smoothing and the USB distribution for continuous convolution (the default parameters correspond to the U[-0.5, 0.5] distribution).

quasi

logical indicating whether quasi-random numbers should be used (qrng::ghalton()); only works for theta = 0.
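
For theta = 0 the noise is uniform, so pseudo-random draws can in principle be replaced by a low-discrepancy sequence. The following is only a sketch of that idea, not the package's internal code. It assumes that shifting qrng::ghalton() points by 0.5 is a reasonable stand-in for the noise, and it falls back to runif() when the qrng package is not installed. The names n and noise are hypothetical.

```r
n <- 100  # hypothetical number of observations

if (requireNamespace("qrng", quietly = TRUE)) {
  # quasi-random points in the unit interval, shifted to be centred at zero
  noise <- as.numeric(qrng::ghalton(n, d = 1)) - 0.5
} else {
  # fallback: ordinary pseudo-random uniform noise
  noise <- runif(n) - 0.5
}

range(noise)  # stays within [-0.5, 0.5]
```

Quasi-random points cover the interval more evenly than pseudo-random ones, which may reduce the variability that the added noise introduces.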

Details

The USB distribution (dusb()) is used as the noise distribution. Discrete variables are assumed to be integer-valued.
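
With the default parameters the noise is U[-0.5, 0.5], so the trick can be imitated in base R without cctools. The following is a minimal sketch under that assumption, not the package's implementation; x_disc, noise and x_cc are hypothetical names. Because the noise stays within half a unit, rounding recovers the original integers, which is why integer-valued data are assumed.

```r
set.seed(1)
n <- 200

x_disc <- rpois(n, lambda = 2)            # integer-valued (discrete) variable
noise  <- runif(n, min = -0.5, max = 0.5) # uniform noise on (-0.5, 0.5)
x_cc   <- x_disc + noise                  # continuously convoluted variable

# rounding the convoluted values gives back the original integers
all(round(x_cc) == x_disc)
```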

Value

A data frame with noise added to each discrete variable (ordered columns).
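
The shape of the return value can be illustrated with a base R stand-in: ordered columns become noisy numeric columns, and continuous columns pass through unchanged. This sketch is not cont_conv() itself. It assumes uniform noise on (-0.5, 0.5) and integer level labels, and it does not reproduce how expand_as_numeric() converts ordered variables or handles unordered factors. The names dat, is_disc and out are hypothetical.

```r
set.seed(42)
dat <- data.frame(
  Z = ordered(rbinom(20, 5, 0.5), levels = 0:5),  # discrete variable
  X = rnorm(20)                                   # continuous variable
)

# columns declared as ordered() are the ones treated as discrete
is_disc <- vapply(dat, is.ordered, logical(1))

out <- dat
for (nm in names(dat)[is_disc]) {
  # assumption: level labels are integers, so convert via the labels
  z <- as.numeric(as.character(dat[[nm]]))
  out[[nm]] <- z + runif(nrow(dat), -0.5, 0.5)
}

str(out)  # Z is now numeric with noise; X is unchanged
```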

References

Nagler, T. (2017). A generic approach to nonparametric function estimation with mixed data. arXiv:1704.07457

Examples

# dummy data with discrete variables
dat <- data.frame(
    F1 = factor(rbinom(10, 4, 0.1), 0:4),
    Z1 = ordered(rbinom(10, 5, 0.5), 0:5),
    Z2 = ordered(rpois(10, 1), 0:10),
    X1 = rnorm(10),
    X2 = rexp(10)
)

pairs(dat)
pairs(expand_as_numeric(dat))  # expanded variables without noise
pairs(cont_conv(dat))          # continuously convoluted data


[Package cctools version 0.1.2 Index]