R: Distribution Matching for Source and Reference Datasets

dist_match {DMTL}

R Documentation

Distribution Matching for Source and Reference Datasets

Description

This function matches a source distribution to a given reference distribution so that data in the source space can be transferred to the reference space, i.e., domain transfer via distribution matching.
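
The idea behind distribution matching can be illustrated with a minimal base R sketch. This is not the DMTL implementation: `match_quantiles()` is a hypothetical helper introduced here for illustration only, and it uses plain empirical CDFs rather than the histogram or kernel density estimates described under Arguments. Each source value is mapped to its probability level under the source distribution, then to the reference value at that same level.

```r
# Hypothetical helper, not part of DMTL: empirical quantile matching.
match_quantiles <- function(src, ref) {
  # Step 1: probability level of each source value under the source ECDF
  u <- ecdf(src)(src)
  # Step 2: reference value at the same probability level
  # (type = 1 inverts the empirical CDF of ref, returning observed values)
  quantile(ref, probs = u, type = 1, names = FALSE)
}

set.seed(1)
src <- rnorm(100, mean = 0.2, sd = 0.6)
ref <- runif(200)
out <- match_quantiles(src, ref)

length(out) == length(src)   # one matched value per source value
range(out)                   # lies within range(ref)
```

Because both steps are monotone, the mapping preserves the rank order of the source values while moving them onto the scale of the reference data.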

Usage

dist_match(
  src,
  ref,
  src_cdf,
  ref_cdf,
  lims,
  density = FALSE,
  samples = 1e+06,
  seed = NULL
)

Arguments

`src`	Vector containing the source data to be matched.
`ref`	Vector containing the reference data to estimate the reference distribution for matching.
`src_cdf`	Vector containing source distribution values. If missing, these values are estimated from the source data using `estimate_cdf()`.
`ref_cdf`	Vector containing reference distribution values. If missing, these values are estimated from the reference data using `estimate_cdf()`.
`lims`	Vector providing the range of the knot values for mapping. If missing, these values are estimated from the reference data.
`density`	Flag for using kernel density estimates for matching instead of histogram counts. Defaults to `FALSE`.
`samples`	Sample size for estimating distributions if `src_cdf` and/or `ref_cdf` are missing. Defaults to `1e6`.
`seed`	Seed for random number generator (for reproducible outcomes). Defaults to `NULL`.

Value

A vector containing the matched values corresponding to `src`.

Examples

set.seed(7531)
x1 <- rnorm(100, 0.2, 0.6)
x2 <- runif(200)
matched <- dist_match(src = x1, ref = x2, lims = c(0, 1))

## Plot histograms...
opar <- par(mfrow = c(1, 3))
hist(x1);    hist(x2);    hist(matched)
par(opar)              # Reset par
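
## A numeric check can complement the histograms. The sketch below assumes
## the DMTL package is installed (it is skipped otherwise) and uses only the
## arguments documented above. It compares a few quantiles of the source,
## reference, and matched values; the rank correlation is expected to be
## high if the mapping is monotone, which is an assumption and not a
## documented guarantee.

```r
if (requireNamespace("DMTL", quietly = TRUE)) {
  set.seed(7531)
  x1 <- rnorm(100, 0.2, 0.6)
  x2 <- runif(200)
  matched <- DMTL::dist_match(src = x1, ref = x2, lims = c(0, 1), seed = 7531)

  # Side-by-side quantiles: matched values should resemble the reference
  probs <- c(0.1, 0.25, 0.5, 0.75, 0.9)
  print(rbind(
    source    = quantile(x1, probs),
    reference = quantile(x2, probs),
    matched   = quantile(matched, probs)
  ))

  # Spearman correlation between source and matched values
  print(cor(x1, matched, method = "spearman"))
}
```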

[Package DMTL version 0.1.2 Index]