kde.local.test {ks} | R Documentation
Kernel density based local two-sample comparison test
Description
Kernel density based local two-sample comparison test for 1- to 6-dimensional data.
Usage
kde.local.test(x1, x2, H1, H2, h1, h2, fhat1, fhat2, gridsize, binned,
bgridsize, verbose=FALSE, supp=3.7, mean.adj=FALSE, signif.level=0.05,
min.ESS, xmin, xmax)
Arguments
x1, x2
vector/matrix of data values
H1, H2, h1, h2
bandwidth matrices/scalar bandwidths. If these are missing, plug-in selectors are used by default (see Details).
fhat1, fhat2
objects of class kde
binned
flag for binned estimation
gridsize
vector of grid sizes
bgridsize
vector of binning grid sizes
verbose
flag to print out progress information. Default is FALSE.
supp
effective support for normal kernel
mean.adj
flag to compute second order correction for mean value of critical sampling distribution. Default is FALSE. Currently implemented for d<=2 only.
signif.level
significance level. Default is 0.05.
min.ESS
minimum effective sample size. See below for details.
xmin, xmax
vector of minimum/maximum values for grid
Details
The null hypothesis is H_0(\bold{x}): f_1(\bold{x}) = f_2(\bold{x}), where f_1, f_2 are the respective density functions. The measure of discrepancy is U(\bold{x}) = [f_1(\bold{x}) - f_2(\bold{x})]^2.
Duong (2013) shows that the test statistic obtained by substituting the
KDEs for the true densities has a null distribution which is
asymptotically chi-squared with 1 d.f.
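
To make the idea of a local chi-squared test concrete, the following is a minimal 1-d sketch in base R. It is not the ks implementation: it assumes the leading-order pointwise variance of a normal-kernel KDE, R(K) f(x) / (n h) with R(K) = 1/(2 sqrt(pi)), and ignores the higher-order terms treated in Duong (2013). The simulated data, the grid limits and the variable names (U.hat, v1, v2) are illustrative choices.

```r
# Simplified 1-d illustration in base R; NOT the ks implementation.
set.seed(1)
x1 <- rnorm(200, mean = 0)
x2 <- rnorm(200, mean = 1)

# Evaluate both KDEs on a common grid (limits chosen for illustration).
grid.min <- min(x1, x2) - 1
grid.max <- max(x1, x2) + 1
f1 <- density(x1, bw = "SJ", from = grid.min, to = grid.max, n = 151)
f2 <- density(x2, bw = "SJ", from = grid.min, to = grid.max, n = 151)

# Local discrepancy U(x), with the KDEs substituted for the true densities.
U.hat <- (f1$y - f2$y)^2

# Assumed leading-order variance of a normal-kernel KDE: R(K) f(x) / (n h).
RK <- 1 / (2 * sqrt(pi))
v1 <- RK * f1$y / (length(x1) * f1$bw)
v2 <- RK * f2$y / (length(x2) * f2$bw)

# Standardised statistic and chi-squared (1 d.f.) p-value at each grid point.
chisq <- U.hat / (v1 + v2)
pvalue <- pchisq(chisq, df = 1, lower.tail = FALSE)

# Grid points where the two estimates differ at the 5% level.
signif.pts <- f1$x[pvalue < 0.05]
```

Each grid point receives its own p-value, which is what makes the comparison local rather than a single global verdict on f_1 = f_2.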
The required input is either x1,x2 and H1,H2, or fhat1,fhat2, i.e. the data values and bandwidths or objects of class kde. In the former case, the kde objects are created from the data and bandwidths.
If H1,H2 are missing then the defaults are the plug-in selectors Hpi. Likewise for missing h1,h2.
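
As a hedged illustration of the first input route, the sketch below supplies 1-d data together with scalar plug-in bandwidths computed explicitly with hpi from the ks package. The simulated samples are an assumption for illustration only. Leaving out h1 and h2 lets kde.local.test select its default bandwidths instead.

```r
library(ks)

# Simulated 1-d samples (illustrative data, not from this help page).
set.seed(1)
x1 <- rnorm(200, mean = 0)
x2 <- rnorm(200, mean = 1)

# Scalar plug-in bandwidths computed explicitly.
h1 <- hpi(x1)
h2 <- hpi(x2)

# Data values plus bandwidths; the kde objects are built inside the call.
loct <- kde.local.test(x1 = x1, x2 = x2, h1 = h1, h2 = h2)

# Number of grid points flagged as locally significant in each direction.
n.pos <- sum(loct$fhat.diff.pos)  # fhat1 > fhat2
n.neg <- sum(loct$fhat.diff.neg)  # fhat1 < fhat2
```

The returned fhat1 and fhat2 fields hold the density estimates that were created, so they can be inspected or plotted afterwards.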
The mean.adj flag determines whether the second order correction to the mean value of the test statistic should be computed.
min.ESS is borrowed from Godtliebsen et al. (2002) to reduce spurious significant results in the tails, though it is usually not required for small to moderate sample sizes.
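
The effective sample size idea can be sketched in base R as follows. This assumes the usual scale-space definition, ESS(x) = sum_i K_h(x - X_i) / K_h(0) with a normal kernel; the threshold of 5 and the names ess, min.ess and keep are hypothetical choices for illustration and are not taken from the ks source.

```r
# Illustrative effective sample size (ESS) for 1-d data; not the ks code.
set.seed(1)
x <- rnorm(100)
h <- bw.SJ(x)                          # a scalar bandwidth for the normal kernel
grid <- seq(-5, 5, length.out = 101)

# Assumed definition: ESS(x0) = sum_i K_h(x0 - X_i) / K_h(0).
ess <- sapply(grid, function(x0) sum(dnorm(x0 - x, sd = h)) / dnorm(0, sd = h))

# Hypothetical threshold: keep only grid points with enough local data.
min.ess <- 5
keep <- ess >= min.ess
```

Grid points far in the tails have an ESS close to zero, so masking them with keep prevents a test result there from being reported on the basis of almost no data.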
Value
A kernel two-sample local significance test result is an object of class kde.loctest, which is a list with fields:
fhat1, fhat2
kernel density estimates, objects of class kde
chisq
chi squared test statistic
pvalue
matrix of local p-values
fhat.diff
difference of KDEs
mean.fhat.diff
mean of the test statistic
var.fhat.diff
variance of the test statistic
fhat.diff.pos
binary matrix to indicate locally significant fhat1 > fhat2
fhat.diff.neg
binary matrix to indicate locally significant fhat1 < fhat2
n1, n2
sample sizes
H1, H2, h1, h2
bandwidth matrices/scalar bandwidths
References
Duong, T. (2013) Local significant differences from non-parametric two-sample tests. Journal of Nonparametric Statistics, 25, 635-645.
Godtliebsen, F., Marron, J.S. & Chaudhuri, P. (2002) Significance in scale space for bivariate density estimation. Journal of Computational and Graphical Statistics, 11, 1-22.
See Also
Examples
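library(ks)
data(crabs, package="MASS")
## frontal lobe size (column 4, FL) for the blue and orange forms
x1 <- crabs[crabs$sp=="B", 4]
x2 <- crabs[crabs$sp=="O", 4]
## bandwidths are missing, so the default plug-in selectors are used
loct <- kde.local.test(x1=x1, x2=x2)
plot(loct, ylim=c(-0.08,0.12))
## overlay the two density estimates
cols <- hcl.colors(n=2, palette="Dark2")
plot(loct$fhat1, add=TRUE, col=cols[1])
plot(loct$fhat2, add=TRUE, col=cols[2])
## see examples in ?plot.kde.loctest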