checkHistogram {essHist} | R Documentation |
Check any histogram estimator by means of the multiscale confidence set
Description
Provides the locations (intervals) where features are potentially missing (false negatives) and the break-points that are potentially redundant (false positives), by means of the multiscale confidence set.
Usage
checkHistogram(h, x, alpha = 0.1, q = NULL, intv = NULL,
mode = ifelse(anyDuplicated(x),"Gen","Con"),
plot = TRUE, xlim = NULL, ylim = NULL,
xlab = "", ylab = "", yaxt = "n", ...)
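The default for mode is computed from the data: "Gen" is chosen when x contains tied values and "Con" otherwise. The base-R sketch below only illustrates how that default expression resolves. The vectors x_ties and x_distinct are made-up data, and nothing here calls essHist itself.

```r
# Made-up data for illustration only
x_ties     <- c(1.2, 3.4, 3.4, 5.6)   # contains a repeated value
x_distinct <- c(1.2, 3.4, 5.6, 7.8)   # all values differ

# anyDuplicated() returns the index of the first duplicate, or 0 if none;
# ifelse() coerces that number to TRUE/FALSE before choosing a label.
mode_ties     <- ifelse(anyDuplicated(x_ties), "Gen", "Con")
mode_distinct <- ifelse(anyDuplicated(x_distinct), "Gen", "Con")

print(mode_ties)      # "Gen"
print(mode_distinct)  # "Con"
```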
Arguments
h |
a numeric vector specifying values of a histogram at sample points; or a |
x |
a numeric vector containing the data. |
alpha |
significance level, default as 0.1, see also |
q |
threshold of the multiscale constraint; by default, |
intv |
a data frame providing the system of intervals on which the multiscale statistic is defined. The data frame contains the following two columns
By default, it is set to the sparse interval system proposed by Rivera and Walther (2013), see also Li et al. (2016). |
mode |
By default, |
plot |
logical. If |
xlim , ylim |
numeric vectors of length 2 (default |
xlab |
a title for the |
ylab |
a title for the |
yaxt |
A character which specifies the |
... |
further arguments and |
Details
This function produces a visualization in three parts. The upper part plots the given histogram. In the middle part, short vertical lines mark all removable break-points. In the lower part, the intervals of violation are shown, and a gray bar below the middle horizontal line (blue) summarizes these violations, with darkness scaling with the number of violation intervals covering a location. See Examples below and Li et al. (2016) for further details.
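The darkness of the gray bar reflects how many violation intervals cover each location. A minimal base-R sketch of such a coverage count follows. The function coverage_count and the endpoint vectors left and right are hypothetical names chosen for illustration, not part of essHist, and treating the intervals as closed is an assumption.

```r
# Count, for each grid location, how many intervals [left, right] cover it.
# Hypothetical helper; closed intervals are assumed.
coverage_count <- function(left, right, grid) {
  stopifnot(length(left) == length(right))
  vapply(grid, function(g) sum(left <= g & g <= right), integer(1))
}

# Made-up intervals: [0, 2] and [1, 3]
left  <- c(0, 1)
right <- c(2, 3)
grid  <- c(0.5, 1.5, 2.5, 4)

counts <- coverage_count(left, right, grid)
print(counts)  # 1 2 1 0
```

Mapping such counts to a gray scale (for example, darker for larger counts) would then give a bar of the kind described above.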
Value
A list consisting of one data frame and one numeric vector:
violatedIntervals |
A data frame providing the intervals where the corresponding local side constraint is violated; an empty data frame if there is no violation. It contains the following four columns
An empty |
removableBreakpoints |
A numeric vector containing all removable break-points; it has length zero if there is no removable break-point. |
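Because both components can be empty, a caller may want to check their sizes before using them. The sketch below does this on a hand-built stand-in for the return value. The object res and the helper summarize_check are hypothetical, the numbers are made up, and the stand-in data frame deliberately has no columns, since only nrow() is needed.

```r
# Hand-built stand-in with the documented component names (not real output)
res <- list(
  violatedIntervals    = data.frame(),   # empty: no violation
  removableBreakpoints = c(-0.5, 1.25)   # made-up break-point locations
)

# Hypothetical helper: report how many violations and removable break-points
summarize_check <- function(res) {
  c(nViolations = nrow(res$violatedIntervals),
    nRemovable  = length(res$removableBreakpoints))
}

sizes <- summarize_check(res)
print(sizes)
```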
Note
The argument intv is internally adjusted to ensure that it contains no empty intervals in case of tied observations. Only the intervals on which the input histogram is constant will be checked! All printed messages can be disabled by calling suppressMessages.
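suppressMessages is a base-R function that silences output emitted via message(). The sketch below shows the pattern on a stand-in function, noisy_check, which is hypothetical and merely imitates a function that prints progress messages. With essHist loaded, the same wrapper would be placed around the call to checkHistogram.

```r
# Hypothetical stand-in that emits a message and returns a value
noisy_check <- function() {
  message("checking intervals ...")
  invisible(42)
}

# Wrapped in suppressMessages(), the call runs silently and still returns
out <- suppressMessages(noisy_check())
print(out)  # 42
```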
References
Li, H., Munk, A., Sieling, H., and Walther, G. (2016). The essential histogram. arXiv:1612.07216.
See Also
essHistogram, genIntv, msQuantile
Examples
library(essHist)
set.seed(123)
# Data: mixture of Gaussians "harp"
n = 500
y = rmixnorm(n, type = 'harp')
# Oracle density
x = sort(y)
ho = dmixnorm(x, type = 'harp')
# R default histogram
h = hist(y, plot = FALSE)
# Check the R default histogram against the local multiscale constraints
b = checkHistogram(h, y, ylim=c(-0.1,0.16))
lines(x, ho, col = "red")
rug(x, col = 'blue')
legend("topright", c("R-Histogram", "Truth"), col = c("black", "red"), lty = c(1,1))