wkde {bda}R Documentation

Compute a Binned Kernel Density Estimate for Weighted Data

Description

Returns x and y coordinates of the binned kernel density estimate of the probability density of the weighted data.

Usage

 wkde(x, w, bandwidth, freq=FALSE, gridsize = 401L, range.x, 
 	 truncate = TRUE, na.rm = TRUE)

Arguments

x

vector of observations from the distribution whose density is to be estimated. Missing values are not allowed.

w

The weights of x. The weight w_i of any observation x_i should be non-negative. If x_i=0, x_i will be removed from the analysis.

bandwidth

the kernel bandwidth smoothing parameter. Larger values of bandwidth make smoother estimates, smaller values of bandwidth make less smooth estimates. Automatic bandwidth selectors are developed. Options include wnrd, wnrd0, wmise,blscv, and awmise.

freq

An indicator showing whether w is a vector of frequecies (counts) or weights.

gridsize

the number of equally spaced points at which to estimate the density.

range.x

vector containing the minimum and maximum values of x at which to compute the estimate. The default is the minimum and maximum data values, extended by the support of the kernel.

truncate

logical flag: if TRUE, data with x values outside the range specified by range.x are ignored.

na.rm

logical flag: if TRUE, NA values will be ignored; otherwise, the program will be halted with error information.

Details

The default bandwidth, "wnrd0", is computed using a rule-of-thumb for choosing the bandwidth of a Gaussian kernel density estimator based on weighted data. It defaults to 0.9 times the minimum of the standard deviation and the interquartile range divided by 1.34 times the sample size to the negative one-fifth power (= Silverman's ‘rule of thumb’, Silverman (1986, page 48, eqn (3.31)) _unless_ the quartiles coincide when a positive result will be guaranteed.

"wnrd" is the more common variation given by Scott (1992), using factor 1.06.

"wmise" is a completely automatic optimal bandwidth selector using the least-squares cross-validation (LSCV) method by minimizing the integrated squared errors (ISE).

Value

a list containing the following components:

x

vector of sorted x values at which the estimate was computed.

y

vector of density estimates at the corresponding x.

bw

optimal bandwidth.

sp

sensitivity parameter, none NA if adaptive bandwidth selector is used.

References

Wand, M. P. and Jones, M. C. (1995). Kernel Smoothing. Chapman and Hall, London.

Examples


 mu = 34.5; s=1.5; n = 3000
 x = round(rnorm(n, mu, s),1)
 x0 = seq(min(x)-s,max(x)+s, length=100)
 f0 = dnorm(x0,mu, s)

 xt = table(x); n = length(x)
 x1 = as.numeric(names(xt))
 w1 = as.numeric(xt)
 (h1 <- bw.wnrd0(x1, w1))
 (h2 <- bw.wnrd0(x1,w1,n=n))
 
 est1 <- wkde(x1,w1, bandwidth=h1)
 est2 <- wkde(x1,w1, bandwidth=h2)
 est3 <- wkde(x1,w1, bandwidth='awmise')
 est4 <- wkde(x1,w1, bandwidth='wmise')
 est5 <- wkde(x1,w1, bandwidth='blscv')

 est0 = density(x1,bw="SJ",weights=w1/sum(w1)); 

 plot(f0~x0, xlim=c(min(x),max(x)), ylim=c(0,.30), type="l")
 lines(est0, col=2, lty=2, lwd=2)

 lines(est1, col=2)
 lines(est2, col=3)
 lines(est3, col=4)
 lines(est4, col=5)
 lines(est5, col=6)
 legend(max(x),.3,xjust=1,yjust=1,cex=.8,
  legend=c("N(34.5,1.5)", "SJ", "wnrd0",
  "wnrd0(n)","awmise","wmise","blscv"),
  col = c(1,2,2,3,4,5,6), lty=c(1,2,1,1,1,1,1),
  lwd=c(1,2,1,1,1,1,1))


[Package bda version 18.2.2 Index]