kepdf {pdfCluster} | R Documentation |
Kernel estimate of a probability density function.
Description
Estimates the density of univariate and multivariate data by the kernel method.
Usage
kepdf(x, eval.points = x, kernel = "gaussian",
bwtype = "fixed", h = h.norm(x), hx = NULL, alpha = 1/2)
Arguments
x |
A vector, a matrix or data-frame of data whose density should be estimated. |
eval.points |
A vector, a matrix or a data-frame of data points at which the density estimate should be evaluated. |
kernel |
Either 'gaussian' or 't7', it defines the kernel function to be used. See details below. |
bwtype |
Either 'fixed' or 'adaptive', corresponding to a kernel estimator with fixed or adaptive bandwidths respectively. See details below. |
h |
A vector of length set to |
hx |
A matrix with the same number of rows and columns as |
alpha |
Sensitivity parameter to be given to |
Details
The current version of the pdfCluster package computes estimates
with a product kernel estimator of the form:
\hat{f}(y)= \sum_{i=1}^n \frac{1}{n h_{i,1} \cdots h_{i,d}} \prod_{j=1}^d K\left(\frac{y_{j} - x_{i,j}}{h_{i,j}}\right).
The kernel function K can be either a Gaussian density (if kernel = "gaussian") or a t_\nu density with \nu = 7 degrees of freedom (if kernel = "t7").
Although uncommon, the t kernel is offered for reasons of computational efficiency. Its use is therefore suggested when either x or eval.points has a very large number of rows.
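As a rough illustration of the estimator above, the following base R sketch evaluates the product kernel formula directly. The function name product_kde and its arguments are hypothetical and not part of pdfCluster, and the t kernel is assumed to be the standard, unscaled t density with 7 degrees of freedom. It is meant to clarify the formula, not to reproduce the implementation or speed of kepdf.

```r
# Hypothetical helper (not part of pdfCluster): product kernel density estimate.
# x: n x d data matrix (or vector); y: m x d matrix (or vector) of evaluation points;
# h: length-d vector (fixed bandwidths) or n x d matrix (one row per observation).
product_kde <- function(x, y, h, kernel = c("gaussian", "t7")) {
  kernel <- match.arg(kernel)
  x <- as.matrix(x)
  y <- as.matrix(y)
  n <- nrow(x)
  d <- ncol(x)
  # Expand a fixed bandwidth vector to n x d so both cases share one code path.
  if (!is.matrix(h)) h <- matrix(h, nrow = n, ncol = d, byrow = TRUE)
  # Assumption: "t7" means the standard t density with 7 degrees of freedom.
  K <- if (kernel == "gaussian") dnorm else function(u) dt(u, df = 7)
  apply(y, 1, function(yk) {
    # u[i, j] = (y_j - x_{i,j}) / h_{i,j}
    u <- (matrix(yk, n, d, byrow = TRUE) - x) / h
    # Product over coordinates of K(u) / h, then average over observations.
    mean(apply(K(u) / h, 1, prod))
  })
}

# Usage: a two-dimensional estimate at the first five observations.
set.seed(1)
x <- matrix(rnorm(200), ncol = 2)
f_hat <- product_kde(x, y = x[1:5, , drop = FALSE], h = c(0.5, 0.5))
```

Passing an n x d matrix as h gives the adaptive form of the estimator, since each observation then contributes with its own bandwidth vector.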
The vectors of bandwidths h_{i} = (h_{i,1}, \dots, h_{i,d})' are defined as follows:
- Fixed bandwidth
When bwtype='fixed', h_{i} = h, that is, a constant smoothing vector is used for all the observations x_i. Default values are set as asymptotically optimal for a multivariate Normal distribution (e.g., Bowman and Azzalini, 1997). See h.norm for further details.
- Adaptive bandwidth
When bwtype='adaptive', a vector of bandwidths h_i is specified for each observation x_i. Default values are selected according to Silverman (1986, Section 5.3.1). See hprop2f.
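The adaptive scheme described by Silverman (1986) can be sketched in base R as follows. The function adaptive_bandwidths is hypothetical and not part of pdfCluster. It assumes that the local bandwidths are obtained by scaling a fixed vector h with factors based on a pilot density estimate and its geometric mean; hprop2f may differ in detail.

```r
# Hypothetical helper (not part of pdfCluster): Silverman-style adaptive bandwidths.
# h:     length-d vector of fixed bandwidths
# pilot: length-n vector of pilot density values at the observations, all > 0
# alpha: sensitivity parameter in [0, 1]; alpha = 0 returns the fixed bandwidths
adaptive_bandwidths <- function(h, pilot, alpha = 1/2) {
  g <- exp(mean(log(pilot)))       # geometric mean of the pilot density
  lambda <- (pilot / g)^(-alpha)   # local factors, larger where the density is low
  outer(lambda, h)                 # n x d matrix with rows h_i = lambda_i * h
}

# Usage: three observations, two variables.
pilot <- c(0.4, 0.1, 0.025)
hx <- adaptive_bandwidths(h = c(1, 2), pilot = pilot)
```

Observations in low-density regions receive wider bandwidths, which smooths the tails more heavily than the bulk of the data.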
Value
An S4 object of kepdf-class
with slots:
call |
The matched call. |
x |
The data input, coerced to be a matrix. |
eval.points |
The data points at which the density is evaluated. |
estimate |
The values of the density estimate at the evaluation points. |
kernel |
The selected kernel. |
bwtype |
The type of estimator. |
par |
A list of parameters used to estimate the density, with elements:
|
References
Bowman, A.W. and Azzalini, A. (1997). Applied smoothing techniques for data analysis: the kernel approach with S-Plus illustrations. Oxford University Press, Oxford.
Silverman, B. (1986). Density estimation for statistics and data analysis. Chapman and Hall, London.
See Also
Examples
library(pdfCluster)
## A 1-dimensional example
data(wine)
x <- wine[,3]
pdf <- kepdf(x, eval.points=seq(0,7,by=.1))
plot(pdf, n.grid= 100, main="wine data")
## A 2-dimensional example
x <- wine[,c(2,8)]
pdf <- kepdf(x)
plot(pdf, main="wine data", props=c(5,50,90), ylim=c(0,4))
plot(pdf, main="wine data", method="perspective", phi=30, theta=60)
### A 3-dimensional example
x <- wine[,c(2,3,8)]
pdf <- kepdf(x)
plot(pdf, main="wine data", props=c(10,50,70), gap=0.2)
plot(pdf, main="wine data", method="perspective", gap=0.2, phi=30, theta=10)
### A 6-dimensional example
### adaptive kernel density estimate is preferable in high-dimensions
x <- wine[,c(2,3,5,7,8,10)]
pdf <- kepdf(x, bwtype="adaptive")
plot(pdf, main="wine data", props=c(10,50,70), gap=0.2)
plot(pdf, main="wine data", method="perspective", gap=0.2, phi=30, theta=10)