pointdensity {dbscan}  R Documentation 
Calculate the local density at each data point, either as the number of
points in the eps-neighborhood (as used in dbscan()) or as a kernel density
estimate (KDE) with a uniform kernel. The function uses a kd-tree for fast
fixed-radius nearest neighbor search.
pointdensity(
x,
eps,
type = "frequency",
search = "kdtree",
bucketSize = 10,
splitRule = "suggest",
approx = 0
)
x 
a data matrix. 
eps 
radius of the eps-neighborhood (i.e., the bandwidth of the uniform kernel). 
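type 
either "frequency" (the default) to return the raw count of points in the eps-neighborhood, or "density" to return the uniform-kernel density estimate. 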
search, bucketSize, splitRule, approx 
algorithmic parameters for the fixed-radius nearest neighbor search. 

dbscan() estimates the density around a point as the number of points in the
eps-neighborhood of the point (including the query point itself).
Kernel density estimation (KDE) with a uniform kernel is just this point
count in the eps-neighborhood divided by $2\,\mathit{eps}\,n$, where $n$ is
the number of points in x.
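The two quantities described above can be reproduced by brute force in base R. This is a minimal sketch for illustration only: `naive_pointdensity` is a hypothetical helper and not part of dbscan, it builds a full distance matrix instead of using a kd-tree, and it assumes that points lying exactly at distance eps count as neighbors.

```r
# Brute-force sketch of the neighborhood count and the uniform-kernel KDE.
# `naive_pointdensity` is a hypothetical helper, not a dbscan function.
naive_pointdensity <- function(x, eps, type = c("frequency", "density")) {
  type <- match.arg(type)
  x <- as.matrix(x)

  # Full pairwise Euclidean distance matrix: O(n^2) time and memory.
  d <- as.matrix(dist(x))

  # Number of points within eps of each point. The diagonal is 0, so every
  # query point counts itself. Assumption: the boundary (distance == eps)
  # is treated as inside the neighborhood.
  counts <- unname(rowSums(d <= eps))

  if (type == "frequency") {
    return(counts)
  }

  # Uniform-kernel estimate as stated above: count / (2 * eps * n).
  counts / (2 * eps * nrow(x))
}

# Three points close together on a line and one isolated point.
x <- cbind(c(0, 0.1, 0.2, 5), c(0, 0, 0, 5))

freq <- naive_pointdensity(x, eps = 0.15)                    # 2 3 2 1
dens <- naive_pointdensity(x, eps = 0.15, type = "density")  # freq / 1.2
```

The isolated point has a count of 1 because only the query point itself lies in its neighborhood. For anything beyond toy data the kd-tree based function is preferable, since the distance matrix here grows quadratically with the number of points.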
Points with low local density often indicate noise (see e.g., Wishart (1969) and Hartigan (1975)).
A vector with one element per data point (row) in x, containing the count or
density value for each data point.
Michael Hahsler
Wishart, D. (1969), Mode Analysis: A Generalization of Nearest Neighbor which Reduces Chaining Effects, in Numerical Taxonomy, Ed., A.J. Cole, Academic Press, 282-311.
Hartigan, J. A. (1975), Clustering Algorithms, John Wiley & Sons, Inc., New York, NY, USA.
Other Outlier Detection Functions:
glosh(), kNNdist(), lof()
set.seed(665544)
n <- 100
x <- cbind(
  x = runif(10, 0, 5) + rnorm(n, sd = 0.4),
  y = runif(10, 0, 5) + rnorm(n, sd = 0.4)
)
plot(x)
### calculate density
d <- pointdensity(x, eps = .5, type = "density")
### density distribution
summary(d)
hist(d, breaks = 10)
### plot with point size proportional to density
plot(x, pch = 19, main = "Density (eps = .5)", cex = d*5)
### Wishart (1969) single-link clustering after removing low-density noise
# 1. remove noise with low density
f <- pointdensity(x, eps = .5, type = "frequency")
x_nonoise <- x[f >= 5, ]
# 2. use single-linkage on the non-noise points
hc <- hclust(dist(x_nonoise), method = "single")
plot(x, pch = 19, cex = .5)
points(x_nonoise, pch = 19, col = cutree(hc, k = 4) + 1L)