sNN {dbscan} | R Documentation |
Find Shared Nearest Neighbors
Description
Calculates the number of shared nearest neighbors and creates a shared nearest neighbors graph.
Usage
sNN(
x,
k,
kt = NULL,
jp = FALSE,
sort = TRUE,
search = "kdtree",
bucketSize = 10,
splitRule = "suggest",
approx = 0
)
## S3 method for class 'sNN'
sort(x, decreasing = TRUE, ...)
## S3 method for class 'sNN'
print(x, ...)
Arguments
x |
|
k |
number of neighbors to consider to calculate the shared nearest neighbors. |
kt |
minimum threshold on the number of shared nearest neighbors to
build the shared nearest neighbor graph. Edges are only preserved if
|
jp |
In regular sNN graphs, two points that are not neighbors
can have shared neighbors.
Jarvis and Patrick (1973) require the two points to be neighbors; otherwise
the count is zeroed out. |
sort |
sort by the number of shared nearest neighbors? Note that this
is expensive and |
search |
nearest neighbor search strategy (one of |
bucketSize |
maximum size of the kd-tree leaves. |
splitRule |
rule to split the kd-tree. One of |
approx |
use approximate nearest neighbors. All NN up to a distance of
a factor of |
decreasing |
logical; sort in decreasing order? |
... |
additional parameters are passed on. |
Details
The number of shared nearest neighbors of two points p and q is the
size of the intersection of the kNN neighborhoods of the two points.
Note that each point is considered to be part
of its own kNN neighborhood.
The range for the shared nearest neighbors is [0, k].
The result is an n-by-k matrix called shared.
Each row is a point and the columns are the point's k nearest neighbors.
The value is the count of the shared neighbors.
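As a rough illustration of this definition, the following base R sketch builds brute-force kNN lists for a small toy data set and counts the overlap of the plain neighbor lists. The toy data and the helper name shared_count are hypothetical. The sketch does not add a point to its own neighborhood; doing so can only increase the counts, so the values need not match what sNN() returns.

```r
# Toy data: five points on a line (hypothetical example data).
x <- matrix(c(0, 1, 2, 10, 11), ncol = 1)
k <- 2

# Brute-force kNN: order each row of the distance matrix,
# after excluding the point itself from its own neighbor list.
d <- as.matrix(dist(x))
diag(d) <- Inf
id <- t(apply(d, 1, function(row) order(row)[seq_len(k)]))

# Hypothetical helper: size of the overlap of the kNN lists of points p and q.
# Assumption: plain id lists, without adding p or q to their own lists.
shared_count <- function(p, q) length(intersect(id[p, ], id[q, ]))

# n-by-k matrix: entry [i, j] is the count for point i and its j-th neighbor.
shared <- t(sapply(seq_len(nrow(id)), function(i) {
  sapply(id[i, ], function(q) shared_count(i, q))
}))
shared
```

Each row of shared lines up with the same row of id, so shared[i, j] describes the pair formed by point i and the point id[i, j].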
The shared nearest neighbor graph connects a point with all its nearest neighbors
if they have at least one shared neighbor. The number of shared neighbors can be used
as an edge weight.
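The graph described above can be sketched as a weighted edge list built from the id and shared components. The list nn below is a hand-made stand-in with made-up values, not output of sNN(), and the column names from, to and weight are chosen only for this illustration.

```r
# Hypothetical stand-in for a result with `id` and `shared` components
# (3 points, k = 2); the values are made up for illustration.
nn <- list(
  id     = matrix(c(2, 3,
                    1, 3,
                    2, 1), ncol = 2, byrow = TRUE),
  shared = matrix(c(2, 1,
                    2, 0,
                    0, 1), ncol = 2, byrow = TRUE)
)

# One row per (point, neighbor) pair, with the shared count as edge weight.
edges <- data.frame(
  from   = as.vector(row(nn$id)),
  to     = as.vector(nn$id),
  weight = as.vector(nn$shared)
)

# Keep only pairs with at least one shared neighbor.
edges <- edges[edges$weight >= 1, ]
edges
```

Raising the cutoff in the last step gives a sparser graph, which is the same idea as the kt threshold described under Arguments.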
Jarvis and Patrick (1973) use a slightly modified (see parameter jp) shared nearest neighbor graph for
clustering.
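A minimal sketch of this modification, assuming that "neighbors" means each point appears in the other's kNN list. The matrices id and shared are made-up toy values and the names mutual and shared_jp are hypothetical; this is not the package's implementation.

```r
# Hypothetical kNN ids and shared counts for 4 points with k = 2 (made-up values).
id <- matrix(c(2, 3,
               1, 3,
               2, 4,
               3, 2), ncol = 2, byrow = TRUE)
shared <- matrix(c(2, 1,
                   2, 1,
                   1, 1,
                   1, 1), ncol = 2, byrow = TRUE)

# mutual[i, j] is TRUE if point i also appears in the kNN list
# of its j-th neighbor (assumed meaning of "neighbors" here).
mutual <- t(sapply(seq_len(nrow(id)), function(i) {
  sapply(id[i, ], function(q) i %in% id[q, ])
}))

# Zero out the counts for pairs that are not mutual neighbors.
shared_jp <- shared
shared_jp[!mutual] <- 0
shared_jp
```

In this toy input, point 3 is in the list of point 1 but point 1 is not in the list of point 3, so that entry is set to zero while mutual pairs keep their counts.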
Value
An object of class sNN (subclass of kNN and NN) containing a list
with the following components:
id |
a matrix with ids. |
dist |
a matrix with the distances. |
shared |
a matrix with the number of shared nearest neighbors. |
k |
number of |
metric |
the used distance metric. |
Author(s)
Michael Hahsler
References
R. A. Jarvis and E. A. Patrick. 1973. Clustering Using a Similarity Measure Based on Shared Near Neighbors. IEEE Trans. Comput. 22, 11 (November 1973), 1025-1034. doi:10.1109/T-C.1973.223640
See Also
Other NN functions: NN, comps(), frNN(), kNN(), kNNdist()
Examples
data(iris)
x <- iris[, -5]
# find the kNN and add the number of shared nearest neighbors
k <- 5
nn <- sNN(x, k = k)
nn
# shared nearest neighbor distribution
table(as.vector(nn$shared))
# explore number of shared points for the k-neighborhood of point 10
i <- 10
nn$shared[i,]
plot(nn, x)
# apply a threshold to create a sNN graph with edges
# if at least 3 neighbors are shared.
nn_3 <- sNN(nn, kt = 3)
plot(nn_3, x)
# get an adjacency list for the shared nearest neighbor graph
adjacencylist(nn_3)