qkdbscan {qkerntool} | R Documentation |
qKernel-DBSCAN density reachability and connectivity clustering
Description
Similar to the Density-Based Spatial Clustering of Applications with Noise (DBSCAN) algorithm, qKernel-DBSCAN is a density-based clustering algorithm that can be applied in both linear and non-linear situations.
Usage
## S4 method for signature 'matrix'
qkdbscan(x, kernel = "rbfbase", qpar = list(sigma = 0.1, q = 0.9),
eps = 0.25, MinPts = 5, hybrid = TRUE, seeds = TRUE, showplot = FALSE,
countmode = NULL, na.action = na.omit, ...)
## S4 method for signature 'cndkernmatrix'
qkdbscan(x, eps = 0.25, MinPts = 5, seeds = TRUE,
showplot = FALSE, countmode = NULL, ...)
## S4 method for signature 'qkernmatrix'
qkdbscan(x, eps = 0.25, MinPts = 5, seeds = TRUE,
showplot = FALSE, countmode = NULL, ...)
## S4 method for signature 'qkdbscan'
predict(object, data, newdata = NULL, predict.max = 1000, ...)
Arguments
x |
the data matrix indexed by row, or a kernel matrix of class qkernmatrix or cndkernmatrix. |
kernel |
the kernel function used in training and predicting. This parameter can be set to any function, of class kernel, which computes a kernel function value between two vector arguments. qkerntool provides the most popular kernel functions which can be used by setting the kernel parameter to the following strings:
The kernel parameter can also be set to a user defined function of class kernel by passing the function name as an argument. |
qpar |
the list of hyper-parameters (kernel parameters). This is a list which contains the parameters to be used with the kernel function. Valid parameters for existing kernels are :
Hyper-parameters for user defined kernels can be passed through the qpar parameter as well. |
eps |
reachability distance, see Ester et al. (1996). Default: 0.25. |
MinPts |
reachability minimum number of points, see Ester et al. (1996). Default: 5. |
hybrid |
whether the algorithm expects raw data but calculates partial distance matrices; can be TRUE or FALSE |
seeds |
can be TRUE or FALSE, FALSE to not include the |
showplot |
whether to show the plot or not, can be TRUE or FALSE |
na.action |
a function to specify the action to be taken if |
countmode |
NULL or vector of point numbers at which to report progress. |
object |
object of class qkdbscan. |
data |
matrix or data.frame. |
newdata |
matrix or data.frame with raw data to predict. |
predict.max |
max. batch size for predictions. |
... |
Further arguments transferred to plot methods. |
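The roles of eps and MinPts can be illustrated with a minimal base-R sketch of the core-point test from Ester et al. (1996). It uses plain Euclidean distances on hypothetical toy data and assumes that a point counts as its own neighbour; qkdbscan works with kernel-induced distances, so this shows the idea only, not the package's implementation.

```r
# Toy data (hypothetical values): two tight groups plus one isolated point
x <- rbind(
  c(0.00, 0.00), c(0.10, 0.00), c(0.00, 0.10), c(0.10, 0.10), c(0.05, 0.05),
  c(1.00, 1.00), c(1.10, 1.00), c(1.00, 1.10), c(1.10, 1.10), c(1.05, 1.05),
  c(3.00, 3.00)
)

eps    <- 0.25  # neighbourhood radius
MinPts <- 5     # minimum neighbourhood size for a core point

# Pairwise Euclidean distances (an assumption of this sketch;
# qkdbscan would derive distances from the chosen kernel)
d <- as.matrix(dist(x))

# Size of each eps-neighbourhood; each point counts itself here
n_neighbours <- rowSums(d <= eps)

# Core points have at least MinPts neighbours within eps
is_core <- n_neighbours >= MinPts
```

With these values every point in the two groups is a core point, while the isolated point has only itself in its neighbourhood and would be treated as noise.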
Details
The data can be passed to the qkdbscan function in a matrix; in addition, qkdbscan also supports input in the form of a kernel matrix of class qkernmatrix or class cndkernmatrix.
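As a hedged sketch of the kernel-matrix input, the following builds a qkernmatrix from the iris measurements and clusters it directly. It assumes that qkerntool is installed and that rbfbase() and qkernmatrix() behave as described on their own help pages; the parameter values are illustrative only, not tuned.

```r
library(qkerntool)

data(iris)
x <- as.matrix(iris[, -5])

# Kernel function object; sigma and q are illustrative values
rbf <- rbfbase(sigma = 0.1, q = 0.9)

# Precompute the kernel matrix of class qkernmatrix
K <- qkernmatrix(rbf, x)

# Cluster on the precomputed kernel matrix; no kernel or qpar
# arguments are needed for this method
ds <- qkdbscan(K, eps = 0.25, MinPts = 5)
```

Precomputing the matrix can be convenient when the same kernel matrix is reused with several eps and MinPts settings.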
Value
predict (qkdbscan-method) returns a vector of predicted clusters for the points in newdata.
qkdbscan returns an S4 object which is a list with components
clust |
integer vector coding cluster membership with noise observations (singletons) coded as 0 |
eps |
parameter eps |
MinPts |
parameter MinPts |
kcall |
the function call |
cndkernf |
the kernel function used |
xmatrix |
the original data matrix |
All the slots of the object can be accessed by accessor functions.
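Since noise is coded as 0 in clust, the noise observations usually need to be separated from the real clusters before summarising a result. The sketch below does this in base R on a hypothetical cluster vector; with a fitted object, the vector would come from the clust component instead.

```r
# Hypothetical cluster vector in the format described for clust
clust <- c(1L, 1L, 0L, 2L, 2L, 2L, 0L, 1L)

# Indices of observations labelled as noise (coded as 0)
noise_idx <- which(clust == 0L)

# Cluster sizes with noise excluded
sizes <- table(clust[clust != 0L])

# Number of clusters found, not counting noise
n_clusters <- length(sizes)
```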
Note
The predict function can be used to embed new data on the new space.
Author(s)
Yusen Zhang
yusenzhang@126.com
References
Martin Ester, Hans-Peter Kriegel, Joerg Sander, Xiaowei Xu (1996).
A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise.
Institute for Computer Science, University of Munich.
Proceedings of 2nd International Conference on Knowledge Discovery and Data Mining (KDD-96).
See Also
qkernmatrix, cndkernmatrix
Examples
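# a simple example using the iris data
library(qkerntool)
data(iris)
# hold out 20 random rows for prediction
test <- sample(1:150, 20)
x <- as.matrix(iris[-test, -5])
# cluster the remaining rows with the laplbase kernel
ds <- qkdbscan(x, kernel = "laplbase", qpar = list(sigma = 3.5, q = 0.8),
               eps = 0.15, MinPts = 5, hybrid = FALSE)
plot(ds, x)
# predict cluster labels for the held-out rows and overlay them on the plot
emb <- predict(ds, x, as.matrix(iris[test, -5]))
points(iris[test, ], col = as.integer(1 + emb))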