CPC {CPC} | R Documentation |
Cluster-Polarization Coefficient
Description
Implements clustering algorithms and calculates the cluster-polarization coefficient (CPC). Supports hierarchical clustering, k-means clustering, partitioning around medoids, density-based spatial clustering of applications with noise (DBSCAN), and manual assignment of cluster membership.
Usage
CPC(
  data,
  type,
  k = NULL,
  epsilon = NULL,
  model = FALSE,
  adjust = FALSE,
  cols = NULL,
  clusters = NULL,
  ...
)
Arguments
data |
a numeric vector or |
type |
a character string giving the type of clustering method to be used. See Details. |
k |
the desired number of clusters. Required if |
epsilon |
radius of epsilon neighborhood. Required if |
model |
a logical indicating whether clustering model output should be returned. Defaults to FALSE. |
adjust |
a logical indicating whether the adjusted CPC should be calculated. Defaults to FALSE. |
cols |
columns of |
clusters |
column of |
... |
arguments passed to other functions. |
Details
type must take one of six values:
"hclust": agglomerative hierarchical clustering with hclust(),
"diana": divisive hierarchical clustering with diana(),
"kmeans": k-means clustering with kmeans(),
"pam": k-medoids clustering with pam(),
"dbscan": density-based clustering with dbscan(),
"manual": no clustering is necessary; the researcher has specified cluster assignments.
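
As a rough illustration of switching between methods, the sketch below runs the four types that take a target number of clusters on the same toy data. The data, the seed, and the choice of k = 2 are assumptions made for this example only, and the call is guarded so that it is skipped when the CPC package is not installed. "dbscan" is left out because it is tuned with epsilon rather than k, and a sensible radius depends on the data.

```r
# Illustrative toy data (an assumption for this sketch): two well-separated
# groups of 25 points each in two dimensions.
set.seed(1)
x <- rbind(
  matrix(rnorm(50, mean = 0), ncol = 2),
  matrix(rnorm(50, mean = 5), ncol = 2)
)

if (requireNamespace("CPC", quietly = TRUE)) {
  # Same data, same k, different clustering back ends.
  for (type in c("hclust", "diana", "kmeans", "pam")) {
    cat(type, ":", CPC::CPC(x, type, k = 2), "\n")
  }
}
```

Running the same data through several methods is one way to check whether the coefficient is sensitive to the clustering algorithm rather than to the data themselves.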
For all clustering methods, additional arguments to fine-tune clustering performance, such as the specific algorithm to be used, should be passed to CPC() and will be inherited by the specified clustering function. In particular, if type = "kmeans", using a large number of random starts is recommended. This can be specified with the nstart argument to kmeans(), passed directly to CPC().
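
The effect of nstart can be seen with kmeans() on its own before passing it through CPC(). The following is a minimal sketch: the toy data, the seed, and the value nstart = 30 are assumptions chosen for illustration, not recommendations from this page, and the CPC() call is skipped when the package is not installed.

```r
# Illustrative toy data (an assumption for this sketch): two groups of 50
# points each in two dimensions.
set.seed(42)
x <- rbind(
  matrix(rnorm(100, mean = 0), ncol = 2),
  matrix(rnorm(100, mean = 5), ncol = 2)
)

# kmeans() keeps the best of nstart random initializations, so more starts
# lower the risk of reporting a poor local optimum.
single <- kmeans(x, centers = 2, nstart = 1)
multi  <- kmeans(x, centers = 2, nstart = 30)
print(c(single = single$tot.withinss, multi = multi$tot.withinss))

if (requireNamespace("CPC", quietly = TRUE)) {
  # nstart is not a named CPC() argument; it travels through ... to kmeans().
  print(CPC::CPC(x, "kmeans", k = 2, nstart = 30))
}
```

On cleanly separated data like this the two fits usually agree; the extra starts matter more when clusters overlap or k is large.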
If type = "manual", data must contain a vector identifying cluster membership for each observation, and cols and clusters must be defined.
Value
If model = TRUE, CPC() returns a list with components containing output from the specified clustering function, all sums of squares, the CPC, the adjusted CPC, and associated standard errors. If model = FALSE, CPC() returns a numeric vector of length 1 giving the CPC (if adjust = FALSE) or adjusted CPC (if adjust = TRUE).
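
The three return shapes described above can be compared side by side. This is a sketch under assumptions: the toy data and seed are illustrative, and because the names of the list components are not given on this page, the example inspects them with str() instead of indexing by name. The block is skipped when the CPC package is not installed.

```r
# Illustrative toy data (an assumption for this sketch).
set.seed(7)
x <- rbind(
  matrix(rnorm(50, mean = 0), ncol = 2),
  matrix(rnorm(50, mean = 5), ncol = 2)
)

if (requireNamespace("CPC", quietly = TRUE)) {
  cpc_only <- CPC::CPC(x, "kmeans", k = 2)                 # CPC only
  adj_only <- CPC::CPC(x, "kmeans", k = 2, adjust = TRUE)  # adjusted CPC only
  full     <- CPC::CPC(x, "kmeans", k = 2, model = TRUE)   # list of components

  print(cpc_only)
  print(adj_only)
  # Component names are not documented here, so list them before using them.
  str(full, max.level = 1)
}
```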
Examples
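# Two groups of 25 observations in two dimensions, centered at 0 and at 5
data <- matrix(c(rnorm(50, 0, 1), rnorm(50, 5, 1)), ncol = 2, byrow = TRUE)
# Known cluster labels, stored as a third column
clusters <- matrix(c(rep(1, 25), rep(2, 25)), ncol = 1)
data <- cbind(data, clusters)
# CPC from k-means on the two data columns
CPC(data[, 1:2], "kmeans", k = 2)
# CPC from the supplied labels in column 3
CPC(data, "manual", cols = 1:2, clusters = 3)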