CPC {CPC} | R Documentation |
Implements clustering algorithms and calculates the cluster-polarization coefficient (CPC). Supports hierarchical clustering, k-means clustering, partitioning around medoids, density-based spatial clustering of applications with noise (DBSCAN), and manual assignment of cluster membership.
CPC(
  data,
  type,
  k = NULL,
  epsilon = NULL,
  model = FALSE,
  adjust = FALSE,
  cols = NULL,
  clusters = NULL,
  ...
)
data |
a numeric vector or |
type |
a character string giving the type of clustering method to be used. See Details. |
k |
the desired number of clusters. Required if |
epsilon |
radius of epsilon neighborhood. Required if |
model |
a logical indicating whether clustering model output should be
returned. Defaults to FALSE. |
adjust |
a logical indicating whether the adjusted CPC should be calculated.
Defaults to FALSE. |
cols |
columns of |
clusters |
column of |
... |
arguments passed to other functions. |
type must take one of five values: "hclust" performs agglomerative
hierarchical clustering via hclust(). "kmeans" performs k-means clustering
via kmeans(). "pam" performs k-medoids clustering via pam(). "dbscan"
performs density-based clustering via dbscan(). "manual" indicates that no
clustering is necessary and that the researcher has specified cluster
assignments.
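As a rough illustration of what the "hclust" option combined with k involves, the sketch below runs agglomerative clustering with base R's hclust() and cuts the tree into k groups with cutree(). How CPC() calls hclust() internally is not shown on this page, so the Euclidean distance from dist(), the default linkage, and the toy data and variable names are all assumptions made for illustration.

```r
# Illustrative sketch only: base R clustering, not a call to CPC() itself.
set.seed(1)

# Two groups of 25 two-dimensional observations (hypothetical toy data)
x <- rbind(
  matrix(rnorm(50, mean = 0, sd = 1), ncol = 2),
  matrix(rnorm(50, mean = 5, sd = 1), ncol = 2)
)

# Agglomerative hierarchical clustering on Euclidean distances (assumed choice)
tree <- hclust(dist(x))

# Cut the dendrogram into k clusters; yields one integer label per observation
k <- 2
membership <- cutree(tree, k = k)

# Cluster sizes
table(membership)
```

The resulting membership vector has the same shape as the cluster assignments a researcher would supply when using type = "manual".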
For all clustering methods, additional arguments that fine-tune clustering
performance, such as the specific algorithm to be used, can be passed to
CPC() and are forwarded to the specified clustering function. In particular,
if type = "kmeans", using a large number of random starts is recommended.
This can be specified with the nstart argument to kmeans(), passed directly
to CPC().
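The effect of nstart can be seen with base R's kmeans() alone. The sketch below is a minimal illustration on hypothetical toy data, not a call to CPC(): it fits k-means once with a single random start and once with 25 starts, then compares the total within-cluster sum of squares. The data, seed, and variable names are assumptions chosen for the example.

```r
# Illustrative sketch only: compares one random start against many.
# Three groups of 30 two-dimensional observations (hypothetical toy data)
set.seed(42)
x <- rbind(
  matrix(rnorm(60, mean = 0, sd = 1), ncol = 2),
  matrix(rnorm(60, mean = 4, sd = 1), ncol = 2),
  matrix(rnorm(60, mean = 8, sd = 1), ncol = 2)
)

# A single random start can settle in a poor local optimum
set.seed(42)
single <- kmeans(x, centers = 3, nstart = 1)

# With nstart = 25, kmeans() keeps the best of 25 random starts
set.seed(42)
many <- kmeans(x, centers = 3, nstart = 25)

# Lower total within-cluster sum of squares means tighter clusters
c(single = single$tot.withinss, many = many$tot.withinss)
```

Per the text above, the same argument would be supplied through CPC(), for example as CPC(data, "kmeans", k = 3, nstart = 25).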
If type = "manual", data must contain a vector identifying cluster
membership for each observation, and cols and clusters must be defined.
If model = TRUE, CPC() returns a list with components containing output from
the specified clustering function, all sums of squares, the CPC, and the
adjusted CPC. If model = FALSE, CPC() returns a numeric vector of length 1
giving the CPC (if adjust = FALSE) or the adjusted CPC (if adjust = TRUE).
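To give a sense of the sums of squares mentioned above, the sketch below extracts the total, within-cluster, and between-cluster sums of squares from a base R kmeans() fit on hypothetical toy data. The final line computes the ratio of between-cluster to total sum of squares on the assumption that the CPC is a ratio of this kind; this page does not state the formula, so that line should be read as illustrative, not as the package's definition. The component names of the list returned by CPC() are likewise not given here and are not reproduced.

```r
# Illustrative sketch only: sums of squares from a base R k-means fit.
set.seed(7)

# Two groups of 25 two-dimensional observations (hypothetical toy data)
x <- rbind(
  matrix(rnorm(50, mean = 0, sd = 1), ncol = 2),
  matrix(rnorm(50, mean = 5, sd = 1), ncol = 2)
)

fit <- kmeans(x, centers = 2, nstart = 25)

tss <- fit$totss         # total sum of squares
wss <- fit$tot.withinss  # within-cluster sum of squares, summed over clusters
bss <- fit$betweenss     # between-cluster sum of squares

# The total decomposes into within and between parts
all.equal(tss, wss + bss)

# ASSUMPTION: a between-to-total ratio, shown as one plausible form of the CPC
ratio <- bss / tss
ratio
```

A ratio near 1 would indicate that most variation lies between clusters; a ratio near 0 would indicate that most variation lies within them.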
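# Two groups of 25 observations each, measured on two dimensions
data <- matrix(c(rnorm(50, 0, 1), rnorm(50, 5, 1)), ncol = 2, byrow = TRUE)
# Known cluster membership, appended as a third column
clusters <- matrix(c(rep(1, 25), rep(2, 25)), ncol = 1)
data <- cbind(data, clusters)
# CPC from k-means clustering on the first two columns
CPC(data[, 1:2], "kmeans", k = 2)
# CPC from the manually specified membership in column 3
CPC(data, "manual", cols = 1:2, clusters = 3)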