CPC {CPC}	R Documentation

Cluster-Polarization Coefficient


Implements clustering algorithms and calculates the cluster-polarization coefficient (CPC). Supports hierarchical clustering, k-means clustering, partitioning around medoids, density-based spatial clustering of applications with noise (DBSCAN), and manual assignment of cluster membership.


CPC(
  data,
  type,
  k = NULL,
  epsilon = NULL,
  model = FALSE,
  adjust = FALSE,
  cols = NULL,
  clusters = NULL,
  ...
)



data: a numeric vector, n x k matrix, or data frame. If type = "manual", data must be a matrix that includes a column identifying cluster membership for each observation; that column is named in the clusters argument.


type: a character string giving the type of clustering method to be used. See Details.


k: the desired number of clusters. Required if type = "hclust", type = "kmeans", or type = "pam".


epsilon: radius of the epsilon neighborhood. Required if type = "dbscan".


model: a logical indicating whether clustering model output should be returned. Defaults to FALSE.


adjust: a logical indicating whether the adjusted CPC should be calculated. Defaults to FALSE. Note that both CPC and adjusted CPC are automatically calculated and returned if model = TRUE.


cols: columns of data to be used in CPC calculation. Only used if type = "manual".


clusters: column of data indicating cluster membership for each observation. Only used if type = "manual".


...: arguments passed to other functions.


type must take one of five values:
"hclust" performs agglomerative hierarchical clustering via hclust().
"kmeans" performs k-means clustering via kmeans().
"pam" performs k-medoids clustering via pam().
"dbscan" performs density-based clustering via dbscan().
"manual" indicates that no clustering is necessary and that the researcher has specified cluster assignments.
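As a rough illustration of what the first two options produce, the sketch below obtains a membership vector using base R alone. It assumes that the hierarchical tree is cut into k groups with cutree(), which this page does not state; the toy matrix x and the object names hc_members and km_members are hypothetical.

```r
set.seed(1)
# Hypothetical toy data: 25 points near (0, 0) and 25 points near (5, 5)
x <- rbind(matrix(rnorm(50, 0, 1), ncol = 2),
           matrix(rnorm(50, 5, 1), ncol = 2))

# Agglomerative hierarchical clustering, then cut the tree into k = 2 groups
# (using cutree() here is an assumption about how k is applied)
hc_members <- cutree(hclust(dist(x)), k = 2)

# k-means clustering with k = 2 centers
km_members <- kmeans(x, centers = 2)$cluster

# Both are integer vectors holding one cluster label per observation
table(hc_members)
table(km_members)
```

Either vector has the same shape as the membership column that type = "manual" expects.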

For all clustering methods, additional arguments that fine-tune clustering, such as the specific algorithm to be used, can be passed to CPC() and are forwarded to the specified clustering function. In particular, if type = "kmeans", a large number of random starts is recommended. Set this with the nstart argument of kmeans(), passed directly to CPC().
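The effect of nstart can be seen with kmeans() on its own. The sketch below is only an illustration on hypothetical toy data; the final commented line shows how the same argument would be supplied through CPC(), following the description above.

```r
set.seed(42)
# Hypothetical toy data with three loose groups
x <- rbind(matrix(rnorm(60, 0, 1), ncol = 2),
           matrix(rnorm(60, 3, 1), ncol = 2),
           matrix(rnorm(60, 6, 1), ncol = 2))

# Reset the seed before each call so the two runs are comparable
set.seed(1)
one_start <- kmeans(x, centers = 3, nstart = 1)
set.seed(1)
many_starts <- kmeans(x, centers = 3, nstart = 30)

# kmeans() keeps the best of its starts (lowest total within-cluster SS)
c(one = one_start$tot.withinss, many = many_starts$tot.withinss)

# With the package loaded, nstart would be forwarded through CPC():
# CPC(x, "kmeans", k = 3, nstart = 30)
```

A single start can settle in a poor local optimum, which would distort any statistic computed from the resulting partition.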

If type = "manual", data must contain a vector identifying cluster membership for each observation, and cols and clusters must be defined.


If model = TRUE, CPC() returns a list with components containing output from the specified clustering function, all sums of squares, CPC, and adjusted CPC. If model = FALSE, CPC() returns a numeric vector of length 1 giving the CPC (if adjust = FALSE) or adjusted CPC (if adjust = TRUE).
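To make the sums of squares mentioned above concrete, the sketch below computes total, within-cluster, and between-cluster sums of squares in base R for a known partition. It then forms their ratio under the assumption that the coefficient is the between-cluster share of total variation; this page does not state the formula, so treat cpc_sketch as a hypothetical stand-in, not the package's output. All object names are illustrative.

```r
set.seed(7)
# Hypothetical toy data and a known membership vector
x <- rbind(matrix(rnorm(50, 0, 1), ncol = 2),
           matrix(rnorm(50, 5, 1), ncol = 2))
members <- rep(1:2, each = 25)

# Total sum of squares: squared distances to the grand centroid
tss <- sum(scale(x, center = TRUE, scale = FALSE)^2)

# Within-cluster sum of squares: squared distances to each cluster centroid
wss <- sum(sapply(split(as.data.frame(x), members), function(g) {
  sum(scale(as.matrix(g), center = TRUE, scale = FALSE)^2)
}))

# Between-cluster sum of squares is the remainder
bss <- tss - wss

# ASSUMPTION: coefficient taken as the between-cluster share of total variation
cpc_sketch <- bss / tss
cpc_sketch
```

For two well-separated groups like these, most of the variation lies between clusters, so the ratio is close to 1.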


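library(CPC)

# Two well-separated groups: rows 1-25 drawn from N(0, 1), rows 26-50 from N(5, 1)
data <- matrix(c(rnorm(50, 0, 1), rnorm(50, 5, 1)), ncol = 2, byrow = TRUE)
# Known cluster membership, appended as column 3
clusters <- matrix(c(rep(1, 25), rep(2, 25)), ncol = 1)
data <- cbind(data, clusters)

# CPC from k-means clustering of the two coordinate columns
CPC(data[, 1:2], "kmeans", k = 2)
# CPC from the manually specified assignments stored in column 3
CPC(data, "manual", cols = 1:2, clusters = 3)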

[Package CPC version 2.3.0 Index]