wwcss {WCluster} R Documentation

Weighted Within Cluster Sum of Squares

Description

Computes the weighted within-cluster sum of squares (WWCSS) for a given set of cluster assignments on a dataset whose observations carry weights.
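For orientation, one common way to write a weighted within-cluster sum of squares is shown below. This page does not state the formula that wwcss uses, so the expression is an assumption, and the symbols (observation x_i with weight w_i, cluster C_k with weighted centroid \bar{x}_k, K clusters) are introduced here only for illustration.

```latex
\bar{x}_k = \frac{\sum_{i \in C_k} w_i \, x_i}{\sum_{i \in C_k} w_i},
\qquad
\mathrm{WWCSS}_k = \sum_{i \in C_k} w_i \, \lVert x_i - \bar{x}_k \rVert^2,
\qquad
\mathrm{TotalWWCSS} = \sum_{k=1}^{K} \mathrm{WWCSS}_k
```

With all weights equal to 1, this reduces to the ordinary within-cluster sum of squares.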

Usage

wwcss(x, cl, w = rep(1,length(x)), groupSum = FALSE)

Arguments

x

A data matrix (data frame, data table, matrix, etc.) containing only entries of class numeric.

cl

Integer vector of length nrow(x) giving the cluster to which each observation in the dataset is assigned.

w

Numeric or integer vector of length nrow(x) giving the weight of each observation in the dataset. If NULL, the default is a vector of ones of length nrow(x), i.e., every observation has weight 1.

groupSum

A logical value indicating whether the weighted within-cluster sum of squares (WWCSS) of each cluster should be returned. If TRUE, both the total WWCSS and the WWCSS of each cluster are returned. If FALSE (the default), only the total WWCSS is returned.

Details

This function is used to evaluate clustering results for weighted observations. It is also used to optimize the cluster assignments in the Wkmeans function.
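To make the quantity concrete, here is a minimal base-R sketch. It assumes the usual definition (each squared distance to the weighted cluster centroid is multiplied by the observation's weight, then summed); the actual wwcss implementation may differ. The name wwcss_sketch is hypothetical and not part of WCluster; its return value only mirrors the components described under Value.

```r
# Hypothetical helper, NOT part of WCluster: a sketch of a weighted
# within-cluster sum of squares under an assumed definition.
wwcss_sketch <- function(x, cl, w = rep(1, nrow(x)), groupSum = FALSE) {
  x <- as.matrix(x)
  stopifnot(length(cl) == nrow(x), length(w) == nrow(x))

  # Weighted centroids: per-cluster sum of w * x divided by per-cluster sum of w
  wsum <- rowsum(w * x, group = cl)      # k x p matrix, rows ordered by cluster label
  wtot <- rowsum(w, group = cl)          # k x 1 matrix of total weights
  centers <- wsum / as.vector(wtot)

  # Squared Euclidean distance from each observation to its own centroid
  idx <- match(as.character(cl), rownames(centers))
  d2 <- rowSums((x - centers[idx, , drop = FALSE])^2)

  # Weighted sum of squared distances within each cluster
  per_cluster <- as.vector(rowsum(w * d2, group = cl))
  names(per_cluster) <- rownames(centers)

  if (groupSum) {
    list(WWCSS = per_cluster, TotalWWCSS = sum(per_cluster))
  } else {
    list(TotalWWCSS = sum(per_cluster))
  }
}

# Toy data: two clusters of two points each
x  <- matrix(c(0, 0,
               2, 0,
               10, 10,
               10, 14), ncol = 2, byrow = TRUE)
cl <- c(1L, 1L, 2L, 2L)
w  <- c(1, 3, 1, 1)

res <- wwcss_sketch(x, cl, w, groupSum = TRUE)
res$WWCSS       # cluster 1: 3, cluster 2: 8
res$TotalWWCSS  # 11
```

In cluster 1 the weighted centroid is (1.5, 0), so the heavier point at (2, 0) contributes 3 * 0.25 and the lighter point at (0, 0) contributes 1 * 2.25, giving 3. With unit weights the sketch reduces to the ordinary within-cluster sum of squares.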

Value

A list containing the following components:

WWCSS

If groupSum = TRUE, a vector containing the WWCSS of each cluster.

TotalWWCSS

The total WWCSS, i.e., the sum of the WWCSS values over all clusters.

Author(s)

Javier Cabrera, Yajie Duan, Ge Cheng

References

Cherasia, K. E., Cabrera, J., Fernholz, L. T., & Fernholz, R. (2022). Data Nuggets in Supervised Learning. In Robust and Multivariate Statistical Methods: Festschrift in Honor of David E. Tyler (pp. 429-449). Cham: Springer International Publishing.

Beavers, T., Cheng, G., Duan, Y., Cabrera, J., Lubomirski, M., Amaratunga, D., & Teigler, J. (2023). Data Nuggets: A Method for Reducing Big Data While Preserving Data Structure. (Submitted for publication.)

See Also

Wkmeans

Examples


    library(WCluster)
    library(cluster)
    # The Ruspini data set from the package "cluster"
    x = as.matrix(ruspini)

    # assign random weights to observations
    w = sample(1:10,nrow(x),replace = TRUE)

    # assign random clusters to observations
    cl = sample(1:3,nrow(x),replace = TRUE)

    # output the total WWCSS and the WWCSS of each cluster for these assignments
    wwcss(x, cl, w, groupSum = TRUE)


[Package WCluster version 1.2.0 Index]