check_clustering {scclust}    R Documentation

Check clustering constraints

Description

check_clustering checks whether a clustering satisfies constraints on the size and composition of the clusters.

Usage

check_clustering(
  clustering,
  size_constraint = NULL,
  type_labels = NULL,
  type_constraints = NULL,
  primary_data_points = NULL
)

Arguments

clustering

a scclust object containing a non-empty clustering.

size_constraint

an integer with the required minimum cluster size. If NULL, only the type constraints will be checked.

type_labels

a vector containing the type of each data point. May be NULL when type_constraints is NULL.

type_constraints

a named integer vector containing type-specific size constraints. If NULL, only the overall constraint will be checked.

primary_data_points

a vector specifying primary data points, either as point indices or as a logical vector of length equal to the number of points. check_clustering checks that all primary data points are assigned to a cluster. NULL indicates that no such check should be done.

Value

Returns TRUE if clustering satisfies the constraints, and FALSE if it does not. Throws an error if clustering is an invalid instance of the scclust class.
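
To make the checks concrete, the following is a rough base R sketch of the logic described above, operating on a plain vector of cluster labels (with NA marking unassigned points) rather than on an scclust object. The helper check_labels_sketch is hypothetical and not part of scclust; it assumes all constraints are minimums, and the package's actual implementation may differ.

```r
# Hypothetical helper, not part of scclust: mimics the documented checks
# on a plain vector of cluster labels, where NA marks an unassigned point.
check_labels_sketch <- function(labels,
                                size_constraint = NULL,
                                type_labels = NULL,
                                type_constraints = NULL,
                                primary_data_points = NULL) {
  labels <- as.character(labels)

  # All primary data points (indices or logical vector) must be assigned
  if (!is.null(primary_data_points) && anyNA(labels[primary_data_points])) {
    return(FALSE)
  }

  assigned <- !is.na(labels)

  # Every cluster must contain at least `size_constraint` points
  if (!is.null(size_constraint)) {
    sizes <- table(labels[assigned])
    if (any(sizes < size_constraint)) return(FALSE)
  }

  # Every cluster must contain the required number of points of each type
  if (!is.null(type_constraints)) {
    counts <- table(labels[assigned], as.character(type_labels[assigned]))
    for (type in names(type_constraints)) {
      n_of_type <- if (type %in% colnames(counts)) counts[, type] else 0L
      if (any(n_of_type < type_constraints[[type]])) return(FALSE)
    }
  }

  TRUE
}

# Same labels and types as in the Examples section
labels <- c("A", "A", "B", "C", "B", "C", "C", "A", "B", "B")
types <- c("x", "y", "y", "z", "z", "x", "y", "z", "x", "x")

check_labels_sketch(labels, 2)
check_labels_sketch(labels, NULL, types, c("x" = 1, "y" = 1, "z" = 1))

# Leave point 2 unassigned and treat points 1 and 2 as primary
labels_na <- labels
labels_na[2] <- NA
check_labels_sketch(labels_na, primary_data_points = c(1, 2))
```

In this sketch the first two calls return TRUE and the last returns FALSE, because primary point 2 is unassigned. Passing primary_data_points = c(1, 3) instead would return TRUE, since unassigned non-primary points are not checked.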

See Also

See sc_clustering for details on how to specify the type_labels and type_constraints parameters.

Examples

# Example scclust clustering
my_scclust <- scclust(c("A", "A", "B", "C", "B",
                        "C", "C", "A", "B", "B"))


# Check that each cluster contains at least two data points
check_clustering(my_scclust, 2)
# > TRUE


# Check that each cluster contains at least four data points
check_clustering(my_scclust, 4)
# > FALSE


# Data point types
my_types <- factor(c("x", "y", "y", "z", "z",
                     "x", "y", "z", "x", "x"))


# Check that each cluster contains at least one point of each type
check_clustering(my_scclust,
                 NULL,
                 my_types,
                 c("x" = 1, "y" = 1, "z" = 1))
# > TRUE


# Check that each cluster contains at least one data point of each of
# the types "x" and "z", and at least three points in total
check_clustering(my_scclust,
                 3,
                 my_types,
                 c("x" = 1, "z" = 1))
# > TRUE


# Check that each cluster contains at least five data points of type "y"
check_clustering(my_scclust,
                 NULL,
                 my_types,
                 c("y" = 5))
# > FALSE


[Package scclust version 0.2.4 Index]