methods-kmeans {ordr}    R Documentation

Functionality for k-means clustering ('kmeans') objects

Description

These methods extract data from, and attribute new data to, objects of class "kmeans" as returned by stats::kmeans().
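For orientation, the following minimal sketch uses only base R to show the components of a "kmeans" object that such methods can draw on. It does not call any ordr function, and the object name `km` is illustrative.

```r
# Fit 3-means on the scaled iris measurements (base R only).
set.seed(5601L)
km <- kmeans(scale(iris[, -5]), centers = 3)

# Components stored in a "kmeans" object.
names(km)

# One row of centers per cluster, one column per variable.
dim(km$centers)

# One cluster assignment per observation.
length(km$cluster)

# Cluster sizes and within-cluster sums of squares.
km$size
km$withinss
```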

Usage

## S3 method for class 'kmeans'
as_tbl_ord(x)

## S3 method for class 'kmeans'
recover_rows(x)

## S3 method for class 'kmeans'
recover_cols(x)

## S3 method for class 'kmeans'
recover_coord(x)

## S3 method for class 'kmeans'
recover_aug_rows(x)

## S3 method for class 'kmeans'
recover_aug_cols(x)

## S3 method for class 'kmeans'
recover_aug_coord(x)

Arguments

x

An ordination object; for these methods, an object of class "kmeans" as returned by stats::kmeans().

Value

The recovery generics recover_*() return core model components, the distribution of inertia, supplementary elements, and intrinsic metadata. Each generic relies on a method for the model class at hand to specify what these components are.
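The reliance of a generic on class-specific methods can be sketched with plain S3 dispatch. The generics `recover_rows_demo()` and `recover_cols_demo()` below are hypothetical stand-ins, not ordr functions, and the choice of components (a cluster membership indicator matrix for rows, transposed centers for columns) is an assumption made for illustration rather than a statement of what the package's methods return.

```r
# Hypothetical generics; the `_demo` suffix avoids masking ordr's own.
recover_rows_demo <- function(x) UseMethod("recover_rows_demo")
recover_cols_demo <- function(x) UseMethod("recover_cols_demo")

# Without a class-specific method, the generic has nothing to return.
recover_rows_demo.default <- function(x) {
  stop("No recovery method for class '", class(x)[1L], "'.")
}

# Assumed row component: observation-by-cluster membership indicators.
recover_rows_demo.kmeans <- function(x) {
  k <- nrow(x$centers)
  ind <- outer(x$cluster, seq_len(k), "==")
  storage.mode(ind) <- "integer"
  colnames(ind) <- rownames(x$centers)
  ind
}

# Assumed column component: variable-by-cluster matrix of centers.
recover_cols_demo.kmeans <- function(x) t(x$centers)

set.seed(5601L)
km <- kmeans(scale(iris[, -5]), centers = 3)
rows <- recover_rows_demo(km)
cols <- recover_cols_demo(km)
dim(rows)
dim(cols)
```

Under these assumptions, both matrices share one coordinate per cluster, which is what lets rows and columns be plotted on common axes.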

The generic as_tbl_ord() returns its input wrapped in the 'tbl_ord' class; its methods determine which model classes it may wrap. The wrapper gives 'tbl_ord' methods access to the recoverers and, through them, to the model components.
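One way to picture the wrapping is as prepending a class to the model object while leaving its contents untouched. The helper `wrap_demo()` and its list of allowed classes are hypothetical; this is a sketch of the idea under that assumption, not the package's implementation.

```r
# Hypothetical wrapper: prepend "tbl_ord" for an allowed set of model classes.
wrap_demo <- function(x, allowed = c("kmeans", "prcomp")) {
  if (!inherits(x, allowed)) {
    stop("No wrapper defined for class '", class(x)[1L], "'.")
  }
  if (!inherits(x, "tbl_ord")) {
    class(x) <- c("tbl_ord", class(x))
  }
  x
}

set.seed(5601L)
km <- kmeans(scale(iris[, -5]), centers = 3)
km_ord <- wrap_demo(km)

# The original class is retained behind the wrapper class.
class(km_ord)

# The underlying model components are unchanged.
identical(unclass(km_ord), unclass(km))
```

Because the original class is kept, methods written for the wrapper can still dispatch on the model class to reach its components.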

See Also

Other methods for idiosyncratic techniques: methods-lm

Other models from the stats package: methods-cancor, methods-cmds, methods-factanal, methods-lm, methods-prcomp, methods-princomp

Examples

# data frame of Anderson iris species measurements
class(iris)
head(iris)
# compute 3-means clustering on scaled iris measurements
set.seed(5601L)
iris %>%
  subset(select = -Species) %>%
  scale() %>%
  kmeans(centers = 3) %>%
  print() -> iris_km

# visualize clusters using PCA
iris %>%
  subset(select = -Species) %>%
  prcomp() %>%
  as_tbl_ord() %>%
  mutate_rows(cluster = iris_km$cluster) %>%
  ggbiplot() +
  geom_rows_point(aes(color = factor(cluster, levels = seq(3L)))) +
  scale_color_brewer(type = "qual", name = "cluster")

# wrap as a 'tbl_ord' object
(iris_km_ord <- as_tbl_ord(iris_km))

# augment everything with names, observations with cluster assignment
(iris_km_ord <- augment_ord(iris_km_ord))

# summarize clusters with standard deviation
iris_km_ord %>%
  tidy() %>%
  transform(sdev = sqrt(withinss / size))

# discriminate between clusters 2 and 3
iris_km_ord %>%
  ggbiplot(aes(x = `2`, y = `3`)) +
  geom_jitter(stat = "rows", aes(shape = cluster), width = .2, height = .2) +
  geom_cols_axis(aes(color = `1`, label = name),
                 text_size = 2, text_dodge = .1,
                 label_size = 3, label_alpha = .5) +
  scale_x_continuous(expand = expansion(mult = .8)) +
  scale_y_continuous(expand = expansion(mult = .5)) +
  ggtitle(
    "Measurement loadings onto clusters 2 and 3",
    "Color indicates loadings onto cluster 1"
  )

[Package ordr version 0.1.1 Index]