| assign.cluster {greenclust} | R Documentation |
Assign clusters to a new vector of categories
Description
Maps a vector of cluster numbers to another categorical vector,
yielding a new vector of matching cluster numbers. Useful for distributing
cluster numbers back out to the original observations in cases where the
clustering was performed on a table of unique levels rather than directly
on the observations (such as with greenclust).
Usage
assign.cluster(x, clusters, impute = FALSE)
Arguments
x |
a factor or character vector representing a categorical variable |
clusters |
a named numeric vector of cluster numbers, such as an
object returned by greencut |
impute |
a boolean controlling the behavior when a value in x is not found in names(clusters); see Details |
Details
Any categories in x that do not exist in names(clusters)
are given a cluster of NA or, if impute is TRUE, are
assigned the cluster number most frequently used among the
categories that do match, with ties going to the lowest cluster number. If
none of the categories in x has a matching cluster,
imputation simply uses the first cluster number in clusters.
If there are duplicate names in clusters, the first occurrence
takes precedence.
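The rules above can be sketched in base R. This is an illustration only, not the greenclust source: assign_sketch is a hypothetical helper, and it assumes that "most frequently used" is counted over the distinct categories of x that have a match rather than over individual observations.

```r
# Hypothetical helper: a base-R sketch of the lookup described in Details.
# It is NOT the package's implementation.
assign_sketch <- function(x, clusters, impute = FALSE) {
  x <- as.character(x)
  # match() returns the position of the FIRST matching name, so the first
  # occurrence of a duplicated name takes precedence.
  out <- unname(clusters[match(x, names(clusters))])  # NA where unmatched
  if (impute && anyNA(out)) {
    # Assumption: frequency is counted once per distinct matched category.
    cat_idx <- match(unique(x), names(clusters))
    matched <- clusters[cat_idx[!is.na(cat_idx)]]
    if (length(matched) > 0) {
      counts <- table(matched)
      # Ties go to the lowest cluster number.
      fill <- min(as.numeric(names(counts)[counts == max(counts)]))
    } else {
      # No category matched at all: fall back to the first cluster number.
      fill <- unname(clusters[1])
    }
    out[is.na(out)] <- fill
  }
  factor(out)
}

# Made-up inputs: "a" appears twice in the names, "d" has no match.
clusters <- c(a = 1, b = 2, c = 2, a = 3)
x <- c("a", "b", "c", "d", "b")

res_na  <- assign_sketch(x, clusters)                 # "d" stays NA
res_imp <- assign_sketch(x, clusters, impute = TRUE)  # "d" gets cluster 2
```

Here "a" maps to cluster 1 because the first occurrence of the name wins, and "d" is imputed as 2 because two of the three matched categories ("b" and "c") belong to that cluster.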
Value
A factor vector of the same length as x, representing
assigned cluster numbers.
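Because the result is a factor, converting it with as.integer() or as.numeric() directly yields the internal level codes rather than the cluster numbers. A small base-R illustration, using a made-up factor in place of real assign.cluster output:

```r
# Made-up factor standing in for a returned vector of cluster numbers
f <- factor(c(3, 5, 3, 8))

codes  <- as.integer(f)                # internal level codes, not cluster numbers
labels <- as.numeric(as.character(f))  # recovers the cluster numbers themselves
```

Going through as.character() first is the usual way to get the numeric labels back when they are needed for arithmetic or joins.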
See Also
greenclust, greencut,
greenplot
Examples
# Cluster feed types based on number of "underweight" chicks
grc <- greenclust(table(chickwts$feed,
                        ifelse(chickwts$weight < 200, "Y", "N")))
# Assign clusters to each original observation
feed.clustered <- assign.cluster(chickwts$feed, greencut(grc))
table(chickwts$feed, feed.clustered)