assign.cluster {greenclust} | R Documentation |
Assign clusters to a new vector of categories
Description
Maps a vector of cluster numbers to another categorical vector,
yielding a new vector of matching cluster numbers. Useful for distributing
cluster numbers back out to the original observations in cases where the
clustering was performed on a table of unique levels rather than directly
on the observations (such as with greenclust).
Usage
assign.cluster(x, clusters, impute = FALSE)
Arguments
x |
a factor or character vector representing a categorical variable |
clusters |
a named numeric vector of cluster numbers, such as an
object returned by |
impute |
a boolean controlling the behavior when a value in |
Details
Any categories in x that do not exist in names(clusters) are given a
cluster of NA or, if impute is TRUE, are assigned the cluster number that is
most frequently used among the other existing categories, with ties going to
the lowest cluster number. If there are no matching clusters for any of the
categories in x, imputation will simply use the first cluster number in
clusters.
If there are duplicate names in clusters, the first occurrence
takes precedence.
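The matching rules described above can be sketched in base R. This is an illustrative re-implementation only, not the package's source: the function name sketch_assign is hypothetical, and it assumes that "most frequently used" is counted once per distinct matched category (the text does not say whether frequency is counted per category or per observation).

```r
# Hypothetical sketch of the documented rules; NOT the greenclust source code.
sketch_assign <- function(x, clusters, impute = FALSE) {
  x <- as.character(x)
  # match() returns the first matching position, so when names(clusters)
  # contains duplicates the first occurrence takes precedence.
  idx <- match(x, names(clusters))
  out <- unname(clusters[idx])          # unmatched categories become NA
  if (impute && anyNA(idx)) {
    matched <- unique(x[!is.na(idx)])
    if (length(matched) > 0) {
      # Assumption: count each matched category once.
      counts <- table(clusters[match(matched, names(clusters))])
      # table() orders numeric values ascending and which.max() takes the
      # first maximum, so ties go to the lowest cluster number.
      fill <- as.numeric(names(counts)[which.max(counts)])
    } else {
      # No category matched at all: fall back to the first cluster number.
      fill <- unname(clusters[1])
    }
    out[is.na(idx)] <- fill
  }
  factor(out)
}

clusters <- c(a = 1, b = 2, c = 2, a = 3)   # duplicate name "a"
x <- c("a", "b", "c", "z")                  # "z" has no cluster

res_na   <- sketch_assign(x, clusters)                  # 1 2 2 <NA>
res_imp  <- sketch_assign(x, clusters, impute = TRUE)   # 1 2 2 2
res_tie  <- sketch_assign(c("a", "b", "z"), c(a = 1, b = 2), impute = TRUE)  # 1 2 1
res_none <- sketch_assign(c("y", "z"), c(a = 3, b = 1), impute = TRUE)       # 3 3
```

In this sketch, "a" resolves to cluster 1 rather than 3 because only the first occurrence of a duplicated name is consulted.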
Value
A factor vector of the same length as x, representing
assigned cluster numbers.
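Because the result is a factor, converting it back to numeric cluster numbers needs some care. The short base-R sketch below uses a hand-built factor (not real assign.cluster output) to show the general behavior of R factors: as.integer() returns the internal level codes, which differ from the labels whenever a cluster number is absent.

```r
# Stand-in for a returned factor in which cluster 2 never occurs.
f <- factor(c(3, 1, 3))

codes  <- as.integer(f)                 # internal level codes: 2 1 2
labels <- as.integer(as.character(f))   # cluster numbers:      3 1 3
```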
See Also
greenclust, greencut, greenplot
Examples
# Cluster feed types based on number of "underweight" chicks
grc <- greenclust(table(chickwts$feed,
                        ifelse(chickwts$weight < 200, "Y", "N")))
# Assign clusters to each original observation
feed.clustered <- assign.cluster(chickwts$feed, greencut(grc))
table(chickwts$feed, feed.clustered)