part_kmeans {partition} | R Documentation
Partitioner: K-means, ICC, scaled means
Description
Partitioners are functions that tell the partition algorithm 1)
what to try to reduce 2) how to measure how much information is lost from
the reduction and 3) how to reduce the data. In partition, functions that
handle 1) are called directors, functions that handle 2) are called
metrics, and functions that handle 3) are called reducers. partition has a
number of pre-specified partitioners for agglomerative data reduction.
Custom partitioners can be created with as_partitioner().
Pass partitioner objects to the partitioner argument of partition().
part_kmeans() uses the following direct-measure-reduce approach:
- direct: direct_k_cluster(), K-Means Clusters
- measure: measure_min_icc(), Minimum Intraclass Correlation
- reduce: reduce_kmeans(), Scaled Row Means
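The three steps above can be sketched in base R for a single candidate k. This is a minimal illustration under stated assumptions, not the package's internal code: the toy data, the `icc_oneway()` helper, and the choice of a one-way ICC formula are all assumptions made for the example, and the package's own ICC computation and search over k may differ.

```r
# Illustrative sketch in base R; NOT the partition package's implementation.
set.seed(123)

# Hypothetical toy data: two blocks of three correlated variables
n <- 100
latent1 <- rnorm(n)
latent2 <- rnorm(n)
toy <- data.frame(
  a1 = latent1 + rnorm(n), a2 = latent1 + rnorm(n), a3 = latent1 + rnorm(n),
  b1 = latent2 + rnorm(n), b2 = latent2 + rnorm(n), b3 = latent2 + rnorm(n)
)
scaled <- scale(toy)

# Hypothetical helper: one-way ICC with rows as subjects and
# columns as repeated measures (an assumed definition)
icc_oneway <- function(x) {
  x <- as.matrix(x)
  n_subj <- nrow(x)
  n_vars <- ncol(x)
  row_means <- rowMeans(x)
  ms_between <- n_vars * sum((row_means - mean(x))^2) / (n_subj - 1)
  ms_within <- sum((x - row_means)^2) / (n_subj * (n_vars - 1))
  (ms_between - ms_within) / (ms_between + (n_vars - 1) * ms_within)
}

# 1) direct: cluster the variables (columns) into k groups with k-means
k <- 2
km <- kmeans(t(scaled), centers = k, nstart = 10)
clusters <- split(colnames(toy), km$cluster)

# 2) measure: ICC within each cluster, then take the minimum
iccs <- vapply(clusters, function(vars) {
  if (length(vars) < 2) return(1) # a single variable loses no information
  icc_oneway(scaled[, vars, drop = FALSE])
}, numeric(1))
min_icc <- min(iccs)

# 3) reduce: replace each cluster with its re-scaled row means
reduced <- vapply(clusters, function(vars) {
  as.numeric(scale(rowMeans(scaled[, vars, drop = FALSE])))
}, numeric(n))
```

In this sketch, a given k would be accepted only if `min_icc` met the chosen threshold; the search over k is left out.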
Usage
part_kmeans(
algorithm = c("armadillo", "Hartigan-Wong", "Lloyd", "Forgy", "MacQueen"),
search = c("binary", "linear"),
init_k = NULL,
n_hits = 4
)
Arguments
algorithm |
The K-Means algorithm to use. The default is a fast version
of the Lloyd algorithm written in armadillo. The rest are options in
|
search |
The search method. Binary search is generally more efficient but linear search can be faster in very low dimensions. |
init_k |
The initial k to test. If |
n_hits |
For the linear search method, the number of iterations that must fall under the threshold before reducing; useful for preventing false positives. |
Value
A partitioner object, to be passed to the partitioner argument of partition().
See Also
Other partitioners: as_partitioner(), part_icc(), part_minr2(), part_pc1(), part_stdmi(), replace_partitioner()
Examples
library(partition)
set.seed(123)
df <- simulate_block_data(c(3, 4, 5), lower_corr = .4, upper_corr = .6, n = 100)
# fit partition using part_kmeans()
partition(df, threshold = .6, partitioner = part_kmeans())
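A further sketch, assuming the partition package is installed: the call below sets the non-default arguments, using only values listed under Usage. The object name `prt_linear` is illustrative.

```r
library(partition)

set.seed(123)
df <- simulate_block_data(c(3, 4, 5), lower_corr = .4, upper_corr = .6, n = 100)

# Linear search with the "Lloyd" algorithm; n_hits = 4 iterations must
# fall under the threshold before reducing
prt_linear <- partition(
  df,
  threshold = .6,
  partitioner = part_kmeans(
    algorithm = "Lloyd",
    search = "linear",
    n_hits = 4
  )
)

prt_linear
```

Per the note on search, linear search may be worth trying when the data have few variables; which setting is faster for a given data set is best checked by timing both.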