start_groups {clustra} | R Documentation |
Function to assign starting groups.
Description
Assigns starting groups either at random, giving k clusters of approximately equal size, or with a FastMap-like algorithm that sequentially selects k distant ids from those that have more than the median number of observations. Thin plate spline (TPS) fits to these ids are used as cluster centers for a starting group assignment. A user-supplied starting assignment is also possible.
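As a rough illustration of the random option, the following base R sketch assigns unique ids to k groups whose sizes differ by at most one. The helper name random_groups is hypothetical and is not part of clustra; the package's own implementation may differ.

```r
# Hypothetical helper, not part of clustra: randomly assign each unique id
# to one of k groups of approximately equal size.
random_groups <- function(ids, k) {
  uid <- unique(ids)
  # Recycle the labels 1..k to cover every unique id, then shuffle them.
  labels <- rep_len(seq_len(k), length(uid))
  labels[sample.int(length(labels))]
}

set.seed(1)
g <- random_groups(ids = rep(1:10, each = 3), k = 3)
table(g)  # group sizes differ by at most one
```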
Usage
start_groups(k, data, starts, maxdf, conv, mccores = 1, verbose = FALSE)
Arguments
k |
Number of clusters (groups). |
data |
Data.table with response measurements, one per observation.
Column names are id, time, response, group. Note that |
starts |
Type of start groups generated. See |
maxdf |
Fitting parameters. See |
conv |
Fitting parameters. See |
mccores |
See |
verbose |
Turn on more output for debugging. Values 0, 1, 2, 3 give progressively more output. Values 2 and 3 produce graphs during iterations, so use them with care. |
Value
An integer vector corresponding to unique ids, giving group number assignments.
For distant, each sequential selection takes an id that has the largest
minimum distance from smooth TPS fits (<= 5 deg) of previous selections.
The distance of an id to a single TPS is the median absolute error across
the id time points. Distance of an id to a set of TPS is the minimum of
the individual distances. We pick the id that has the maximum of such
a minimum of medians.
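The selection rule above can be sketched in base R as follows. This is a minimal sketch under stated assumptions, not the clustra implementation: a cubic polynomial fitted with lm stands in for the TPS fit, the function name pick_distant and its argument first (the id used as the initial selection) are hypothetical, and the restriction to ids with more than the median number of observations is omitted for brevity.

```r
# Hypothetical sketch of the maximin selection; not the clustra code.
# Assumption: a cubic polynomial (lm) stands in for the smooth TPS fit.
pick_distant <- function(dat, k, first) {
  ids <- unique(dat$id)
  # Distance of every id to the smooth fitted on the data of id `center`:
  # the median absolute error across that id's time points.
  dist_to <- function(center) {
    fit <- lm(response ~ poly(time, 3), data = dat[dat$id == center, ])
    err <- abs(dat$response - predict(fit, newdata = dat))
    vapply(ids, function(i) median(err[dat$id == i]), numeric(1))
  }
  chosen <- first
  min_dist <- dist_to(first)           # distance to the set of chosen fits
  min_dist[ids == first] <- -Inf       # never reselect a chosen id
  while (length(chosen) < k) {
    nxt <- ids[which.max(min_dist)]    # maximum of the minimum of medians
    chosen <- c(chosen, nxt)
    min_dist <- pmin(min_dist, dist_to(nxt))
    min_dist[ids == nxt] <- -Inf
  }
  chosen
}

# Toy data: six ids at three response levels, two ids per level.
set.seed(42)
lev <- c(0, 0, 5, 5, 10, 10)
id <- rep(1:6, each = 8)
dat <- data.frame(id = id,
                  time = rep(seq(0, 1, length.out = 8), times = 6),
                  response = lev[id] + rnorm(length(id), sd = 0.1))
centers <- pick_distant(dat, k = 3, first = 1)
```

On this toy data the three selected ids come from three different response levels, which is the behavior the maximin rule is meant to produce.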