clust_proto_random {representr}    R Documentation

Prototype record from a cluster.

Description

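Select a prototype record to represent a cluster, either at random (clust_proto_random) or by the minimax criterion (clust_proto_minimax).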

Usage

clust_proto_random(
  cluster,
  prob = rep(1/nrow(cluster), nrow(cluster)),
  id = TRUE
)

clust_proto_minimax(cluster, not_cluster, distance, id = TRUE, ...)

maxmin_compare(ties, not_cluster, distance, ...)

within_category_compare(ties, not_cluster, distance, ...)

random_compare(ties, not_cluster, distance, ...)

Arguments

cluster

A data frame of the clustered records.

prob

A vector of length nrow(cluster) that sums to 1, giving the probability of selection.

id

Logical. If TRUE, return the id of the selected record; if FALSE, return the record itself. When returning ids, each cluster must have the original row numbers as its rownames.

not_cluster

A data frame of the records outside the cluster.

distance

A distance function for comparing records.

...

Additional arguments passed to the comparison function.

ties

A data frame of the records that are tied.
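
To make the roles of prob and id concrete, here is a minimal base-R sketch of weighted random selection. It is not the package's implementation: sketch_proto_random and the toy data frame are hypothetical names introduced for illustration, and the behavior for id = TRUE (returning the row name as a number) is an assumption based on the note for id above.

```r
# Hypothetical sketch, not representr code: pick one record at random.
sketch_proto_random <- function(cluster,
                                prob = rep(1 / nrow(cluster), nrow(cluster)),
                                id = TRUE) {
  stopifnot(length(prob) == nrow(cluster),
            isTRUE(all.equal(sum(prob), 1)))
  # Draw one row position according to the selection probabilities.
  pos <- sample.int(nrow(cluster), size = 1, prob = prob)
  if (id) {
    # Assumption: rownames hold the original row numbers of the full data.
    as.integer(rownames(cluster)[pos])
  } else {
    cluster[pos, , drop = FALSE]
  }
}

# Toy cluster made of rows 4, 9 and 12 of some larger data set.
toy <- data.frame(fname = c("ann", "anne", "ann"),
                  byear = c(1980, 1980, 1981),
                  row.names = c("4", "9", "12"))

sketch_proto_random(toy)                                 # one of 4, 9, 12
sketch_proto_random(toy, prob = c(0, 1, 0))              # always 9
sketch_proto_random(toy, prob = c(0, 1, 0), id = FALSE)  # the record itself
```

Putting all of the probability mass on one record makes the selection deterministic, which is a convenient way to check that ids and rownames line up.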

Value

If id = FALSE, returns the prototype record from an individual cluster. Otherwise, returns the record id of the prototype record for that cluster. If there is a tie in the minimax prototype method, then random selection is used to break the tie.
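
The minimax idea can be sketched in base R as follows: for each record, find its largest distance to any other record in the cluster, then keep the record whose largest distance is smallest. This is a simplified illustration under stated assumptions, not the package's implementation. The names sketch_proto_minimax and dist_mismatch are hypothetical, the distance function is assumed to take two single-row records and return a non-negative number, and the not_cluster argument is left out, so ties are broken by plain random selection.

```r
# Hypothetical sketch, not representr code: minimax prototype selection.
# Assumed distance: fraction of fields on which two records disagree.
dist_mismatch <- function(a, b, ...) {
  mean(unlist(lapply(a, as.character)) != unlist(lapply(b, as.character)))
}

sketch_proto_minimax <- function(cluster, distance, id = TRUE, ...) {
  n <- nrow(cluster)
  # Worst-case (maximum) distance from each record to the rest of the cluster.
  worst <- vapply(seq_len(n), function(i) {
    max(vapply(seq_len(n), function(j) {
      distance(cluster[i, , drop = FALSE], cluster[j, , drop = FALSE], ...)
    }, numeric(1)))
  }, numeric(1))
  # Candidates are the records with the smallest worst-case distance.
  ties <- which(worst == min(worst))
  # Break ties by uniform random selection.
  pos <- if (length(ties) > 1) sample(ties, 1) else ties
  if (id) as.integer(rownames(cluster)[pos]) else cluster[pos, , drop = FALSE]
}

toy <- data.frame(fname = c("ann", "anne", "ann"),
                  lname = c("lee", "lee", "li"),
                  byear = c("1980", "1980", "1980"),
                  row.names = c("4", "9", "12"))

# Record "4" differs from each other record in one field out of three,
# while records "9" and "12" differ from each other in two fields.
sketch_proto_minimax(toy, dist_mismatch)  # returns 4
```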

Examples

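library(representr)
data("rl_reg1")

# Split the records into clusters using identity.rl_reg1
clusters <- split(rl_reg1, identity.rl_reg1)
clust_proto_random(clusters[[1]])

# For each cluster with more than one record, collect all records outside it
not_clusters <- lapply(seq_along(clusters), function(x) {
  if (nrow(clusters[[x]]) > 1) {
    do.call(rbind, clusters[-x])
  }
})
clust_proto_minimax(clusters[[1]], not_clusters[[1]], dist_binary)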


[Package representr version 0.1.5 Index]