evaluate_partition_unsup {doMIsaul}    R Documentation

Comparison of a partition obtained in an unsupervised manner to a reference partition.

Description

Compares partitions on the number of clusters, the adjusted Rand index (ARI), and the percentage of unclassified observations.

Usage

evaluate_partition_unsup(
  partition,
  partition.ref,
  is.missing = NULL,
  is.cens = NULL
)

Arguments

partition

vector (factor): the partition to evaluate.

partition.ref

reference partition (e.g., the partition obtained on complete data, or the true partition if known).

is.missing

logical vector distinguishing observations with missing data (coded TRUE) from those without (coded FALSE).

is.cens

the incomplete data frame, with NA for missing and left-censored data (or the complete data set if all data were observed).

Value

A list containing the following elements:

Nbclust: number of clusters in the partition.

ARI: ARI on cases classified by both partitions.

ARI.cc: ARI on cases that are complete AND classified by both partitions.

ARI.nona: ARI on cases with no missing data AND classified by both partitions.

ARI.nocens: ARI on cases with no censored data AND classified by both partitions.

Per.Unclass: percentage of observations unclassified in the partition.

Per.Unclass.cc: among complete cases, percentage of observations unclassified in the partition.

Per.Unclass.na: among cases with missing data, percentage of observations unclassified in the partition.

Per.Unclass.cens: among cases with censored data, percentage of observations unclassified in the partition.

Per.Unclass.ic: among incomplete cases, percentage of observations unclassified in the partition.

Examples

res <- evaluate_partition_unsup(
  partition = factor(rep(c(1,2,3), each = 50)),
  partition.ref = factor(rep(c(1,2,3), times = c(100, 25, 25))))

## With missing data
res2 <- evaluate_partition_unsup(
  partition = factor(rep(c(1,2,3), each = 50)),
  partition.ref = factor(rep(c(1,2,3), times = c(100, 25, 25))),
  is.missing = sample(c(TRUE, FALSE), 150, replace = TRUE, prob = c(.2,.8)))

## With missing and censored data
missing.indicator <- sample(c(TRUE, FALSE), 150,
  replace = TRUE, prob = c(.2, .8))
Censor.indicator <- data.frame(
  X1 = runif(150, 1, 5),
  X2 = runif(150, 6, 8),
  X3 = runif(150, 3, 9))
Censor.indicator$X1[missing.indicator] <- NA
Censor.indicator$X1[
  sample(c(TRUE, FALSE), 150, replace = TRUE, prob = c(.1, .9))] <- NA
Censor.indicator$X2[
  sample(c(TRUE, FALSE), 150, replace = TRUE, prob = c(.3, .7))] <- NA
Censor.indicator$X3[
  sample(c(TRUE, FALSE), 150, replace = TRUE, prob = c(.05, .95))] <- NA
res3 <- evaluate_partition_unsup(
  partition = factor(rep(c(1,2,3), each = 50)),
  partition.ref = factor(rep(c(1,2,3), times = c(100, 25, 25))),
  is.missing = missing.indicator,
  is.cens = Censor.indicator)

## With missing and censored data and unclassified observations
res4 <- evaluate_partition_unsup(
  partition = factor(rep(c(1,2, NA,3), times = c(50, 40, 20, 40))),
  partition.ref = factor(rep(c(1,2,3), times = c(100, 25, 25))),
  is.missing = missing.indicator,
  is.cens = Censor.indicator)
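
## Illustrative sketch only (base R, no package required): one possible way
## to compute an ARI restricted to observations classified by both partitions,
## and a percentage of unclassified observations. `ari_sketch` and
## `pct_unclassified` are hypothetical helper names; this is not necessarily
## how doMIsaul computes these values. It assumes that unclassified
## observations are coded NA, as in the example above.
ari_sketch <- function(p1, p2) {
  keep <- !is.na(p1) & !is.na(p2)           # cases classified by both partitions
  tab <- table(p1[keep], p2[keep])          # contingency table of cluster labels
  n <- sum(tab)
  sum.ij <- sum(choose(tab, 2))             # pairs grouped together in both
  sum.i <- sum(choose(rowSums(tab), 2))     # pairs grouped together in p1
  sum.j <- sum(choose(colSums(tab), 2))     # pairs grouped together in p2
  expected <- sum.i * sum.j / choose(n, 2)  # agreement expected by chance
  ## Returns NaN in degenerate cases (e.g., a single cluster in both partitions)
  (sum.ij - expected) / ((sum.i + sum.j) / 2 - expected)
}
pct_unclassified <- function(p) 100 * mean(is.na(p))

part <- factor(rep(c(1, 2, NA, 3), times = c(50, 40, 20, 40)))
part.ref <- factor(rep(c(1, 2, 3), times = c(100, 25, 25)))
ari_sketch(part, part.ref)   # ARI on the 130 observations classified by both
pct_unclassified(part)       # share of observations with an NA label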

[Package doMIsaul version 1.0.1 Index]