ClusteringPerformance {sharp}    R Documentation

Clustering performance

Description

Computes different metrics of clustering performance by comparing true and predicted co-membership. This function can only be used in simulation studies (i.e. when the true cluster membership is known).

Usage

ClusteringPerformance(theta, theta_star, ...)

Arguments

theta

output from Clustering. Alternatively, it can be the estimated co-membership matrix (see CoMembership).

theta_star

output from SimulateClustering. Alternatively, it can be the true co-membership matrix (see CoMembership).

...

additional arguments to be passed to Clusters.
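
To illustrate what a co-membership matrix contains, here is a minimal base-R sketch. It does not call CoMembership; the membership vector is hypothetical, and setting the diagonal to zero is an assumption about the convention, not a statement about the package's output.

```r
# Hypothetical membership vector for five items (not taken from the package)
membership <- c(1, 1, 2, 2, 3)

# Entry (i, j) is 1 when items i and j are in the same cluster, 0 otherwise
comembership <- outer(membership, membership, FUN = "==") * 1

# Assumption: an item is not counted as a co-member of itself
diag(comembership) <- 0

print(comembership)
```

Items 1 and 2 share a cluster, as do items 3 and 4, so only those off-diagonal entries are equal to 1. Item 5 is alone in its cluster and has no co-members.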

Value

A matrix of clustering performance metrics, computed by comparing predicted and true co-membership, including:

TP

number of True Positives (TP)

FN

number of False Negatives (FN)

FP

number of False Positives (FP)

TN

number of True Negatives (TN)

sensitivity

sensitivity, i.e. TP/(TP+FN)

specificity

specificity, i.e. TN/(TN+FP)

accuracy

accuracy, i.e. (TP+TN)/(TP+TN+FP+FN)

precision

precision (p), i.e. TP/(TP+FP)

recall

recall (r), i.e. TP/(TP+FN)

F1_score

F1-score, i.e. 2*p*r/(p+r)

rand

Rand Index, i.e. (TP+TN)/(TP+FP+TN+FN)

ari

Adjusted Rand Index (ARI), i.e. 2*(TP*TN-FP*FN)/((TP+FP)*(TN+FP)+(TP+FN)*(TN+FN))

jaccard

Jaccard index, i.e. TP/(TP+FP+FN)
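
The formulas above can be checked by hand on a small case. The following base-R sketch uses two hypothetical membership vectors and counts each unordered pair of items once (upper triangle of the co-membership matrices). Counting over the upper triangle is an assumption made for illustration; the package may count pairs differently internally.

```r
# Hypothetical true and predicted memberships for six items
truth <- c(1, 1, 1, 2, 2, 2)
pred <- c(1, 1, 2, 2, 2, 2)

# Logical co-membership matrices: TRUE when two items share a cluster
co_true <- outer(truth, truth, FUN = "==")
co_pred <- outer(pred, pred, FUN = "==")

# Assumption: each unordered pair of distinct items is counted once
upper <- upper.tri(co_true)

TP <- sum(co_pred[upper] & co_true[upper])
FN <- sum(!co_pred[upper] & co_true[upper])
FP <- sum(co_pred[upper] & !co_true[upper])
TN <- sum(!co_pred[upper] & !co_true[upper])

# Metrics as defined in the Value section
p <- TP / (TP + FP)
r <- TP / (TP + FN)
metrics <- c(
  TP = TP, FN = FN, FP = FP, TN = TN,
  sensitivity = r,
  specificity = TN / (TN + FP),
  accuracy = (TP + TN) / (TP + TN + FP + FN),
  precision = p,
  recall = r,
  F1_score = 2 * p * r / (p + r),
  rand = (TP + TN) / (TP + FP + TN + FN),
  ari = 2 * (TP * TN - FP * FN) /
    ((TP + FP) * (TN + FP) + (TP + FN) * (TN + FN)),
  jaccard = TP / (TP + FP + FN)
)

print(round(metrics, 3))
```

With these vectors, 4 of the 15 pairs are true positives, 2 are false negatives, 3 are false positives and 6 are true negatives. Note that accuracy and the Rand Index coincide by definition, as do sensitivity and recall.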

See Also

Other functions for model performance: SelectionPerformance(), SelectionPerformanceGraph()

Examples


# Data simulation
set.seed(1)
simul <- SimulateClustering(
  n = c(30, 30, 30), nu_xc = 1
)
plot(simul)

# Consensus clustering
stab <- Clustering(
  xdata = simul$data, nc = seq_len(5)
)

# Clustering performance
ClusteringPerformance(stab, simul)

# Alternative formulation
ClusteringPerformance(
  theta = CoMembership(Clusters(stab)),
  theta_star = simul$theta
)



[Package sharp version 1.4.6 Index]