SelectionPerformance {sharp}    R Documentation

Selection performance

Description

Computes different metrics of selection performance by comparing the set of selected features to the set of true predictors/edges. This function can only be used in simulation studies (i.e. when the true model is known).

Usage

SelectionPerformance(theta, theta_star, pk = NULL, cor = NULL, thr = 0.5)

Arguments

theta

output from VariableSelection, BiSelection, or GraphicalModel. Alternatively, it can be a binary matrix of selected variables (in variable selection) or a binary adjacency matrix (in graphical modelling).

theta_star

output from SimulateRegression, SimulateComponents, or SimulateGraphical. Alternatively, it can be a binary matrix of true predictors (in variable selection) or the true binary adjacency matrix (in graphical modelling).

pk

optional vector encoding the grouping structure. Only used for multi-block stability selection where pk indicates the number of variables in each group. If pk=NULL, single-block stability selection is performed.

cor

optional correlation matrix. Only used in graphical modelling.

thr

optional threshold on the correlation. Only used in graphical modelling and when argument "cor" is not NULL.

Value

A matrix of selection metrics including:

TP

number of True Positives (TP)

FN

number of False Negatives (FN)

FP

number of False Positives (FP)

TN

number of True Negatives (TN)

sensitivity

sensitivity, i.e. TP/(TP+FN)

specificity

specificity, i.e. TN/(TN+FP)

accuracy

accuracy, i.e. (TP+TN)/(TP+TN+FP+FN)

precision

precision (p), i.e. TP/(TP+FP)

recall

recall (r), i.e. TP/(TP+FN)

F1_score

F1-score, i.e. 2*p*r/(p+r)
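
The formulas above can be checked by hand on a small example. The following is a minimal base R sketch, not the package's implementation: the helper name manual_performance and the toy vectors are hypothetical and only illustrate how each metric follows from the four counts.

```r
# Hypothetical helper (not part of sharp): computes the metrics listed above
# from two binary vectors of equal length.
manual_performance <- function(theta, theta_star) {
  TP <- sum(theta == 1 & theta_star == 1) # selected and truly relevant
  FN <- sum(theta == 0 & theta_star == 1) # missed true predictors
  FP <- sum(theta == 1 & theta_star == 0) # selected but irrelevant
  TN <- sum(theta == 0 & theta_star == 0) # correctly left out

  precision <- TP / (TP + FP)
  recall <- TP / (TP + FN)

  c(
    TP = TP, FN = FN, FP = FP, TN = TN,
    sensitivity = TP / (TP + FN),
    specificity = TN / (TN + FP),
    accuracy = (TP + TN) / (TP + TN + FP + FN),
    precision = precision,
    recall = recall,
    F1_score = 2 * precision * recall / (precision + recall)
  )
}

# Toy data (assumed for illustration): 1 = selected / true predictor
theta <- c(1, 1, 0, 0, 1, 0, 0, 0)
theta_star <- c(1, 0, 0, 0, 1, 1, 0, 0)

perf <- manual_performance(theta, theta_star)
print(perf)
```

With these toy vectors there are 2 true positives, 1 false negative, 1 false positive and 4 true negatives, so sensitivity, precision and recall all equal 2/3 and accuracy is 0.75. Sensitivity and recall are the same quantity by definition.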

If argument "cor" is provided, the numbers of False Positives among correlated (FP_c) and uncorrelated (FP_i) pairs are also reported. Correlated and uncorrelated pairs are defined as those with correlations (provided in "cor") above and below the threshold "thr", respectively.
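
The split of false positives by correlation can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the toy matrices are hypothetical, each pair is counted once via the upper triangle, raw (not absolute) correlations are compared to the threshold, and a correlation exactly equal to the threshold is treated as "above".

```r
# Hypothetical 4-node example: true graph has a single edge 1-2
A_true <- matrix(0, nrow = 4, ncol = 4)
A_true[1, 2] <- A_true[2, 1] <- 1

# Selected graph: edge 1-2 (true positive), edges 1-3 and 3-4 (false positives)
A_hat <- matrix(0, nrow = 4, ncol = 4)
A_hat[1, 2] <- A_hat[2, 1] <- 1
A_hat[1, 3] <- A_hat[3, 1] <- 1
A_hat[3, 4] <- A_hat[4, 3] <- 1

# Assumed correlation matrix for the same 4 variables
cor_mat <- diag(4)
cor_mat[1, 2] <- cor_mat[2, 1] <- 0.8
cor_mat[1, 3] <- cor_mat[3, 1] <- 0.6
cor_mat[3, 4] <- cor_mat[4, 3] <- 0.1

thr <- 0.5

# Count each pair once (assumption: symmetric matrices, upper triangle only)
up <- upper.tri(A_hat)
is_fp <- A_hat[up] == 1 & A_true[up] == 0

# Assumption: ">= thr" is "correlated", "< thr" is "uncorrelated"
FP_c <- sum(is_fp & cor_mat[up] >= thr) # false positives among correlated pairs
FP_i <- sum(is_fp & cor_mat[up] < thr)  # false positives among uncorrelated pairs

print(c(FP = sum(is_fp), FP_c = FP_c, FP_i = FP_i))
```

In this toy example the false edge 1-3 joins a correlated pair (0.6) and the false edge 3-4 joins an uncorrelated pair (0.1), so the two false positives split into one of each kind.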

Block-specific performances are reported if "pk" is not NULL. In this case, the first row of the matrix corresponds to the overall performances, and subsequent rows correspond to each of the blocks. The order of the blocks is defined as in BlockStructure.

See Also

Other functions for model performance: ClusteringPerformance(), SelectionPerformanceGraph()

Examples


# Variable selection model
set.seed(1)
simul <- SimulateRegression(pk = 30, nu_xy = 0.5)
stab <- VariableSelection(xdata = simul$xdata, ydata = simul$ydata)

# Selection performance
SelectionPerformance(theta = stab, theta_star = simul)

# Alternative formulation
SelectionPerformance(
  theta = SelectedVariables(stab),
  theta_star = simul$theta
)



[Package sharp version 1.4.6 Index]