StabilityScore {sharp} | R Documentation |
Stability score
Description
Computes the stability score from selection proportions of models with a given parameter controlling the sparsity and for different thresholds in selection proportions. The score measures how unlikely it is that the selection procedure is uniform (i.e. uninformative) for a given combination of parameters.
Usage
StabilityScore(
selprop,
pi_list = seq(0.6, 0.9, by = 0.01),
K,
n_cat = 3,
group = NULL
)
Arguments
selprop |
array of selection proportions. |
pi_list |
vector of thresholds in selection proportions. If
|
K |
number of resampling iterations. |
n_cat |
computation options for the stability score. Default is
|
group |
vector encoding the grouping structure among predictors. This argument indicates the number of variables in each group and only needs to be provided for group (but not sparse group) penalisation. |
Details
The stability score is derived from the likelihood under the assumption of uniform (uninformative) selection.
We classify the features into three categories: the stably selected ones (selection proportions \ge \pi), the stably excluded ones (selection proportions \le 1-\pi), and the unstable ones (selection proportions between 1-\pi and \pi).
Under the hypothesis of equiprobability of selection (instability), the likelihood of observing stably selected, stably excluded and unstable features can be expressed as:
L_{\lambda, \pi} = \prod_{j=1}^N [ ( 1 - F( K \pi - 1 ) )^{1_{H_{\lambda} (j) \ge K \pi}}
\times ( F( K \pi - 1 ) - F( K ( 1 - \pi ) ) )^{1_{ (1-\pi) K < H_{\lambda} (j) < K \pi }}
\times F( K ( 1 - \pi ) )^{1_{ H_{\lambda} (j) \le K (1-\pi) }} ]
where H_{\lambda} (j) is the selection count of feature j, N is the number of features, and F(x) is the cumulative distribution function of the binomial distribution with parameters K and the average proportion of selected features over resampling iterations.
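As a rough illustration of the three probabilities entering this likelihood, the base R sketch below evaluates them with pbinom(). The values K = 100, a threshold of 0.8 and an average selection proportion of 0.3 are assumptions chosen for the example, not package defaults. Because the three categories partition the possible selection counts, the probabilities should sum to one.

```r
# Illustrative values (assumptions for this example, not taken from the package)
K <- 100       # number of resampling iterations
pi_thr <- 0.8  # threshold in selection proportion
q <- 0.3       # average proportion of selected features

# Thresholds on the count scale; rounding guards against floating-point error
upper <- round(K * pi_thr, digits = 8)
lower <- round(K * (1 - pi_thr), digits = 8)

# F(x) is the binomial cumulative distribution function with parameters K and q
# Stably selected: P(X >= K * pi) = 1 - F(K * pi - 1)
p_selected <- pbinom(upper - 1, size = K, prob = q, lower.tail = FALSE)
# Stably excluded: P(X <= K * (1 - pi)) = F(K * (1 - pi))
p_excluded <- pbinom(lower, size = K, prob = q)
# Unstable: P(K * (1 - pi) < X < K * pi) = F(K * pi - 1) - F(K * (1 - pi))
p_unstable <- pbinom(upper - 1, size = K, prob = q) - p_excluded

probs <- c(selected = p_selected, unstable = p_unstable, excluded = p_excluded)
print(probs)
print(sum(probs))
```

Using lower.tail = FALSE rather than computing 1 - pbinom(...) avoids cancellation when the upper-tail probability is very small.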
The stability score is computed as the negative log-likelihood under the assumption of equiprobability of selection:
S_{\lambda, \pi} = -\log(L_{\lambda, \pi})
The stability score increases with stability.
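The formulas above can be sketched in base R as follows. The helper manual_stability_score() is hypothetical and written only for illustration; it follows the likelihood as stated here, assumes thresholds above 0.5, and is not guaranteed to match the output of StabilityScore() exactly.

```r
# Hypothetical helper (not part of sharp): three-category stability score
# computed directly from the likelihood given in the Details section.
# Assumes every threshold in pi_list is strictly between 0.5 and 1.
manual_stability_score <- function(selprop, pi_list, K) {
  counts <- round(selprop * K)  # H(j): selection counts
  q <- mean(selprop)            # average proportion of selected features
  N <- length(selprop)          # number of features

  sapply(pi_list, function(pi_thr) {
    # Thresholds on the count scale; rounding guards against floating-point error
    upper <- round(K * pi_thr, digits = 8)
    lower <- round(K * (1 - pi_thr), digits = 8)

    # Number of features in each category
    n_selected <- sum(counts >= upper)
    n_excluded <- sum(counts <= lower)
    n_unstable <- N - n_selected - n_excluded

    # Log-probabilities of each category under uniform selection
    log_p_selected <- pbinom(upper - 1, size = K, prob = q,
                             lower.tail = FALSE, log.p = TRUE)
    log_p_excluded <- pbinom(lower, size = K, prob = q, log.p = TRUE)
    log_p_unstable <- log(pbinom(upper - 1, size = K, prob = q) -
                            pbinom(lower, size = K, prob = q))

    n <- c(n_selected, n_unstable, n_excluded)
    log_p <- c(log_p_selected, log_p_unstable, log_p_excluded)

    # Empty categories contribute nothing (avoids 0 * -Inf = NaN)
    -sum((n * log_p)[n > 0])
  })
}

# Usage with simulated selection proportions
set.seed(1)
selprop <- round(runif(n = 20), digits = 2)
scores <- manual_stability_score(selprop, pi_list = c(0.6, 0.7, 0.8), K = 100)
print(scores)
```

Working on the log scale keeps the computation stable when individual category probabilities are very small.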
Alternatively, the stability score can be computed by considering only two sets of features: stably selected (selection proportions \ge \pi) or not (selection proportions < \pi). This can be done using n_cat=2.
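A two-category variant can be sketched in the same way. The text does not spell out the two-category likelihood, so the version below is an assumption: each feature is stably selected with probability 1 - F(K \pi - 1) and otherwise falls in the remaining category with probability F(K \pi - 1). The helper manual_score_two_cat() is hypothetical and may differ from what StabilityScore() computes with n_cat=2.

```r
# Hypothetical helper (not part of sharp): two-category stability score.
# Assumed likelihood: a feature is stably selected with probability
# 1 - F(K * pi - 1) and not stably selected with probability F(K * pi - 1).
manual_score_two_cat <- function(selprop, pi_list, K) {
  counts <- round(selprop * K)  # selection counts
  q <- mean(selprop)            # average proportion of selected features

  sapply(pi_list, function(pi_thr) {
    # Threshold on the count scale; rounding guards against floating-point error
    upper <- round(K * pi_thr, digits = 8)

    n_selected <- sum(counts >= upper)
    n_other <- length(counts) - n_selected

    log_p_selected <- pbinom(upper - 1, size = K, prob = q,
                             lower.tail = FALSE, log.p = TRUE)
    log_p_other <- pbinom(upper - 1, size = K, prob = q, log.p = TRUE)

    n <- c(n_selected, n_other)
    log_p <- c(log_p_selected, log_p_other)

    # Empty categories contribute nothing (avoids 0 * -Inf = NaN)
    -sum((n * log_p)[n > 0])
  })
}

# Usage with simulated selection proportions
set.seed(1)
selprop <- round(runif(n = 20), digits = 2)
scores_two_cat <- manual_score_two_cat(selprop, pi_list = c(0.6, 0.7, 0.8), K = 100)
print(scores_two_cat)
```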
Value
A vector of stability scores obtained with the different thresholds in selection proportions.
References
Bodinier B, Filippi S, Nøst TH, Chiquet J, Chadeau-Hyam M (2023). “Automated calibration for stability selection in penalised regression and graphical models.” Journal of the Royal Statistical Society Series C: Applied Statistics, qlad058. ISSN 0035-9254, doi:10.1093/jrsssc/qlad058, https://academic.oup.com/jrsssc/advance-article-pdf/doi/10.1093/jrsssc/qlad058/50878777/qlad058.pdf.
See Also
Other stability metric functions: ConsensusScore(), FDP(), PFER(), StabilityMetrics()
Examples
# Simulating a set of selection proportions
set.seed(1)
selprop <- round(runif(n = 20), digits = 2)
# Computing stability scores for different thresholds
score <- StabilityScore(selprop, pi_list = c(0.6, 0.7, 0.8), K = 100)