stability {reverseR} | R Documentation |
Calculates stability values for results of 'lmInfl', 'lmMult' and 'lmThresh'
Description
This function calculates stability values for LOO (lmInfl
), LMO (lmMult
) and response value shifting/addition (lmThresh
).
Usage
stability(x, pval = FALSE, ...)
Arguments
x |
|
pval |
logical. If |
... |
other parameters, not yet implemented. |
Details
For results of lmInfl
:
A [0, 1]-bounded stability measure , with
= number of influencers (significance reversers) and
= total number of response values.
For results of lmMult
:
For each 1...max
, the percentage of all resamples that did *NOT* result in significance reversal.
For results of lmThresh
:
A [0, 1]-bounded stability measure , with
= number of response values where one of the ends of the significance region is within the prediction interval and
= total number of response values.
If pval = TRUE
, the exact p-value is calculated in the following manner:
1) Mean square error (MSE) and prediction standard error (se) are calculated from the linear model:
2) Upper and lower prediction intervals boundaries are calculated for each :
The prediction interval around is a scaled/shifted
-distribution with density function
, where is the density function of the central, unit-variance
-distribution.
3) The probability of either shifting the response value (if lmThresh(..., newobs = FALSE)
) or including a future response value (if
lmThresh(..., newobs = TRUE)
) to reverse the significance of the linear model is calculated as the integral between the end of the significance region (eosr) and the upper/lower prediction interval:
Value
The stability value.
Author(s)
Andrej-Nikolai Spiess
Examples
## See examples in 'lmInfl' and 'lmThresh'.
## The implemented strategy of calculating the
## probability of significance reversal, as explained above
## and compared to 'stabPlot'.
set.seed(125)
a <- 1:20
b <- 5 + 0.08 * a + rnorm(length(a), 0, 1)
LM1 <- lm(b ~ a)
res1 <- lmThresh(LM1, newobs = TRUE)
st1 <- stability(res1, pval = TRUE)
## Let's check that the prediction interval encompasses 95%:
dt.scaled <- function(x, df, mu, s) 1/s * dt((x - mu)/s, df)
integrate(dt.scaled, lower = st1$stats[1, "lower"], st1$stats[1, "upper"],
df = 18, mu = st1$stats[1, "fitted"], s = st1$stats[1, "se"])
## => 0.95 with absolute error < 8.4e-09
## This is the interval between "end of significance region" and upper
## prediction boundary:
integrate(dt.scaled, lower = st1$stats[1, "eosr.2"], st1$stats[1, "upper"],
df = 18, mu = st1$stats[1, "fitted"], s = st1$stats[1, "se"])
## => 0.09264124 with absolute error < 1e-15
## We can recheck this value by P(B) - P(A):
pt.scaled <- function(x, df, mu, s) pt((x - mu)/s, df)
pA <- pt.scaled(x = st1$stats[1, "eosr.2"], df = 18, mu = st1$stats[1, "fitted"],
s = st1$stats[1, "se"])
0.975 - pA
## => 0.09264124 as above