larInf {selectiveInference} | R Documentation |
Selective inference for least angle regression
Description
Computes p-values and confidence intervals for least angle regression
Usage
larInf(obj, sigma=NULL, alpha=0.1, k=NULL, type=c("active","all","aic"),
gridrange=c(-100,100), bits=NULL, mult=2, ntimes=2, verbose=FALSE)
Arguments
obj |
Object returned by |
sigma |
Estimate of error standard deviation. If NULL (default), this is estimated
using the mean squared residual of the full least squares fit when n >= 2p, and
using the standard deviation of y when n < 2p. In the latter case, the user
should use |
alpha |
Significance level for confidence intervals (target is miscoverage alpha/2 in each tail) |
k |
See "type" argument below. Default is NULL, in which case k is taken to be the the number of steps computed in the least angle regression path |
type |
Type of analysis desired: with "active" (default), p-values and confidence intervals are computed for each predictor as it is entered into the active step, all the way through k steps; with "all", p-values and confidence intervals are computed for all variables in the active model after k steps; with "aic", the number of steps k is first estimated using a modified AIC criterion, and then the same type of analysis as in "all" is carried out for this particular value of k. Note that the AIC scheme is defined to choose a number of steps k after which the AIC criterion
increases |
gridrange |
Grid range for constructing confidence intervals, on the standardized scale |
bits |
Number of bits to be used for p-value and confidence interval calculations. Default is
NULL, in which case standard floating point calculations are performed. When not NULL,
multiple precision floating point calculations are performed with the specified number
of bits, using the R package |
mult |
Multiplier for the AIC-style penalty. Hence a value of 2 (default) gives AIC, whereas a value of log(n) would give BIC |
ntimes |
Number of steps for which AIC-style criterion has to increase before minimizing point is declared |
verbose |
Print out progress along the way? Default is FALSE |
Details
This function computes selective p-values and confidence intervals (selection intervals)
for least angle regression. The default is to report the results for
each predictor after its entry into the model. See the "type" argument for other options.
The confidence interval construction involves numerical search and can be fragile:
if the observed statistic is too close to either end of the truncation interval
(vlo and vup, see references), then one or possibly both endpoints of the interval of
desired coverage cannot be computed, and default to +/- Inf. The output tailarea
gives the achieved Gaussian tail areas for the reported intervals—these should be close
to alpha/2, and can be used for error-checking purposes.
Value
type |
Type of analysis (active, all, or aic) |
k |
Value of k specified in call |
khat |
When type is "active", this is an estimated stopping point
declared by |
pv |
P-values for active variables |
ci |
Confidence intervals |
tailarea |
Realized tail areas (lower and upper) for each confidence interval |
vlo |
Lower truncation limits for statistics |
vup |
Upper truncation limits for statistics |
vmat |
Linear contrasts that define the observed statistics |
y |
Vector of outcomes |
pv.spacing |
P-values from the spacing test (here M+ is used) |
pv.modspac |
P-values from the modified form of the spacing test (here M+ is replaced by the next knot) |
pv.covtest |
P-values from covariance test |
vars |
Variables in active set |
sign |
Signs of active coefficients |
alpha |
Desired coverage (alpha/2 in each tail) |
sigma |
Value of error standard deviation (sigma) used |
call |
The call to larInf |
Author(s)
Ryan Tibshirani, Rob Tibshirani, Jonathan Taylor, Joshua Loftus, Stephen Reid
References
Ryan Tibshirani, Jonathan Taylor, Richard Lockhart, and Rob Tibshirani (2014). Exact post-selection inference for sequential regression procedures. arXiv:1401.3889.
See Also
Examples
set.seed(43)
n = 50
p = 10
sigma = 1
x = matrix(rnorm(n*p),n,p)
beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)
# run LAR
larfit = lar(x,y)
# compute sequential p-values and confidence intervals
# (sigma estimated from full model)
out.seq = larInf(larfit)
out.seq
# compute p-values and confidence intervals after AIC stopping
out.aic = larInf(larfit,type="aic")
out.aic
# compute p-values and confidence intervals after 5 fixed steps
out.fix = larInf(larfit,type="all",k=5)
out.fix