estimateSigma {selectiveInference} | R Documentation |
Estimate the noise standard deviation in regression
Description
Estimates the standard deviation of the noise, for use in the selectiveInference package
Usage
estimateSigma(x, y, intercept=TRUE, standardize=TRUE)
Arguments
x |
Matrix of predictors (n by p) |
y |
Vector of outcomes (length n) |
intercept |
Should glmnet be run with an intercept? Default is TRUE |
standardize |
Should glmnet be run with standardized predictors? Default is TRUE |
Details
This function estimates the standard deviation of the noise, in a linear regresion setting.
A lasso regression is fit, using cross-validation to estimate the tuning parameter lambda.
With sample size n, yhat equal to the predicted values and df being the number of nonzero
coefficients from the lasso fit, the estimate of sigma is sqrt(sum((y-yhat)^2) / (n-df-1))
.
Important: if you are using glmnet to compute the lasso estimate, be sure to use the settings
for the "intercept" and "standardize" arguments in glmnet and estimateSigma. Same applies to fs
or lar, where the argument for standardization is called "normalize".
Value
sigmahat |
The estimate of sigma |
df |
The degrees of freedom of lasso fit used |
Author(s)
Ryan Tibshirani, Rob Tibshirani, Jonathan Taylor, Joshua Loftus, Stephen Reid
References
Stephen Reid, Jerome Friedman, and Rob Tibshirani (2014). A study of error variance estimation in lasso regression. arXiv:1311.5274.
Examples
set.seed(33)
n = 50
p = 10
sigma = 1
x = matrix(rnorm(n*p),n,p)
beta = c(3,2,rep(0,p-2))
y = x%*%beta + sigma*rnorm(n)
# run forward stepwise
fsfit = fs(x,y)
# estimate sigma
sigmahat = estimateSigma(x,y)$sigmahat
# run sequential inference with estimated sigma
out = fsInf(fsfit,sigma=sigmahat)
out