SIR_threshold_bootstrap {SIRthresholded} | R Documentation |
SIR optimally thresholded on bootstrapped replications
Description
Apply a single-index SIR, optimally soft/hard thresholded, with H slices on
'n_replications' bootstrapped replications of (X,Y). The optimal number of
selected variables is the number of selected variables that came back most often
among the replications performed. From this, we can get the corresponding
b and lambda_opt that produce the same number of selected variables in the result of
'SIR_threshold_opt'.
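The selection rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the data and the per-replication counts are made up, rows of (X,Y) are assumed to be resampled jointly with replacement, and ties between equally frequent counts are broken in favour of the smallest one.

```r
# Toy data standing in for (X, Y); the values are arbitrary.
set.seed(1)
n <- 20
X <- matrix(rnorm(n * 4), nrow = n)
Y <- rnorm(n)

# One bootstrapped replication: draw row indices with replacement.
# k scales the bootstrap sample size (k = 1 keeps the original size).
k <- 2
idx <- sample(n, size = k * n, replace = TRUE)
X_boot <- X[idx, , drop = FALSE]
Y_boot <- Y[idx]

# Hypothetical numbers of selected variables over 10 replications.
vec_nb_var_selec <- c(5, 5, 6, 4, 5, 7, 5, 6, 5, 4)

# Keep the value that came back most often
# (which.max returns the first maximum, so ties go to the smallest count).
counts <- table(vec_nb_var_selec)
nb_var_selec_opt <- as.integer(names(counts)[which.max(counts)])
```

In this toy run the count 5 appears in half of the replications, so it would be retained as the optimal number of selected variables.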
Usage
SIR_threshold_bootstrap(
Y,
X,
H = 10,
thresholding = "hard",
n_replications = 50,
graph = TRUE,
output = TRUE,
n_lambda = 100,
k = 2,
choice = ""
)
Arguments
Y |
A numeric vector representing the dependent variable (a response vector). |
X |
A matrix representing the quantitative explanatory variables (bound by column). |
H |
The chosen number of slices (default is 10). |
thresholding |
The thresholding method, either "hard" or "soft" (default is "hard"). |
n_replications |
The number of bootstrapped replications of (X,Y) done to estimate the model (default is 50). |
graph |
A boolean, set to TRUE to plot graphs (default is TRUE). |
output |
A boolean, set to TRUE to print information (default is TRUE). |
n_lambda |
The number of lambda values to test. The n_lambda tested lambdas are uniformly distributed between 0 and the maximum value of the interest matrix (default is 100). |
k |
Multiplication factor of the bootstrapped sample size (default is 2, as shown in Usage; 1 keeps the same size as the original data). |
choice |
the graph to plot:
|
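As a rough illustration of the 'thresholding' and 'n_lambda' arguments, the sketch below builds a uniform grid of lambdas and applies the usual textbook hard and soft thresholding operators to a toy matrix. The matrix M, the helper names hard_threshold and soft_threshold, and the use of the largest absolute entry as the upper bound of the grid are assumptions made for the example; the package's internal code may differ.

```r
# Toy symmetric matrix standing in for the interest matrix (hypothetical values).
M <- matrix(c(0.9, 0.2,
              0.2, 0.1), nrow = 2, byrow = TRUE)

# Grid of n_lambda values spread uniformly between 0 and the largest entry.
n_lambda <- 5
lambdas <- seq(0, max(abs(M)), length.out = n_lambda)

# Hard thresholding: entries with absolute value <= lambda are set to 0.
hard_threshold <- function(M, lambda) M * (abs(M) > lambda)

# Soft thresholding: entries are shrunk toward 0 by lambda, then cut at 0.
soft_threshold <- function(M, lambda) sign(M) * pmax(abs(M) - lambda, 0)

M_hard <- hard_threshold(M, 0.15)
M_soft <- soft_threshold(M, 0.15)
```

With lambda = 0.15, hard thresholding keeps 0.9 and 0.2 unchanged and zeroes 0.1, while soft thresholding also zeroes 0.1 but shrinks the surviving entries by 0.15.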
Value
An object of class SIR_threshold_bootstrap, with attributes:
b |
This is the optimal estimated EDR direction, which is the principal eigenvector of the interest matrix. |
lambda_opt |
The optimal lambda. |
vec_nb_var_selec |
Vector that contains the number of selected variables for each replication. |
occurrences_var |
Vector whose i-th entry is the number of replications in which the i-th variable was selected. |
call |
Unevaluated call to the function. |
nb_var_selec_opt |
Optimal number of selected variables, i.e. the number of selected variables that came back most often among the replications performed. |
list_relevant_variables |
A list that contains the variables selected by the model. |
n |
Sample size. |
p |
The number of variables in X. |
H |
The chosen number of slices. |
n_replications |
The number of bootstrapped replications of (X,Y) done to estimate the model. |
thresholding |
The thresholding method used. |
X_reduced |
The X data restricted to the variables selected by the model. It can be used to estimate a new SIR model on the relevant variables to improve the estimation of b. |
mat_b |
Contains the estimate of b obtained at each bootstrapped replication. |
lambdas_opt_boot |
Contains the optimal lambda found by SIR_threshold_opt at each replication. |
index_pred |
The index Xb' estimated by SIR. |
Y |
The response vector. |
M1 |
The interest matrix thresholded with the optimal lambda. |
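To make the relationship between mat_b, vec_nb_var_selec and occurrences_var concrete, here is a small base R sketch. It assumes, purely for illustration, that each row of mat_b holds the estimate of b for one replication and that a variable counts as selected when its coefficient is nonzero; the actual layout used by the package may be different.

```r
# Hypothetical estimates of b for 3 replications and 4 variables
# (one row per replication; this layout is an assumption).
mat_b <- rbind(
  c(0.7, 0.7, 0.0, 0.0),
  c(0.6, 0.8, 0.0, 0.0),
  c(0.5, 0.5, 0.7, 0.0)
)

# A variable is treated as selected when its coefficient is nonzero.
selected <- mat_b != 0

# Number of selected variables in each replication.
vec_nb_var_selec <- rowSums(selected)

# Number of replications in which each variable was selected.
occurrences_var <- colSums(selected)
```

Here the first two variables are selected in every replication, the third only once and the fourth never, which is the kind of pattern occurrences_var is meant to expose.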
Examples
# Generate data: only the first 5 of the 20 variables are relevant
set.seed(8)
n <- 170
beta <- c(1, 1, 1, 1, 1, rep(0, 15))
X <- mvtnorm::rmvnorm(n, sigma = diag(1, 20))
eps <- rnorm(n, sd = 8)
Y <- (X %*% beta)^3 + eps
# Apply SIR with hard thresholding on 30 bootstrapped replications
SIR_threshold_bootstrap(Y, X, H = 10, n_lambda = 300, thresholding = "hard", n_replications = 30, k = 2)