Ohit {Ohit}     R Documentation
Fit a high-dimensional linear regression model via OGA+HDIC+Trim
Description
The first step sequentially selects input variables via the orthogonal greedy algorithm (OGA). The second step determines the number of OGA iterations using a high-dimensional information criterion (HDIC). The third step trims away irrelevant variables that remain after the second step, again using HDIC.
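To illustrate the first step, here is a minimal base-R sketch of an OGA-style selection loop. This is an illustration only, not the package's internal implementation: the function name oga_path and the use of centered columns as a correlation proxy are assumptions of this sketch.

```r
# Sketch of OGA-style greedy selection (assumed helper, not the package code):
# at each iteration, pick the column most correlated with the current
# residual, then refit by least squares on all selected columns.
oga_path <- function(X, y, Kn) {
  selected <- integer(0)
  r <- y - mean(y)                        # initial residual
  Xc <- scale(X, center = TRUE, scale = FALSE)
  for (k in seq_len(Kn)) {
    cors <- abs(crossprod(Xc, r))         # proxy for correlation with residual
    cors[selected] <- -Inf                # exclude already-chosen columns
    selected <- c(selected, which.max(cors))
    fit <- lm.fit(cbind(1, X[, selected, drop = FALSE]), y)
    r <- fit$residuals                    # update residual for next pick
  }
  selected
}
```

With strongly relevant columns, the first iterations recover them before any noise columns are chosen.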
Usage
Ohit(X, y, Kn = NULL, c1 = 5, HDIC_Type = "HDBIC", c2 = 2, c3 = 2.01,
intercept = TRUE)
Arguments
X: Input matrix with n rows (observations) and p columns (input variables).
y: Response vector of length n.
Kn: The number of OGA iterations. Default is NULL, in which case Kn is determined from n, p and the tuning parameter c1.
c1: The tuning parameter for the number of OGA iterations. Default is 5.
HDIC_Type: High-dimensional information criterion; one of "HDAIC", "HDBIC" or "HDHQ". Default is "HDBIC".
c2: The tuning parameter for HDIC_Type = "HDAIC". Default is 2.
c3: The tuning parameter for HDIC_Type = "HDHQ". Default is 2.01.
intercept: Should an intercept be fitted? Default is TRUE.
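For reference, the three admissible criteria from Ing and Lai (2011) share one form; this is a summary of the reference, assuming the standard definitions, where sigma-hat-squared is the residual variance of the fitted model with index set J and #(J) is its size:

```latex
\mathrm{HDIC}(J) \;=\; n \log \hat\sigma_J^2 \;+\; \#(J)\, w_n \log p,
\qquad
w_n \;=\;
\begin{cases}
c_2 & \text{HDAIC},\\[2pt]
\log n & \text{HDBIC},\\[2pt]
c_3 \log\log n & \text{HDHQ}.
\end{cases}
```

The tuning parameters c2 and c3 thus enter only through HDAIC and HDHQ, respectively.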
Value
n: The number of observations.
p: The number of input variables.
Kn: The number of OGA iterations.
J_OGA: The index set of the Kn variables sequentially selected by OGA.
HDIC: The HDIC values along the OGA path.
J_HDIC: The index set of variables determined by OGA+HDIC.
J_Trim: The index set of variables determined by OGA+HDIC+Trim.
betahat_HDIC: The estimated regression coefficients of the model determined by OGA+HDIC.
betahat_Trim: The estimated regression coefficients of the model determined by OGA+HDIC+Trim.
Author(s)
Hai-Tang Chiou, Ching-Kang Ing and Tze Leung Lai.
References
Ing, C.-K. and Lai, T. L. (2011). A stepwise regression method and consistent model selection for high-dimensional sparse linear models. Statistica Sinica, 21, 1473–1513.
Examples
# Example setup (Example 3 in Section 5 of Ing and Lai (2011))
set.seed(1)  # for reproducibility
n = 400   # sample size
p = 4000  # number of candidate input variables
q = 10    # number of relevant variables
beta_1q = c(3, 3.75, 4.5, 5.25, 6, 6.75, 7.5, 8.25, 9, 9.75)
b = sqrt(3/(4 * q))
x_relevant = matrix(rnorm(n * q), n, q)
d = matrix(rnorm(n * (p - q), 0, 0.5), n, p - q)
x_relevant_sum = apply(x_relevant, 1, sum)
# irrelevant variables are correlated with the relevant ones through their sum
x_irrelevant = apply(d, 2, function(a) a + b * x_relevant_sum)
X = cbind(x_relevant, x_irrelevant)
epsilon = rnorm(n)
y = as.vector((x_relevant %*% beta_1q) + epsilon)
# Fit a high-dimensional linear regression model via OGA+HDIC+Trim
Ohit(X, y, intercept = FALSE)