riPEER {mdpeer} | R Documentation |
Graph-constrained regression with penalty term being a linear combination of graph-based and ridge penalty terms
Description
Graph-constrained regression with penalty term being a linear combination of graph-based and ridge penalty terms.
See Details for model description and optimization problem formulation.
Usage
riPEER(Q, y, Z, X = NULL, optim.metod = "rootSolve",
rootSolve.x0 = c(1e-05, 1e-05), rootSolve.Q0.x0 = 1e-05, sbplx.x0 = c(1,
1), sbplx.lambda.lo = c(10^(-5), 10^(-5)), sbplx.lambda.up = c(1e+06,
1e+06), compute.boot.CI = FALSE, boot.R = 1000, boot.conf = 0.95,
boot.set.seed = TRUE, boot.parallel = "multicore", boot.ncpus = 4,
verbose = TRUE)
Arguments
Q |
graph-originated penalty matrix |
y |
response values matrix |
Z |
design matrix of covariates whose coefficients are penalized (coefficients b) |
X |
design matrix of covariates whose coefficients are not penalized (coefficients β); defaults to NULL |
optim.metod |
optimization method used to estimate the regularization parameters λ_Q and λ_R; "rootSolve" (default) or "sbplx"
|
rootSolve.x0 |
vector containing initial guesses for λ_Q and λ_R used in "rootSolve" optimization |
rootSolve.Q0.x0 |
vector containing initial guess for the regularization parameter used in "rootSolve" optimization |
sbplx.x0 |
vector containing initial guesses for λ_Q and λ_R used in "sbplx" optimization |
sbplx.lambda.lo |
vector containing minimum values of λ_Q and λ_R considered in "sbplx" optimization |
sbplx.lambda.up |
vector containing maximum values of λ_Q and λ_R considered in "sbplx" optimization |
compute.boot.CI |
logical whether or not to compute bootstrap confidence intervals for b coefficient estimates |
boot.R |
number of bootstrap replications used in bootstrap confidence intervals computation |
boot.conf |
confidence level assumed in bootstrap confidence intervals computation |
boot.set.seed |
logical whether or not to set a seed in bootstrap confidence intervals computation |
boot.parallel |
value of the parallel argument used in bootstrap confidence intervals computation |
boot.ncpus |
value of the ncpus argument used in bootstrap confidence intervals computation |
verbose |
logical whether or not to run in verbose mode (print out function execution messages) |
Details
Estimates coefficients of a linear model of the form:
y = X\beta + Zb + \varepsilon
where:
- y - response,
- X - data matrix of covariates whose coefficients are not penalized,
- Z - data matrix of covariates whose coefficients are penalized,
- \beta - regression coefficients, not penalized in the estimation process,
- b - regression coefficients, penalized in the estimation process, for which a prior graph of similarity / graph of connections is possibly available.
The method uses a penalty that is a linear combination of a graph-based and a ridge penalty term:
\beta_{est}, b_{est} = arg \; min_{\beta,b} \{ (y - X\beta - Zb)^T(y - X\beta - Zb) + \lambda_Q b^TQb + \lambda_R b^Tb \}
where:
- Q - a graph-originated penalty matrix; typically: a graph Laplacian matrix,
- \lambda_Q - regularization parameter for the graph-based penalty term,
- \lambda_R - regularization parameter for the ridge penalty term.
The two regularization parameters, \lambda_Q and \lambda_R, are estimated as maximum likelihood (ML) estimators from an equivalent Linear Mixed Model optimization problem formulation (see: References).
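The exact mixed-model formulation is given in the References; as a hedged sketch, the standard correspondence between penalized least squares and Linear Mixed Models suggests the following reading of the riPEER penalty:

```latex
% With \varepsilon \sim N(0, \sigma^2 I) and random effects b \sim N(0, \Sigma_b),
% the joint (BLUP) criterion for (\beta, b) is
\min_{\beta, b} \; (y - X\beta - Zb)^T (y - X\beta - Zb) + \sigma^2 \, b^T \Sigma_b^{-1} b ,
% so the riPEER penalty corresponds to choosing the random-effects precision such that
\sigma^2 \Sigma_b^{-1} = \lambda_Q Q + \lambda_R I .
```

Under this reading, estimating \lambda_Q and \lambda_R by ML amounts to estimating variance components of the equivalent mixed model.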
The graph-originated penalty term allows imposing similarity between coefficients based on the supplied graph information.
The ridge-originated penalty term facilitates parameter estimation: it reduces computational issues arising from singularity of the graph-originated penalty matrix and yields plausible results when the graph information is not informative.
Bootstrap confidence intervals computation is available (not set as a default option).
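For fixed values of \lambda_Q and \lambda_R (which riPEER itself estimates via the mixed-model formulation above), the criterion has a closed-form minimizer. The following is an illustrative sketch only, not the package's internal routine; the path-graph Laplacian and all parameter values are made up for the example, and the non-penalized X term is omitted:

```r
# For fixed lambda.Q, lambda.R and no X term, the b minimizing
# (y - Zb)'(y - Zb) + lambda.Q * b'Qb + lambda.R * b'b
# solves the linear system (Z'Z + lambda.Q * Q + lambda.R * I) b = Z'y.
set.seed(1)
n <- 50; p <- 4
Z <- matrix(rnorm(n * p), n, p)
y <- Z %*% c(1, 1, 0, 0) + rnorm(n)

# toy graph Laplacian Q for a path graph 1-2-3-4 (hypothetical choice)
A <- matrix(0, p, p)
A[cbind(1:(p - 1), 2:p)] <- 1
A <- A + t(A)
Q <- diag(rowSums(A)) - A

lambda.Q <- 1
lambda.R <- 0.1  # lambda.R > 0 guarantees the system matrix is invertible
b.hat <- solve(crossprod(Z) + lambda.Q * Q + lambda.R * diag(p),
               crossprod(Z, y))
```

The ridge term's role is visible here: even though a graph Laplacian Q is singular, adding lambda.R * diag(p) makes the system matrix positive definite, so the solve always succeeds.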
Value
b.est |
vector of b coefficient estimates |
beta.est |
vector of \beta coefficient estimates |
lambda.Q |
\lambda_Q regularization parameter value |
lambda.R |
\lambda_R regularization parameter value |
lambda.2 |
\lambda_R / \lambda_Q ratio value |
boot.CI |
data frame with two columns containing lower and upper bootstrap confidence interval limits for b coefficient estimates |
obj.fn.val |
optimization problem objective function value |
References
Karas, M., Brzyski, D., Dzemidzic, M., Goñi, J., Kareken, D.A., Randolph, T.W., Harezlak, J. (2017). Brain connectivity-informed regularization methods for regression. doi: https://doi.org/10.1101/117945
Examples
library(mdpeer)
library(ggplot2)
set.seed(1234)
n <- 200
p1 <- 10
p2 <- 90
p <- p1 + p2
# Define graph adjacency matrix
A <- matrix(rep(0, p*p), nrow = p, ncol = p)
A[1:p1, 1:p1] <- 1
A[(p1+1):p, (p1+1):p] <- 1
L <- Adj2Lap(A)
# Define Q penalty matrix as (normalized) graph Laplacian matrix
Q <- L2L.normalized(L)
# Define Z, X design matrices and outcome y
Z <- matrix(rnorm(n*p), nrow = n, ncol = p)
b.true <- c(rep(1, p1), rep(0, p2))
X <- matrix(rnorm(n*3), nrow = n, ncol = 3)
beta.true <- runif(3)
intercept <- 0
eta <- intercept + Z %*% b.true + X %*% beta.true
R2 <- 0.5
sd.eps <- sqrt(var(eta) * (1 - R2) / R2)
error <- rnorm(n, sd = sd.eps)
y <- eta + error
## Not run:
riPEER.out <- riPEER(Q, y, Z, X)
plt.df <- data.frame(x = 1:p, y = riPEER.out$b.est)
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() + labs(title = "b estimates")
## End(Not run)
## Not run:
# riPEER with 0.95 bootstrap confidence intervals computation
riPEER.out <- riPEER(Q, y, Z, X, compute.boot.CI = TRUE, boot.R = 500)
plt.df <- data.frame(x = 1:p,
y = riPEER.out$b.est,
lo = riPEER.out$boot.CI[,1],
up = riPEER.out$boot.CI[,2])
ggplot(plt.df, aes(x = x, y = y, group = 1)) + geom_line() +
geom_ribbon(aes(ymin=lo, ymax=up), alpha = 0.3)
## End(Not run)