GEInfo {GEInfo} R Documentation

GEInfo approach with fixed tunings

Description

Fits the GEInfo model at fixed values of the tuning parameters. It is available for linear, logistic, and Poisson regression.

Usage

GEInfo(
  E,
  G,
  Y,
  family,
  S_G,
  S_GE,
  kappa1,
  kappa2,
  lam1,
  lam2,
  tau,
  xi = 6,
  epsilon = 0,
  max.it = 500,
  thresh = 0.001,
  Type_Y = NULL
)

Arguments

E

Observed matrix of E variables, of dimensions n x q.

G

Observed matrix of G variables, of dimensions n x p.

Y

Response variable, of length n. Quantitative for family="gaussian"; non-negative counts for family="poisson". For family="binomial", Y should be a factor with two levels.

family

Model type: one of ("gaussian", "binomial", "poisson").

S_G

A user-supplied vector giving the indices of the G variables (columns of G) that have prior information.

S_GE

A user-supplied matrix giving the indices of the G-E interactions that have prior information. The first column holds the index of the G variable and the second column holds the index of the E variable. For example, S_GE = matrix(c(1, 2), ncol = 2) indicates that, according to prior information, the 1st G variable and the 2nd E variable have an interaction effect on Y.

kappa1

User-supplied value of the tuning parameter kappa1, used for model estimation and variable selection (see Details).

kappa2

User-supplied value of the tuning parameter kappa2, used for model estimation and variable selection (see Details).

lam1

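User-supplied value of the tuning parameter lambda1, used to calculate the prior-predicted response from S_G and S_GE (see Details).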

lam2

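User-supplied value of the tuning parameter lambda2, used to calculate the prior-predicted response from S_G and S_GE (see Details).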

tau

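User-supplied value of the tuning parameter tau, which balances the observed response Y against the prior-predicted response (see Details).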

xi

Tuning parameter of MCP penalty. Default is 6.

epsilon

Tuning parameter of Ridge penalty which shrinks on the coefficients having prior information. Default is 0.

max.it

Maximum number of iterations (total across entire path). Default is 500.

thresh

Convergence threshold for the group coordinate descent algorithm. The algorithm iterates until the change in every coefficient is less than thresh. Default is 0.001.

Type_Y

A vector of Type_Y prior information, of the same length as Y. Default is NULL. For family="gaussian", Type_Y is continuous; for family="binomial", it is binary; for family="poisson", it is a count vector. If Type_Y is supplied, the function uses it to fit the GEInfo model. If Type_Y=NULL, the function instead uses the Type_S prior information given by S_G and S_GE.
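The index arguments S_G and S_GE described above can be sketched as follows. This is only an illustration in base R: the dimensions p and q, the chosen indices, and the sanity checks are assumptions made for the example and are not part of the package.

```r
# Hypothetical dimensions, chosen only for illustration.
p <- 4  # number of G variables (columns of G)
q <- 2  # number of E variables (columns of E)

# Assume G variables 1 and 3 have prior information for a main effect.
S_G <- c(1, 3)

# One row per G-E interaction with prior information:
# column 1 = index of the G variable, column 2 = index of the E variable.
S_GE <- rbind(
  c(1, 2),  # 1st G variable with 2nd E variable
  c(3, 1)   # 3rd G variable with 1st E variable
)

# Sanity checks (not performed by this sketch on your behalf elsewhere):
# every index must point at an existing column of G or E.
stopifnot(
  all(S_G %in% seq_len(p)),
  ncol(S_GE) == 2,
  all(S_GE[, 1] %in% seq_len(p)),
  all(S_GE[, 2] %in% seq_len(q))
)
```

Using rbind() with one vector per interaction keeps each G-E pair on its own row, which avoids mixing up the column-wise fill order of matrix().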

Details

The function has five tuning parameters: kappa1, kappa2, lambda1, lambda2, and tau (lambda1 and lambda2 are passed as the arguments lam1 and lam2). kappa1 and kappa2 are used to estimate the model and select variables. lambda1 and lambda2 are used to calculate the prior-predicted response based on S_G and S_GE. tau balances the observed response Y against the prior-predicted response.

Value

An object of class "GEInfo" is returned, which is a list with the following components of the fitted model.

a

Coefficient vector of length q for E variables.

b

Coefficient vector of length (q+1)*p for W (G variables and G-E interactions).

beta

Coefficient vector of length p for G variables.

gamma

Coefficient matrix of dimensions p*q for G-E interactions.

alpha

Intercept.

coef

A coefficient vector of length (q+1)*(p+1), including the estimates for alpha (intercept), a (coefficients for all E variables), and b (coefficients for all G variables and G-E interactions).
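How the lengths of the returned components fit together can be sketched in base R. The layout of b below follows the construction used in the Examples (each G variable's main effect followed by its q interactions); the ordering of the full vector as intercept, then a, then b is an assumption read off the description of coef, and all numeric values are hypothetical.

```r
p <- 4; q <- 2  # hypothetical dimensions

alpha <- 0                              # intercept
a     <- c(0.4, 0.6)                    # length q: coefficients of E variables
beta  <- c(0.2, 0.5, 0, 0)              # length p: coefficients of G variables
gamma <- matrix(0, nrow = p, ncol = q)  # p x q: G-E interaction coefficients
gamma[1, ] <- c(0.8, 0.9)               # 1st G variable interacts with both E variables

# For each G variable in turn: its main effect, then its q interactions.
b <- as.vector(t(cbind(beta, gamma)))

# Assumed ordering of the full coefficient vector: intercept, a, b.
coef_full <- c(alpha, a, b)

# The lengths match those stated for b and coef.
stopifnot(
  length(b) == (q + 1) * p,
  length(coef_full) == (q + 1) * (p + 1)
)
```

With this layout, the first q+1 entries of b belong to the 1st G variable: its main effect followed by its interactions with each E variable.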

References

Wang X, Xu Y, Ma S (2019). Identifying gene-environment interactions incorporating prior information. Statistics in Medicine, 38(9): 1620-1633. doi: 10.1002/sim.8064

Examples

library(GEInfo)
set.seed(1)                                 # make the simulated data reproducible
n <- 30; p <- 4; q <- 2
E <- MASS::mvrnorm(n, rep(0, q), diag(q))   # n x q matrix of E variables
G <- MASS::mvrnorm(n, rep(0, p), diag(p))   # n x p matrix of G variables
W <- matW(E, G)                             # G variables and G-E interactions
alpha <- 0; a <- seq(0.4, 0.6, length.out = q)
beta <- c(seq(0.2, 0.5, length.out = 2), rep(0, p - 2))
vector.gamma <- c(0.8, 0.9, 0, 0)
gamma <- matrix(c(vector.gamma, rep(0, p * q - length(vector.gamma))), nrow = p, byrow = TRUE)
mat.b.gamma <- cbind(beta, gamma)
b <- as.vector(t(mat.b.gamma))              # coefficients of G and GE
Y <- alpha + E %*% a + W %*% b + rnorm(n, 0, 0.5)
S_G <- c(1)                                 # prior information on the 1st G variable
S_GE <- cbind(c(1), c(1))                   # prior information on the G1-E1 interaction
fit3 <- GEInfo(E, G, Y, family = "gaussian", S_G = S_G, S_GE = S_GE,
               kappa1 = 0.2, kappa2 = 0.2, lam1 = 0.2, lam2 = 0.2, tau = 0.5)

[Package GEInfo version 1.0 Index]