R: Estimation of precision matrix and detection of graphical...

boost.graph {GUEST}

R Documentation

Estimation of precision matrix and detection of graphical structure

Description

This function first applies the regression calibration to deal with measurement error effects. After that, the feature screening technique is employed to screen out independent pairs of random variables and reduce the dimension of random variables. Finally, we adopt the boosting method to detect informative pairs of random variables and estimate the precision matrix. This function can handle various distributions, such as normal, binomial, and Poisson distributions, as well as nonlinear effects among random variables.

Usage

boost.graph(data,ite1,ite2,ite3,thre,select = 0.9,inc = 10^(-3),
sigma_e = 0.6,q = 0.8,lambda = 1,pi = 0.5,rep = 100,cor = TRUE)

Arguments

`data`	An n (observations) times p (variables) matrix of random variables, whose distributions can be continuous, discrete, or mixed.
`ite1`	The number of iterations for continuous variables.
`ite2`	The number of iterations for binary variables.
`ite3`	The number of iterations for count variables.
`thre`	The treshold value for feature screening, whose value should be between 0 and 1.
`select`	The treshold constant in the boosting algorithm, whose value should be between 0 and 1. The default value is 0.9.
`inc`	The learning rate of the increment in the boosting algorithm, which shoud be a small value. The default value is 0.001.
`sigma_e`	The common value in the diagonal covariance matrix of the error for the classical measurement error model when `data` are continuous. The default value is 0.6.
`q`	The common value used to characterize misclassification for binary random variables. The default value is 0.8.
`lambda`	The parameter of the Poisson distribution, which is used to characterize error-prone count random variables. The default value is 1.
`pi`	The probability in the Binomial distribution, which is used to characterize error-prone count random variables. The default value is 0.5.
`rep`	The number of bootstrapping iterations. The default value is 100.
`cor`	Measurement error correction when estimating the precision matrix. The default value is TRUE.

Value

`w`	The estimator of the precision matrix.
`p`	The chosen pairs obtained by the feature screening.
`xi`	The weights sorted with pairs in `p`.
`g`	The visualization of the estimated network structure determined by `w`.

Author(s)

Hui-Shan Tsao and Li-Pang Chen
Maintainer: Hui-Shan Tsao n410412@gmail.com

References

Hui-Shan Tsao (2024). Estimation of Ultrahigh-Dimensional Graphical Models and Its Application to Dsicriminant Analysis. Master Thesis supervised by Li-Pang Chen, National Chengchi University.

Examples

data(MedulloblastomaData)

X <- t(MedulloblastomaData[2:656,]) #covariates
Y <- MedulloblastomaData[1,] #response

X <- matrix(as.numeric(X),nrow=23)

p <- ncol(X)
n <- nrow(X)

#standarization
X_new=data.frame()
for (i in 1:p){
 X_new[1:n,i]=(X[,i]-rep(mean(X[,i]),n))/sd(X[,i])
}
X_new=matrix(unlist(X_new),nrow = n)


#estimate graphical model
result <- boost.graph(data = X_new, thre = 0.2, ite1 = 3, ite2 = 0, ite3 = 0, rep = 1)
theta.hat <- result$w

[Package GUEST version 0.2.0 Index]