selectnet {netgwas} | R Documentation |
Model selection
Description
Estimate the optimal regularization parameter at EM convergence based on different information criteria.
Usage
selectnet(netgwas.obj, opt.index= NULL, criteria= NULL, ebic.gamma=0.5,
ncores= NULL, verbose= TRUE)
Arguments
netgwas.obj |
An object with S3 class "netgwas" |
opt.index |
The program internally determines an optimal graph using |
criteria |
Model selection criteria. "ebic" and "aic" are available. BIC model selection can be calculated by fixing |
ebic.gamma |
The tuning parameter for ebic. The |
ncores |
The number of cores to use for the calculations. Using |
verbose |
If |
Details
This function computes the extended Bayesian information criterion (ebic), the Bayesian information criterion (bic), and the Akaike information criterion (aic) at EM convergence, based on the observed or joint log-likelihood. The observed log-likelihood can be obtained through
\ell_Y(\widehat{\Theta}_\lambda) = Q(\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}) - H (\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}),
where Q can be calculated from the netmap, netsnp, or netphenogeno function, and the H function is
H(\widehat{\Theta}_\lambda | \widehat{\Theta}^{(m)}_\lambda) = E_z[\ell_{Z | Y}(\widehat{\Theta}_\lambda) | Y; \widehat{\Theta}_\lambda] = E_z[\log f(z)| Y ;\widehat{\Theta}_\lambda ] - \log p(y).
The "ebic" and "aic" model selection criteria can be obtained as follows:
ebic(\lambda) = -2 \ell(\widehat{\Theta}_\lambda) + ( \log n + 4 \gamma \log p) df(\lambda)
aic(\lambda) = -2 \ell(\widehat{\Theta}_\lambda) + 2 df(\lambda)
where df refers to the number of non-zero off-diagonal elements of \hat{\Theta}_\lambda, and \gamma \in [0, 1]. A typical value for ebic.gamma is 1/2, but it can also be tuned by experience. Fixing ebic.gamma = 0 results in bic model selection.
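The two formulas above can be evaluated directly once a log-likelihood and a df value are available for each regularization parameter. The following is a minimal sketch in base R, not the package's internal code: the helper names ebic_score and aic_score and all numeric values are hypothetical, and it assumes the conventional rule that the selected model minimizes the criterion along the path.

```r
# Hypothetical helpers (not part of netgwas) that apply the formulas above.
ebic_score <- function(loglik, df, n, p, gamma = 0.5) {
  stopifnot(gamma >= 0, gamma <= 1)
  -2 * loglik + (log(n) + 4 * gamma * log(p)) * df
}

aic_score <- function(loglik, df) {
  -2 * loglik + 2 * df
}

# Made-up values for a path of five regularization parameters:
# weaker penalties give denser graphs (larger df) and higher log-likelihood.
n <- 100   # sample size
p <- 50    # number of variables
loglik <- c(-5200, -5050, -4980, -4960, -4955)
df     <- c(10, 40, 75, 140, 260)

ebic <- ebic_score(loglik, df, n, p, gamma = 0.5)
bic  <- ebic_score(loglik, df, n, p, gamma = 0)  # gamma = 0 reduces ebic to bic
aic  <- aic_score(loglik, df)

# Index of the minimizing model under each criterion.
selected <- c(ebic = which.min(ebic), bic = which.min(bic), aic = which.min(aic))
print(selected)
```

With these made-up inputs, the per-edge penalty grows from aic to bic to ebic, so the heavier criteria pick earlier (sparser) models on the path.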
Value
An object with S3 class "selectnet" is returned:
opt.adj |
The optimal graph selected from the graph path |
opt.theta |
The optimal precision matrix from the graph path |
opt.sigma |
The optimal covariance matrix from the graph path |
ebic.scores |
Extended BIC scores for regularization parameter selection at the EM convergence. Applicable if |
opt.index |
The index of optimal regularization parameter. |
opt.rho |
The selected regularization parameter. |
par.cor |
A partial correlation matrix. |
V.names |
Names of the variables that are not isolated. |
and anything else that is included in the input netgwas.obj.
Author(s)
Pariya Behrouzi and Ernst C. Wit
Maintainer: Pariya Behrouzi pariya.behrouzi@gmail.com
References
1. Behrouzi, P., and Wit, E. C. (2019). Detecting epistatic selection with partially observed genotype data by using copula graphical models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(1), 141-160.
2. Behrouzi, P., Arends, D., and Wit, E. C. (2023). netgwas: An R Package for Network-Based Genome-Wide Association Studies. The R Journal, 14(4), 18-37.
3. Ibrahim, Joseph G., Hongtu Zhu, and Niansheng Tang. (2012). Model selection criteria for missing-data problems using the EM algorithm. Journal of the American Statistical Association.
4. D. Witten and J. Friedman. (2011). New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear.
5. J. Friedman, T. Hastie and R. Tibshirani. (2007). Sparse inverse covariance estimation with the lasso, Biostatistics.
6. Foygel, R. and M. Drton. (2010). Extended Bayesian information criteria for Gaussian graphical models. In Advances in Neural Information Processing Systems, pp. 604-612.
See Also
Examples
#load the package (assumes netgwas is installed)
library(netgwas)
#simulate data
D <- simgeno(p=50, n=100, k= 3, adjacent = 3, alpha = 0.06 , beta = 0.06)
plot(D)
#explore intra- and inter-chromosomal interactions
out <- netsnp(D$data, n.rho= 5, ncores= 1)
plot(out)
#different graph selection methods
sel.ebic1 <- selectnet(out, criteria = "ebic")
plot(sel.ebic1, vis = "CI")
sel.aic <- selectnet(out, criteria = "aic")
plot(sel.aic, vis = "CI")
sel.bic <- selectnet(out, criteria = "ebic", ebic.gamma = 0)
plot(sel.bic, vis = "CI")