netphenogeno {netgwas} | R Documentation |
Reconstructs conditional dependence network among genetic loci and phenotypes
Description
This is one of the main functions of the netgwas package. It reconstructs a conditional independence network between genotypes and phenotypes for diploids and polyploids. Three methods are available within the Gaussian copula graphical model: (i) Gibbs sampling, (ii) an approximation method, and (iii) a nonparanormal approach. The first two can handle missing genotypes; the third is computationally faster.
Usage
netphenogeno(data, method = "gibbs", rho = NULL, n.rho = NULL, rho.ratio = NULL,
ncores = 1, em.iter = 5, em.tol=.001, verbose = TRUE)
Arguments
data |
An ( |
method |
The method used to reconstruct the genotype-phenotype interaction network or the genotype-phenotype-environment interaction network: "gibbs", "approx", or "npn". We recommend "gibbs" for a medium number of variables (~500) and "approx" for a large number of variables. For a very large number of variables (> 2000), "npn" is computationally efficient. The default method is "gibbs". |
rho |
A decreasing sequence of non-negative numbers that control the sparsity level. Leaving the input as |
n.rho |
The number of regularization parameters. The default value is |
rho.ratio |
Determines distance between the elements of |
ncores |
The number of cores to use for the calculations. Using |
em.iter |
The number of EM iterations. The default value is 5. |
em.tol |
The convergence criterion for stopping the EM iterations. The default value is .001. |
verbose |
If enabled, prints detailed messages for tracing the output. The default value is |
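The guidance on choosing method can be sketched as a small helper. This is only an illustration: choose_method is a hypothetical function, not part of netgwas, and the exact cutoff between "medium" and "large" (500 here) is an assumption, since the text above gives only approximate sizes.

```r
# Hypothetical helper (not part of netgwas): suggest a value for `method`
# from the number of variables p (markers + phenotypes + covariates).
# The cutoffs are assumptions read off the approximate guidance above.
choose_method <- function(p, medium = 500, very_large = 2000) {
  stopifnot(is.numeric(p), length(p) == 1, p >= 1)
  if (p > very_large) {
    "npn"      # very large problems: fastest option
  } else if (p > medium) {
    "approx"   # large problems
  } else {
    "gibbs"    # up to a medium number of variables (also the default)
  }
}

# Suggested method for a few problem sizes
suggested <- sapply(c(150, 500, 1200, 5000), choose_method)
print(suggested)
```

The result could then be passed on, for example as netphenogeno(data = dat, method = choose_method(ncol(dat))), where dat is a placeholder for your own data.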
Details
This function reconstructs both genotype-phenotype interaction networks and genotype-phenotype-environment interaction networks. In a genotype-phenotype network, nodes are either markers or phenotypes; a phenotype is connected by an edge to a marker if there is a direct association between them given the rest of the variables. Different phenotypes may also be connected to each other. In addition to marker and phenotype information, the input data can include environmental variables. In that case, the interaction network shows the conditional dependence relationships among markers, phenotypes, and environmental factors.
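The notion of a "direct association given the rest of the variables" can be illustrated with a toy Gaussian example in base R. This is a minimal sketch, not the netgwas estimation procedure: the three-node precision matrix and the node names (snp1, pheno1, pheno2) are made up for illustration. A zero entry in the precision matrix means the two variables are conditionally independent, even if they are marginally correlated.

```r
# Toy precision matrix for a chain: snp1 -- pheno1 -- pheno2
# (illustrative values only, not estimated from data)
nodes <- c("snp1", "pheno1", "pheno2")
Theta <- matrix(c( 2, -1,  0,
                  -1,  2, -1,
                   0, -1,  2),
                nrow = 3, byrow = TRUE, dimnames = list(nodes, nodes))

# Implied covariance and marginal correlations
Sigma <- solve(Theta)
marg  <- cov2cor(Sigma)

# Partial correlations: -theta_ij / sqrt(theta_ii * theta_jj)
pcor <- -cov2cor(Theta)
diag(pcor) <- 1

# Edges of the conditional dependence graph: nonzero partial correlations
adj <- (abs(pcor) > 1e-8) * 1
diag(adj) <- 0

print(round(marg, 3))  # snp1 and pheno2 are marginally correlated
print(round(pcor, 3))  # but their partial correlation is zero
print(adj)             # so the graph has no snp1 -- pheno2 edge
```

Here snp1 is linked to pheno2 only through pheno1, so the network contains the edges snp1 -- pheno1 and pheno1 -- pheno2 but no direct edge between snp1 and pheno2.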
Value
An object with S3 class "netgwas" is returned:
Theta |
A list of estimated p by p precision matrices that show the conditional independence relationship patterns among the measured items. |
path |
A list of estimated p by p adjacency matrices. This is the graph path corresponding to |
ES |
A list of estimated p by p conditional expectation corresponding to |
Z |
A list of n by p transformed data based on Gaussian copula. |
rho |
A |
loglik |
A |
data |
The |
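The relationship between a list of precision matrices and a list of adjacency matrices can be sketched in base R. This is an assumption-laden illustration: the page does not state how netgwas derives path from Theta, so the thresholding rule, the helper to_adjacency, and the toy matrices below are all hypothetical.

```r
# Hypothetical helper: turn one precision matrix into a 0/1 adjacency matrix
# by marking off-diagonal entries that are numerically nonzero.
to_adjacency <- function(theta, tol = 1e-8) {
  adj <- (abs(theta) > tol) * 1
  diag(adj) <- 0
  adj
}

# Toy "path" of precision matrices, ordered from strong to weak penalty
Theta_list <- list(
  diag(3),                                                     # empty graph
  matrix(c(2, -1, 0,  -1, 2, 0,   0, 0, 1), 3, byrow = TRUE),  # one edge
  matrix(c(2, -1, 0,  -1, 2, -1,  0, -1, 2), 3, byrow = TRUE)  # two edges
)

# One adjacency matrix per precision matrix
path_list <- lapply(Theta_list, to_adjacency)

# Number of edges in each graph along the path
n_edges <- sapply(path_list, function(a) sum(a[upper.tri(a)]))
print(n_edges)
```

In this toy path the graphs become denser as the penalty weakens, which is the behaviour one would expect along a decreasing sequence of sparsity parameters.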
Note
This function estimates a graph path. To select an optimal graph, please refer to selectnet.
Author(s)
Pariya Behrouzi and Ernst C. Wit
Maintainers: Pariya Behrouzi pariya.behrouzi@gmail.com
References
1. Behrouzi, P., and Wit, E. C. (2019). Detecting epistatic selection with partially observed genotype data by using copula graphical models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(1), 141-160.
2. Behrouzi, P., and Wit, E. C. (2017c). netgwas: An R Package for Network-Based Genome-Wide Association Studies. arXiv preprint, arXiv:1710.01236.
3. Witten, D., and Friedman, J. (2011). New insights and faster computations for the graphical lasso. Journal of Computational and Graphical Statistics, to appear.
4. Guo, J., et al. (2015). Graphical models for ordinal data. Journal of Computational and Graphical Statistics, 24(1), 183-204.
See Also
Examples
data(thaliana)
head(thaliana, n=3)
#Construct a path for genotype-phenotype interactions network in thaliana data
res <- netphenogeno(data = thaliana); res
plot(res)
#Select an optimal network
sel <- selectnet(res)
#Plot selected network and the conditional independence (CI) relationships
plot(sel, vis="CI")
plot(sel, vis="CI", n.mem = c(8, 56, 31, 33, 31, 30), w.btw =50, w.within= 1)
#Visualize interactive plot for the selected network
#Color "red" for 8 phenotypes, and different colors for each chromosome.
cl <- c(rep("red", 8), rep("white", 56), rep("tan1", 31),
        rep("gray", 33), rep("lightblue2", 31), rep("salmon2", 30))
#The IDs of phenotypes and SNPs to be shown in the network
id <- c("DTF_LD","CLN_LD","RLN_LD","TLN_LD","DTF_SD","CLN_SD","RLN_SD",
        "TLN_SD","snp15","snp16","snp17","snp49","snp50","snp60","snp75",
        "snp76","snp81","snp83","snp84","snp86","snp82", "snp113","snp150",
        "snp155","snp159","snp156","snp161","snp158","snp160","snp162","snp181")
plot(sel, vis = "interactive", n.mem = c(8, 56, 31, 33, 31, 30), vertex.color = cl,
     label.vertex = "some", sel.nod.label = id, edge.color = "gray", w.btw = 50,
     w.within = 1)
#Partial correlations between genotypes and phenotypes in the thaliana dataset.
library(Matrix)
image(sel$par.cor, xlab="geno-pheno", ylab="geno-pheno", sub="")
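The group sizes passed as n.mem and the colour vector cl in the example above must describe the same variables in the same order. A minimal base R sketch of building the colours directly from the group sizes, so the two cannot drift apart (the names group_colors and cl2 are introduced here for illustration only):

```r
# Group sizes used in the example: 8 phenotypes, then the SNPs per chromosome
n.mem <- c(8, 56, 31, 33, 31, 30)

# One colour per group, in the same order as n.mem
group_colors <- c("red", "white", "tan1", "gray", "lightblue2", "salmon2")

# Repeat each colour once per member of its group
cl2 <- rep(group_colors, times = n.mem)

# One colour per variable: the length must equal the total number of nodes
stopifnot(length(cl2) == sum(n.mem))
print(table(cl2)[group_colors])
```

The resulting vector is identical to the cl built with repeated rep() calls in the example, and changing a group size in n.mem updates the colours automatically.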