simgeno {netgwas} | R Documentation |
Generate genotype data based on Gaussian copula
Description
Generates discrete ordinal data based on an underlying "genome-like" graph structure. The simulation procedure relies on a continuous variable, which can be simulated from either a multivariate normal distribution or a multivariate t-distribution with d degrees of freedom.
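The idea of deriving ordinal data from a latent continuous variable can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the correlation matrix `Sigma`, the helper `discretize`, and the choice of equally spaced quantile cutpoints are all hypothetical.

```r
# Sketch only: base R, not the netgwas implementation.
set.seed(1)
n <- 200  # observations
p <- 5    # variables
k <- 3    # ordinal categories, coded 0, 1, ..., k - 1

# Hypothetical AR(1)-style correlation matrix for the latent variables.
Sigma <- 0.6^abs(outer(seq_len(p), seq_len(p), "-"))

# Latent multivariate normal draws: the rows of Z have covariance Sigma.
Z <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# Cut each column at its empirical quantiles into k ordered levels
# (assumption: equally spaced quantiles; other cutpoints are possible).
discretize <- function(z, k) {
  cuts <- quantile(z, probs = seq(0, 1, length.out = k + 1))
  as.integer(cut(z, breaks = cuts, include.lowest = TRUE)) - 1L
}
Y <- apply(Z, 2, discretize, k = k)

table(Y[, 1])  # counts per category for the first variable
```

The dependence between columns of `Y` is inherited from `Sigma`, while the marginal category frequencies are set by the cutpoints.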
Usage
simgeno(p = 90, n = 200, k = NULL, g = NULL, adjacent = NULL, alpha = NULL,
        beta = NULL, con.dist = "Mnorm", d = NULL, vis = TRUE)
Arguments
p |
The number of variables. The default value is 90. |
n |
The sample size (number of observations). The default value is 200. |
k |
The number of states (categories). The default value is 3. |
g |
The number of groups (chromosomes) in the graph. The default value is about |
adjacent |
The number of adjacent variable(s) to be linked to a variable. For example, if |
alpha |
A probability that a pair of non-adjacent variables in the same group is given an edge. The default value is 0.01. |
beta |
A probability that variables in different groups are linked with an edge. The default value is 0.02. |
con.dist |
The distribution of underlying continuous variable. If |
d |
The degrees of freedom of the continuous variable, only applicable when
|
vis |
Visualize the graph pattern and the adjacency matrix of the true graph structure. The default value is TRUE. |
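For the multivariate t case, one standard construction divides multivariate normal draws by the square root of an independent scaled chi-squared variable with d degrees of freedom. The sketch below uses base R only; the scale matrix `Sigma` and all variable names are hypothetical, and it is not claimed to be how simgeno draws its latent variable.

```r
# Sketch only: a standard multivariate t construction in base R.
set.seed(2)
n <- 200  # observations
p <- 4    # variables
d <- 5    # degrees of freedom

Sigma <- diag(p)  # hypothetical scale matrix

# Multivariate normal draws with covariance Sigma.
Z <- matrix(rnorm(n * p), n, p) %*% chol(Sigma)

# One chi-squared scaling factor per observation (row).
w <- rchisq(n, df = d) / d

# Recycling runs down the columns, so row i is divided by sqrt(w[i]).
Tmat <- Z / sqrt(w)
```

Smaller values of `d` give heavier tails; as `d` grows, the draws approach the multivariate normal case.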
Details
The graph pattern is generated as below:
genome-like: The p variables are evenly partitioned into g disjoint groups; adjacent variables within each group are linked via an edge. With probability alpha, a pair of non-adjacent variables in the same group is given an edge. Variables in different groups are linked by an edge with probability beta.
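The graph pattern described above can be sketched as a small adjacency-matrix builder in base R. This is an illustration under assumptions, not the package's code: it assumes p is divisible by g, treats only immediate neighbours as "adjacent", and uses hypothetical variable names.

```r
# Sketch only: genome-like adjacency matrix in base R.
set.seed(3)
p <- 12       # variables
g <- 3        # groups (assumption: p is divisible by g)
alpha <- 0.1  # edge probability, non-adjacent pair in the same group
beta <- 0.05  # edge probability, pair in different groups

# Evenly assign variables to g consecutive groups.
groups <- rep(seq_len(g), each = p / g)

adj <- matrix(0L, p, p)
for (i in seq_len(p - 1)) {
  for (j in (i + 1):p) {
    same <- groups[i] == groups[j]
    if (same && j - i == 1) {
      link <- TRUE              # neighbours within a group are always linked
    } else if (same) {
      link <- runif(1) < alpha  # non-adjacent pair, same group
    } else {
      link <- runif(1) < beta   # pair in different groups
    }
    if (link) adj[i, j] <- adj[j, i] <- 1L
  }
}

sum(adj) / 2  # number of edges in the simulated graph
```

Raising `alpha` or `beta` adds edges within or between groups, respectively, on top of the fixed chain of neighbouring variables.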
Value
An object with S3 class "simgeno" is returned:
data |
The generated data as an |
Theta |
A |
adj |
A |
Sigma |
A |
n.groups |
The number of groups. |
groups |
A vector indicating the group to which each variable belongs. |
sparsity |
The sparsity level of the true graph. |
Author(s)
Pariya Behrouzi and Ernst C. Wit
Maintainer: Pariya Behrouzi <pariya.behrouzi@gmail.com>
References
1. Behrouzi, P., Arends, D., and Wit, E. C. (2023). netgwas: An R Package for Network-Based Genome-Wide Association Studies. The R Journal, 14(4), 18-37.
2. Behrouzi, P., and Wit, E. C. (2019). Detecting epistatic selection with partially observed genotype data by using copula graphical models. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(1), 141-160.
See Also
netsnp and netgwas-package
Examples
# Genome-like graph structure
sim1 <- simgeno(alpha = 0.01, beta = 0.02)
plot(sim1)

# Genome-like graph structure with more edges between variables
# in the same or different groups
sim2 <- simgeno(adjacent = 3, alpha = 0.02, beta = 0.03)
plot(sim2)

# Simulate data
D <- simgeno(p = 50, n = 100, g = 5, k = 3, adjacent = 3, alpha = 0.06, beta = 0.08)
plot(D)

# Reconstruct the intra- and inter-chromosomal conditional interactions (LD) network
out <- netsnp(data = D$data, n.rho = 4, ncores = 1)
plot(out)

# Select an optimal graph
sel <- selectnet(out)
plot(sel, vis = "CI")