difconet.build.controlled.dataset {difconet} | R Documentation |
GENERATES A DATASET CONTROLLING FOR NOISE AND GENES CONNECTED IN NETWORKS
Description
This function takes a normal dataset and generate simulated tumor stages by adding progressive levels of noise. It may add artificial networks of genes connected at given correlations that can progressively increase or decrease their level of correlation.
Usage
difconet.build.controlled.dataset(data,
noise.genes = round(nrow(data)*0.1),
noise.sigma = c(0.0, 0.1, 0.2),
nonoise.sigma = c(0.0, 0.01, 0.01),
netcov = matrix(c(
0.90, 0.90, 0.75, 0.75, 0.60, 0.60, 0.45, 0.45, 0.30, 0.30,
0.15, 0.15, 0.30, 0.30, 0.45, 0.45, 0.60, 0.60, 0.75, 0.75,
0.95, 0.95, 0.80, 0.80, 0.65, 0.65, 0.50, 0.50, 0.35, 0.35,
0.10, 0.10, 0.25, 0.25, 0.40, 0.40, 0.55, 0.55, 0.70, 0.70,
1.00, 1.00, 0.85, 0.85, 0.70, 0.70, 0.55, 0.55, 0.40, 0.40,
0.05, 0.05, 0.20, 0.20, 0.35, 0.35, 0.50, 0.50, 0.65, 0.65
), ncol=3),
genes.nets = 10,
corfunc=function(a,b) cor(a,b,method="spearman"),
verbose = TRUE)
Arguments
data |
data.frame or matrix representing the normal dataset. Rows are genes and columns are samples. |
noise.genes |
the number of genes from data that will noised. |
noise.sigma |
Levels of gaussian noise to be added (at zero mean) expressed in a cumulative manner. |
nonoise.sigma |
Levels of gaussian noise to be added (at zero mean) for the rest of the genes. |
netcov |
numeric matrix of correlation levels for networks, rows represent networks and columns represent stages. |
genes.nets |
The number of genes in each generated network. |
corfunc |
Correlation method used. |
verbose |
Print progress. |
Details
This function generates a simulated tumor progression dataset based on normal data. The progression is done by stages. The number of stages is given by the length of noise.sigma. Each stage will have the same dimensions than data (plus the networks). The stages will be N, T1, T2, and so on. The N is meant to be the data itself with no noise but for generality, the first element of noise.sigma specifies the level of noise for N (default to 0). The next values of noise.sigma will be used to generate T1, T2, and so on. Thus the returned data will be estimated by N=data+noise.sigma[1], T1=N+noise.sigma[2], T2=T1+noise.sigma[3], and so on. Note that noise.sigma will be added only to a specific number of rows given by noise.genes. The value returned is a list of the generated matrices. In top of that, the nonoise.sigma specify the level of noise added to those genes not selected to be noised. This is meant to be lower levels of noise than noise.sigma to avoid that data in stages is just a copy of previous data. This function also adds full connected networks of genes connected at netcov levels. The data added has mean=0 and sd=1. The number of rows represent the networks added. The columns represent the stages.
Value
List of stages.
Author(s)
Elpidio Gonzalez and Victor Trevino vtrevino@itesm.mx
References
Gonzalez-Valbuena and Trevino 2017 Metrics to Estimate Differential Co-Expression Networks Journal Pending volume 00–10
See Also
difconet.noise.inspection
.
difconet.run
.
Examples
## Not run: difconet.noise.inspection(normaldata, tumordata, sigma=0:15/10)