simulate_tables {FunChisq} | R Documentation |
Simulate Noisy Contingency Tables to Represent Diverse Discrete Patterns
Description
Generate random contingency tables representing various functional, non-functional, dependent, or independent patterns, without specifying a parametric model for the patterns.
Usage
simulate_tables(
n = 100, nrow = 3, ncol = 3,
type = c("functional", "many.to.one",
"discontinuous", "independent",
"dependent.non.functional"),
n.tables = 1,
row.marginal = NULL,
col.marginal = NULL,
noise = 0.0, noise.model = c("house", "candle"),
margin = 0
)
Arguments
n |
a positive integer specifying the sample size to be distributed in each table. For |
nrow |
a positive integer specifying the number of rows in each table. The value must be no less than 2. For |
ncol |
a positive integer specifying the number of columns in each table. |
type |
a character string to specify the type of pattern underlying the table. The options are |
n.tables |
a positive integer value specifying the number of tables to be generated. |
row.marginal |
a non-negative numeric vector of length |
col.marginal |
a non-negative numeric vector of length |
noise |
a numeric value between 0 and 1 specifying the noise level to be added to a table using function |
noise.model |
a character string indicating the noise model of either |
margin |
a numeric value of either 0, 1 or 2. Default is 0.
0: noise is applied along both rows and columns.
1: noise is applied along each row.
2: noise is applied along each column.
See |
Details
This function generates five types of table representing different interaction patterns between row and column discrete random variables X and Y. Three of the five types are non-constant functional patterns (Y is a non-constant function of X):
type="functional": Y is a function of X, but X may or may not be a function of Y.
type="many.to.one": Y is a many-to-one function of X, but X is not a function of Y.
type="discontinuous": Y is a function of X, where the function value at each value of X must differ from the function values at its neighbors. X may or may not be a function of Y. A discontinuous function forms a contrast with those that are close to constant functions.
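The distinctions among the three functional types can be checked directly on a binary pattern table. The following is a minimal base R sketch, not code from FunChisq: the helper names `y_is_function_of_x`, `x_is_function_of_y`, and `is_discontinuous` are hypothetical, the pattern matrix is hand-written, and the discontinuity check assumes one reading of the neighbor condition (adjacent rows of X map to different columns of Y).

```r
# Rows index X, columns index Y; a non-zero entry marks a populated cell.

# Y is a function of X when every row has exactly one populated column.
y_is_function_of_x <- function(pattern) all(rowSums(pattern != 0) == 1)

# X is a function of Y when no column is populated by more than one row.
x_is_function_of_y <- function(pattern) all(colSums(pattern != 0) <= 1)

# Assumed reading of "discontinuous": adjacent rows map to different columns.
is_discontinuous <- function(pattern) {
  stopifnot(y_is_function_of_x(pattern))
  f <- apply(pattern, 1, which.max)  # column selected by each row
  all(diff(f) != 0)
}

# Hand-written 4 x 3 pattern (hypothetical, for illustration only).
pattern <- matrix(c(1, 0, 0,
                    0, 1, 0,
                    1, 0, 0,
                    0, 0, 1), nrow = 4, byrow = TRUE)

y_is_function_of_x(pattern)  # TRUE
x_is_function_of_y(pattern)  # FALSE: rows 1 and 3 share column 1 (many-to-one)
is_discontinuous(pattern)    # TRUE
```

A table for which both direction checks return FALSE would be a candidate for the non-functional types described next.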
The fourth type "dependent.non.functional" represents non-functional patterns where X and Y are statistically dependent but are not functions of each other. The samples are distributed according to row.marginal probabilities.
The fifth type "independent" represents patterns where X and Y are statistically independent, that is, their joint probability mass function is the product of their marginal probability mass functions.
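The product rule for the independent type can be illustrated in base R. This is a sketch under stated assumptions, not necessarily how simulate_tables samples internally: the marginals are the ones used in Example 6, and the seed is arbitrary.

```r
set.seed(1)  # arbitrary seed, only to make the sketch reproducible

row.marginal <- c(0.4, 0.3, 0.1, 0.2)
col.marginal <- c(0.1, 0.2, 0.4, 0.2, 0.1)

# Under independence, P(X = i, Y = j) = P(X = i) * P(Y = j).
joint <- outer(row.marginal, col.marginal)

# Distribute n = 100 samples over the 4 x 5 cells according to the joint pmf.
# rmultinom() returns the cell counts in column-major order, matching joint.
counts <- matrix(rmultinom(1, size = 100, prob = as.vector(joint)),
                 nrow = length(row.marginal))

rowSums(joint)  # recovers row.marginal
colSums(joint)  # recovers col.marginal
```

Because the joint distribution factorizes, both marginals can be enforced at once, which is not generally true for the functional types.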
For all functional tables (type="functional", type="many.to.one", type="discontinuous"), the samples are distributed using either the given row or column marginal probabilities. Theoretically, it is not always possible to enforce both marginals in a functional pattern. If both marginals are provided, one will be randomly selected to generate a table; about half of the time each requested marginal is used. If neither is provided, either a uniform row marginal or a uniform column marginal will be randomly selected to generate a table; half of the time a table will have a uniform row marginal and the other half a uniform column marginal.
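One way to see why both marginals cannot always be enforced: once a function and a row marginal are fixed, the column marginal is already determined. The base R sketch below uses a hypothetical mapping `f` from four rows of X to three columns of Y; it illustrates the constraint and is not the package's sampling code.

```r
# Hypothetical function: row i of X maps to column f[i] of Y.
f <- c(1, 2, 1, 3)
row.marginal <- c(0.3, 0.2, 0.3, 0.2)

# P(Y = j) is the sum of P(X = i) over all rows i with f(i) = j.
implied.col.marginal <- as.vector(
  tapply(row.marginal, factor(f, levels = 1:3), sum)
)

implied.col.marginal  # 0.6 0.2 0.2
```

Any requested column marginal other than the implied one would conflict with this row marginal under the same function, so only one of the two can be honored.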
Random noise can be optionally applied to the tables using either the house or the candle noise model. See add.noise for details.
Sharma et al. (2017) provide full mathematical and statistical details of the simulation strategies for the above table types except the "discontinuous" type, which was introduced after the publication.
Value
A list containing the following components:
pattern.list |
a list of tables containing binary patterns in 0's and 1's. Each table is created by setting all non-zero entries in the corresponding sampled contingency table from |
sample.list |
a list of tables satisfying both the mathematical and statistical requirements. These tables are noise free. |
noise.list |
a list of tables after applying noise to the corresponding tables in |
pvalue.list |
a list of p-values reporting the statistical significance of the generated tables for the required type. When the pattern type specifies a functional relationship, the p-values are computed by the functional chi-square test (Zhang and Song 2013); otherwise, Pearson's chi-square test of independence is used to calculate the p-value. |
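For the non-functional types, Pearson's chi-square test of independence is available in base R as `chisq.test`. The sketch below applies it to a hand-written table with hypothetical counts (not output of simulate_tables). The functional chi-square test is provided by FunChisq and is not reproduced here.

```r
# Hypothetical 3 x 3 contingency table with counts concentrated on the diagonal.
tbl <- matrix(c(20,  5,  5,
                 5, 20,  5,
                 5,  5, 20), nrow = 3, byrow = TRUE)

# Pearson's chi-square test of independence between rows and columns.
result <- chisq.test(tbl)

result$statistic  # chi-square statistic
result$parameter  # degrees of freedom: (3 - 1) * (3 - 1) = 4
result$p.value    # small p-value indicates departure from independence
```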
Author(s)
Ruby Sharma, Sajal Kumar, Hua Zhong, and Joe Song
References
Sharma R, Kumar S, Zhong H, Song M (2017).
“Simulating noisy, nonparametric, and multivariate discrete patterns.”
The R Journal, 9(2), 366–377.
doi:10.32614/RJ-2017-053.
Zhang Y, Liu ZL, Song M (2015).
“ChiNet uncovers rewired transcription subnetworks in tolerant yeast for advanced biofuels conversion.”
Nucleic Acids Research, 43(9), 4393–4407.
doi:10.1093/nar/gkv358.
Zhang Y, Song M (2013).
“Deciphering interactions in causal networks without parametric assumptions.”
arXiv Molecular Networks, arXiv:1311.2707.
https://arxiv.org/abs/1311.2707.
See Also
add.noise for details of the noise model.
Examples
# In all examples, x is the row variable and y is the column
# variable of a table.
# Example 1. Simulating a noisy function where y=f(x),
# x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.2, n.tables = 1,
row.marginal = c(0.3,0.2,0.3,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 1. Functional pattern")
plot_table(tbls$sample.list[[1]], main="Ex 1. Sampled pattern (noise free)")
plot_table(tbls$noise.list[[1]], main="Ex 1. Sampled pattern with 0.2 noise")
plot.new()
# Example 2. Simulating a noisy functional pattern where
# y=f(x), x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.5, n.tables = 1,
row.marginal = c(0.3,0.2,0.3,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 2. Functional pattern", col="seagreen2")
plot_table(tbls$sample.list[[1]], main="Ex 2. Sampled pattern (noise free)", col="seagreen2")
plot_table(tbls$noise.list[[1]], main="Ex 2. Sampled pattern with 0.5 noise", col="seagreen2")
plot.new()
# Example 3. Simulating a noisy many.to.one function where
# y=f(x), x!=f(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="many.to.one",
noise=0.2, n.tables = 1,
row.marginal = c(0.4,0.3,0.1,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 3. Many-to-one pattern", col="limegreen")
plot_table(tbls$sample.list[[1]], main="Ex 3. Sampled pattern (noise free)", col="limegreen")
plot_table(tbls$noise.list[[1]], main="Ex 3. Sampled pattern with 0.2 noise", col="limegreen")
plot.new()
# Example 4. Simulating noisy discontinuous
# pattern where y=f(x), x may or may not be g(y) with given row.marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=5,
type="discontinuous", noise=0.2,
n.tables = 1, row.marginal = c(0.2,0.4,0.2,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 4. Discontinuous pattern", col="springgreen3")
plot_table(tbls$sample.list[[1]], main="Ex 4. Sampled pattern (noise free)", col="springgreen3")
plot_table(tbls$noise.list[[1]], main="Ex 4. Sampled pattern with 0.2 noise", col="springgreen3")
plot.new()
# Example 5. Simulating noisy dependent.non.functional
# pattern where y!=f(x) and x and y are statistically
# dependent.
tbls <- simulate_tables(n=100, nrow=4, ncol=5,
type="dependent.non.functional", noise=0.3,
n.tables = 1, row.marginal = c(0.2,0.4,0.2,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 5. Dependent.non.functional pattern",
col="sienna2", highlight="none")
plot_table(tbls$sample.list[[1]], main="Ex 5. Sampled pattern (noise free)",
col="sienna2", highlight="none")
plot_table(tbls$noise.list[[1]], main="Ex 5. Sampled pattern with 0.3 noise",
col="sienna2", highlight="none")
plot.new()
# Example 6. Simulating a pattern where x and y are
# statistically independent.
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="independent",
noise=0.3, n.tables = 1,
row.marginal = c(0.4,0.3,0.1,0.2),
col.marginal = c(0.1,0.2,0.4,0.2,0.1))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 6. Independent pattern",
col="cornflowerblue", highlight="none")
plot_table(tbls$sample.list[[1]], main="Ex 6. Sampled pattern (noise free)",
col="cornflowerblue", highlight="none")
plot_table(tbls$noise.list[[1]], main="Ex 6. Sampled pattern with 0.3 noise",
col="cornflowerblue", highlight="none")
plot.new()
# Example 7. Simulating a noisy function where y=f(x),
# x may or may not be g(y), with given column marginal
tbls <- simulate_tables(n=100, nrow=4, ncol=5, type="functional",
noise=0.2, n.tables = 1,
col.marginal = c(0.2,0.1,0.4,0.2,0.1))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 7. Functional pattern")
plot_table(tbls$sample.list[[1]], main="Ex 7. Sampled pattern (noise free)")
plot_table(tbls$noise.list[[1]], main="Ex 7. Sampled pattern with 0.2 noise")
plot.new()
# Example 8. Simulating a noisy many.to.one function where
# y=f(x), x!=f(y) with given column marginal.
tbls <- simulate_tables(n=100, nrow=4, ncol=4, type="many.to.one",
noise=0.2, n.tables = 1,
col.marginal = c(0.4,0.3,0.1,0.2))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 8. Many-to-one pattern", col="limegreen")
plot_table(tbls$sample.list[[1]], main="Ex 8. Sampled pattern (noise free)", col="limegreen")
plot_table(tbls$noise.list[[1]], main="Ex 8. Sampled pattern with 0.2 noise", col="limegreen")
plot.new()
# Example 9. Simulating noisy discontinuous
# pattern where y=f(x), x may or may not be g(y) with given column marginal
tbls <- simulate_tables(n=100, nrow=4, ncol=4,
type="discontinuous", noise=0.2,
n.tables = 1, col.marginal = c(0.1,0.4,0.2,0.3))
par(mfrow=c(2,2))
plot_table(tbls$pattern.list[[1]], main="Ex 9. Discontinuous pattern", col="springgreen3")
plot_table(tbls$sample.list[[1]], main="Ex 9. Sampled pattern (noise free)", col="springgreen3")
plot_table(tbls$noise.list[[1]], main="Ex 9. Sampled pattern with 0.2 noise", col="springgreen3")
plot.new()