rg.test {rgTest} | R Documentation |
Robust graph-based two sample test
Description
Performs robust graph-based two-sample tests (original, generalized, weighted, and max-type edge-count tests).
Usage
rg.test(data.X, data.Y, dis = NULL, E = NULL, n1, n2, k = 5, weigh.fun, perm.num = 0,
test.type = list("ori", "gen", "wei", "max"), progress_bar = FALSE)
Arguments
data.X |
a numeric matrix for observations in sample 1. |
data.Y |
a numeric matrix for observations in sample 2. |
dis |
a distance matrix of the pooled dataset of sample 1 and sample 2. The indices of observations in sample 1 are from 1 to n1 and indices of observations in sample 2 are from 1+n1 to n1+n2 in the pooled dataset. |
E |
an edge matrix representing a similarity graph. Each row represents an edge and records the indices of two ends of an edge in two columns. The indices of observations in sample 1 are from 1 to n1 and indices of observations in sample 2 are from 1+n1 to n1+n2. |
n1 |
number of observations in sample 1. |
n2 |
number of observations in sample 2. |
k |
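the k used to build the K-MST similarity graph, with default 5. Only relevant when the graph is built from data.X and data.Y or from dis. |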
weigh.fun |
weight function that returns the weight of each edge as a function of node degrees. |
perm.num |
number of permutations used to calculate the permutation p-value (default = 0, as in Usage). Use 0 to obtain only the approximate p-value based on asymptotic theory. |
test.type |
type of graph-based test. Must be a list whose elements are chosen from "ori", "gen", "wei", and "max"; the default is list("ori", "gen", "wei", "max"). "ori" refers to the robust original edge-count test, "gen" to the robust generalized edge-count test, "wei" to the robust weighted edge-count test, and "max" to the robust max-type edge-count test. |
progress_bar |
a logical (TRUE or FALSE) indicating whether a progress bar for the permutations should be printed. |
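The layout of E and the role of weigh.fun can be illustrated with a small base R sketch. It builds a toy edge matrix, derives node degrees from it, and applies a hypothetical weight function. The name inv_max_degree and its two-argument form (the degrees of an edge's two endpoints) are assumptions made for illustration and are not taken from the package.

```r
# Toy pooled sample: observations 1..3 belong to sample 1, 4..6 to sample 2.
n1 <- 3
n2 <- 3

# Edge matrix in the layout described for E: one row per edge,
# two columns holding the indices of the edge's two ends.
E <- matrix(c(1, 2,
              2, 3,
              3, 4,
              4, 5,
              5, 6,
              2, 5),
            ncol = 2, byrow = TRUE)

# Node degree = number of edges touching each observation.
deg <- tabulate(as.vector(E), nbins = n1 + n2)

# Hypothetical weight function (an assumption, not the package's own):
# the weight of an edge is the inverse of the larger endpoint degree.
inv_max_degree <- function(deg_a, deg_b) 1 / pmax(deg_a, deg_b)

edge_weights <- inv_max_degree(deg[E[, 1]], deg[E[, 2]])

# An edge connects the two samples when exactly one end has index <= n1.
between <- (E[, 1] <= n1) != (E[, 2] <= n1)
```

Here edges attached to high-degree nodes receive smaller weights, which is one plausible way for a weight that depends on node degrees to damp the influence of hub nodes.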
Details
The input should be one of the following:
datasets of the two samples;
the distance matrix of the pooled dataset;
the edge matrix generated from a similarity graph.
Typical usages are:
rg.test(data.X = data.X, data.Y = data.Y, n1 = n1, n2 = n2, weigh.fun = weigh.fun, ...)
rg.test(dis = dis, n1 = n1, n2 = n2, weigh.fun = weigh.fun, ...)
rg.test(E = E, n1 = n1, n2 = n2, weigh.fun = weigh.fun, ...)
If the data matrices or the distance matrix are used, the similarity graph is generated using K-MST.
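When the distance matrix is supplied directly, the pooled dataset has to follow the index convention given for dis: sample 1 first, then sample 2. A minimal base R sketch is given here, assuming Euclidean distance and a full symmetric matrix; the simulated samples x and y are hypothetical and serve only to show the ordering.

```r
set.seed(1)

# Hypothetical samples: rows are observations, columns are variables.
x <- matrix(rnorm(20 * 3), nrow = 20)            # sample 1
y <- matrix(rnorm(15 * 3, mean = 1), nrow = 15)  # sample 2

n1 <- nrow(x)
n2 <- nrow(y)

# Stack sample 1 on top of sample 2 so that rows 1..n1 are sample 1
# and rows (n1 + 1)..(n1 + n2) are sample 2.
pooled <- rbind(x, y)

# Pairwise Euclidean distances as a full (n1 + n2) x (n1 + n2) matrix.
dis <- as.matrix(dist(pooled))
```

A matrix built this way could then be passed as dis = dis together with n1 and n2. Any other distance can be substituted as long as the row and column order is preserved.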
Value
A list containing the following components:
asy.ori.statistic |
the asymptotic test statistic of the robust original graph-based test. |
asy.ori.pval |
the asymptotic p-value of the robust original graph-based test. |
asy.gen.statistic |
the asymptotic test statistic of the robust generalized graph-based test. |
asy.gen.pval |
the asymptotic p-value of the robust generalized graph-based test. |
asy.wei.statistic |
the asymptotic test statistic of the robust weighted graph-based test. |
asy.wei.pval |
the asymptotic p-value of the robust weighted graph-based test. |
asy.max.statistic |
the asymptotic test statistic of the robust max-type graph-based test. |
asy.max.pval |
the asymptotic p-value of the robust max-type graph-based test. |
perm.ori.pval |
the permutation p-value of the robust original graph-based test. |
perm.gen.pval |
the permutation p-value of the robust generalized graph-based test. |
perm.wei.pval |
the permutation p-value of the robust weighted graph-based test. |
perm.max.pval |
the permutation p-value of the robust max-type graph-based test. |
Examples
## Simulated from Student's t-distribution.
## Observations for the two samples are from different distributions.
data(example0)
data = as.matrix(example0$data) # pooled dataset
label = example0$label # label of observations
s1 = data[label == 'sample 1', ] # sample 1
s2 = data[label == 'sample 2', ] # sample 2
num1 = nrow(s1) # number of observations in sample 1
num2 = nrow(s2) # number of observations in sample 2
## Graph-based two sample test using data as input
rg.test(data.X = s1, data.Y = s2, n1 = num1, n2 = num2, k = 5, weigh.fun = weiMax, perm.num = 0)
## Graph-based two sample test using distance matrix as input
dist = example0$distance
rg.test(dis = dist, n1 = num1, n2 = num2, k = 5, weigh.fun = weiMax, perm.num = 0)
## Graph-based two sample test using edge matrix of the similarity graph as input
E = example0$edge
rg.test(E = E, n1 = num1, n2 = num2, weigh.fun = weiMax, perm.num = 0)