testforDEP {testforDEP} | R Documentation |
Test for dependence between two variables
Description
This function computes the test statistic, p value, and confidence interval for dependence using classic methods (Pearson, Kendall, Spearman) and modern methods (Vexler, Kallenberg, MIC, Hoeffding, and Empirical Likelihood tests).
Usage
testforDEP(x = NA, y = NA, data = NA, test, p.opt = "MC",
num.MC = 10000, BS.CI = 0, rm.na = FALSE, set.seed = FALSE)
Arguments
x |
a numeric vector storing the first variable. |
y |
a numeric vector storing the second variable. |
data |
(Optional) a data frame storing the data to be tested. |
test |
a character string indicating which test to perform. Must be one of {"PEARSON", "KENDALL", "SPEARMAN", "VEXLER", "TS2", "V", "MIC", "HOEFFD", "EL"}. |
p.opt |
a character string specifying whether the p value is obtained from the distribution, by Monte Carlo simulation, or from pre-stored tables. Must be "dist", "MC" or "table". |
num.MC |
a number giving the number of Monte Carlo simulations. |
BS.CI |
a number specifying alpha for the bootstrap confidence interval. When equal to 0, no confidence interval is computed. |
rm.na |
a TRUE/FALSE flag indicating whether to remove missing data (NA) from the input. |
set.seed |
a TRUE/FALSE flag indicating whether to set a seed for Monte Carlo simulation and bootstrap sampling. |
Details
Arguments "x, y" and "data" are two different ways to supply the input. When x or y is missing, data is taken as the input; supplying x, y and data together raises an error. Argument data is a two-column numeric data frame. The order of the columns does not affect the results.
Because the modern test methods "VEXLER", "TS2", "V", "MIC", "HOEFFD", and "EL" have no continuous probability density function, p.opt = "dist" does not apply to them. For the classic methods, argument num.MC is ignored when p.opt is "dist". p.opt = "table" interpolates from pre-stored simulated tables; the current version supports this option only for the "VEXLER", "MIC", "HOEFFD" and "EL" tests.
For Vexler, MIC and EL, computation is more time-consuming, so a warning with the estimated execution time is returned when the input size is > 100. An input size <= 100 is recommended for Monte Carlo p values; for an input size > 100, use "table". num.MC should be an integer between 100 and 10,000 for acceptable computation times.
NA values in the input are not accepted. Set rm.na = TRUE to remove them.
For more details, see Pearson, Kendall, Spearman, Vexler, Kallenberg, MIC, Hoeffding, EL.
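To illustrate the difference between a distribution-based p value and a Monte Carlo p value, here is a minimal base R sketch for Spearman's correlation. It uses a generic permutation scheme; the helper mc_spearman_pvalue and its arguments are hypothetical and are not part of testforDEP, and the package's own simulation may differ.

```r
# Hypothetical helper (NOT part of testforDEP): Monte Carlo p value for
# Spearman's rho under the null hypothesis of independence.
mc_spearman_pvalue <- function(x, y, num_mc = 1000, seed = NULL) {
  stopifnot(length(x) == length(y), num_mc >= 1)
  if (!is.null(seed)) set.seed(seed)
  observed <- abs(cor(x, y, method = "spearman"))
  # Shuffling y breaks any link with x, so each shuffle yields one draw
  # from the null distribution of the statistic.
  simulated <- replicate(
    num_mc,
    abs(cor(x, sample(y), method = "spearman"))
  )
  # Add-one correction keeps the estimate strictly positive.
  (sum(simulated >= observed) + 1) / (num_mc + 1)
}

set.seed(123)
x <- runif(100)
y <- runif(100)

# Simulation-based p value (more simulations: less noise, more time)
p_mc <- mc_spearman_pvalue(x, y, num_mc = 1000, seed = 1)

# Distribution-based p value from base R, for comparison
p_dist <- cor.test(x, y, method = "spearman")$p.value
```

The two values should be of similar size for independent inputs. The simulation cost grows linearly with the number of draws, which is why a bounded num.MC is advisable.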
Value
An S4 object of class "testforDEP_result" with slots for the test statistic (TS), the p value (p_value) and, if applicable, the confidence interval (CI).
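Because the result is an S4 object, its components are read with the @ operator rather than $. The sketch below assumes the testforDEP package is installed and behaves as documented on this page; it is skipped otherwise. The variable names are illustrative.

```r
# Runs only if testforDEP is installed (assumption: the installed
# version matches the interface documented on this page).
if (requireNamespace("testforDEP", quietly = TRUE)) {
  set.seed(123)
  x <- runif(50)
  y <- x + rnorm(50, sd = 0.5)

  res <- testforDEP::testforDEP(x, y, test = "PEARSON", p.opt = "dist")

  print(slotNames(res))  # names of the slots the object carries
  print(res@TS)          # test statistic
  print(res@p_value)     # p value
  print(res@CI)          # list; expected to be empty when BS.CI = 0
}
```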
Author(s)
Jeffrey C. Miecznikowski, En-shuo Hsu, Yanhua Chen, Albert Vexler
See Also
Technical report: http://sphhp.buffalo.edu/content/dam/sphhp/biostatistics/Documents/techreports/UB-Biostatistics-TR1701.pdf
Examples
set.seed(123)
x = runif(100, 0, 1)
y = runif(100, 0, 1)
testforDEP(x, y, test = "SPEARMAN", p.opt = "MC",
num.MC = 10000, BS.CI = 0, set.seed = TRUE)
#An object of class "testforDEP_result"
#Slot "TS":
#[1] 59.54311
#Slot "p_value":
#[1] 0.6735326
#Slot "CI":
#list()
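The Details section describes a second input path through the data argument, together with rm.na for missing values. The following sketch exercises both; it assumes the package is installed and behaves as documented, and the data frame and its column names are made up for illustration.

```r
# Runs only if testforDEP is installed (assumption: behavior matches
# this help page).
if (requireNamespace("testforDEP", quietly = TRUE)) {
  set.seed(123)
  df <- data.frame(a = runif(50), b = runif(50))
  df$a[3] <- NA  # introduce a missing value on purpose

  # Supply the two-column data frame instead of x and y, and ask for
  # missing values to be removed before testing.
  res_df <- testforDEP::testforDEP(data = df, test = "KENDALL",
                                   p.opt = "dist", rm.na = TRUE)
  print(res_df@p_value)
}
```

Passing x, y and data together is documented to raise an error, so only one of the two input styles should be used in a given call.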