testforDEP {testforDEP}    R Documentation

Test for dependence between two variables

Description

This function computes the test statistic, p-value, and confidence interval for a test of dependence between two variables. Supported tests include the classic methods (Pearson, Kendall, Spearman) and the modern methods (Vexler, Kallenberg, MIC, Hoeffding, and empirical likelihood).

Usage

testforDEP(x = NA, y = NA, data = NA, test, p.opt = "MC",
  num.MC = 10000, BS.CI = 0, rm.na = FALSE, set.seed = FALSE)

Arguments

x

a numeric vector storing the first variable.

y

a numeric vector storing the second variable.

data

(Optional) a data frame storing the data to be tested.

test

a character string indicating which test to perform. Must be one of {"PEARSON", "KENDALL", "SPEARMAN", "VEXLER", "TS2", "V", "MIC", "HOEFFD", "EL"}.

p.opt

a character string specifying how the p-value is obtained: from the distribution ("dist"), by Monte Carlo simulation ("MC"), or by interpolation from pre-stored tables ("table").

num.MC

a number giving the number of Monte Carlo simulations.

BS.CI

a number specifying alpha for the bootstrap confidence interval. When equal to 0, the confidence interval is not computed.

rm.na

a TRUE/FALSE flag indicating whether to remove missing data (NA) from the input.

set.seed

a TRUE/FALSE flag indicating whether to set a seed for Monte Carlo simulation and bootstrap sampling.

Details

The arguments "x, y" and "data" are two alternative ways to supply the input. When x or y is missing, data is taken as the input; supplying x, y and data together leads to an error. The argument data must be a two-column numeric data frame. The order of the columns does not affect the results.

Because the modern test methods "VEXLER", "TS2", "V", "MIC", "HOEFFD", and "EL" have no continuous probability density function, p.opt = "dist" does not apply to them. For the classic methods, when p.opt is "dist", the argument num.MC is ignored.

p.opt = "table" uses interpolation from pre-stored simulated tables. In the current version, this option is supported only for the "VEXLER", "MIC", "HOEFFD" and "EL" tests.

For Vexler, MIC and EL, the computation is more time-consuming, so a warning with the estimated execution time is returned when the input size is > 100. An input size <= 100 is recommended for the Monte Carlo p-value; for an input size > 100, use the table option. num.MC should be an integer between 100 and 10,000 for acceptable computation times.

NA values in the input are not accepted. Set rm.na = TRUE to remove them.

For more details, see Pearson, Kendall, Spearman, Vexler, Kallenberg, MIC, Hoeffding, EL.
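
To illustrate what a Monte Carlo p-value means for a dependence test, the base R sketch below approximates the null distribution of the absolute Spearman correlation by permuting y. It is not the package's implementation: the helper name mc_pvalue, the permutation scheme, and the add-one correction are assumptions made for this example only.

```r
# Hypothetical helper (not part of testforDEP): permutation-based
# Monte Carlo p-value for the absolute Spearman correlation.
mc_pvalue <- function(x, y, num.MC = 1000) {
  stopifnot(length(x) == length(y), !anyNA(x), !anyNA(y))
  observed <- abs(cor(x, y, method = "spearman"))
  # Permuting y breaks any dependence on x, which simulates the null hypothesis.
  null_stats <- replicate(num.MC, abs(cor(x, sample(y), method = "spearman")))
  # Add-one correction keeps the estimate strictly above zero.
  (sum(null_stats >= observed) + 1) / (num.MC + 1)
}

set.seed(123)
x <- runif(50)
y <- x + rnorm(50, sd = 0.1)  # strongly dependent on x
z <- runif(50)                # generated independently of x

p_dep <- mc_pvalue(x, y)      # expected to be small
p_ind <- mc_pvalue(x, z)      # no dependence built in
```

A larger num.MC gives a finer-grained p-value at the cost of run time, which is the trade-off described above for num.MC.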

Value

An S4 object of class "testforDEP_result" with slots for the test statistic (TS), the p-value (p_value), and, if applicable, the confidence interval (CI).
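
Since the result is an S4 object, its components can be read with the `@` operator. The sketch below assumes the testforDEP package is installed and is skipped otherwise; the slot names are taken from the printed output in the Examples section.

```r
# Assumes the testforDEP package is installed; the block is skipped otherwise.
if (requireNamespace("testforDEP", quietly = TRUE)) {
  set.seed(123)
  x <- runif(100)
  y <- runif(100)

  # Classic test with a distribution-based p-value (num.MC is not needed).
  res <- testforDEP::testforDEP(x, y, test = "SPEARMAN", p.opt = "dist")

  print(res@TS)       # test statistic
  print(res@p_value)  # p-value
  print(res@CI)       # confidence interval; left empty when BS.CI = 0
}
```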

Author(s)

Jeffrey C. Miecznikowski, En-shuo Hsu, Yanhua Chen, Albert Vexler

See Also

Technical report: http://sphhp.buffalo.edu/content/dam/sphhp/biostatistics/Documents/techreports/UB-Biostatistics-TR1701.pdf

Examples

# Two independent uniform samples of size 100
set.seed(123)
x <- runif(100, 0, 1)
y <- runif(100, 0, 1)

testforDEP(x, y, test = "SPEARMAN", p.opt = "MC",
           num.MC = 10000, BS.CI = 0, set.seed = TRUE)


#An object of class "testforDEP_result"
#Slot "TS":
#[1] 59.54311

#Slot "p_value":
#[1] 0.6735326

#Slot "CI":
#list()
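
The same kind of input can also be passed through the data argument as a two-column numeric data frame, as described in Details. The following is a minimal sketch, assuming the testforDEP package is installed; the data frame name dat and the choice of the Kendall test are illustrative only.

```r
# Assumes the testforDEP package is installed; the block is skipped otherwise.
if (requireNamespace("testforDEP", quietly = TRUE)) {
  set.seed(123)
  # Illustrative two-column numeric data frame; column order does not matter.
  dat <- data.frame(x = runif(100, 0, 1), y = runif(100, 0, 1))

  # x and y are omitted, so data is taken as the input.
  res_df <- testforDEP::testforDEP(data = dat, test = "KENDALL", p.opt = "dist")
  print(res_df@p_value)
}
```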


[Package testforDEP version 0.2.0 Index]