rowChisqStats {scrime}    R Documentation

Rowwise Pearson's ChiSquare Statistic

Description

Computes for each row of a matrix the value of Pearson's ChiSquare statistic for testing if the corresponding categorical variable is associated with a (categorical) response, or determines for each pair of rows of a matrix the value of Pearson's ChiSquare statistic for testing if the two corresponding variables are independent.

Usage

rowChisqStats(data, cl, compPval = TRUE, asMatrix = TRUE)

Arguments

data

a numeric matrix consisting of the integers between 1 and n_{cat}, where n_{cat} is the maximum number of levels the categorical variables can take. Each row of data must correspond to a variable, each column to an observation. Missing values are allowed, and the variables may have different numbers of levels.
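As a rough illustration of this layout, the following sketch recodes a small, made-up set of genotype labels into integers and transposes the result so that variables are in rows. The data frame geno, its values, and the level order are hypothetical and not part of scrime; only base R is used.

```r
# Hypothetical raw data: observations in rows, variables in columns.
geno <- data.frame(SNP1 = c("AA", "AB", "BB", "AB"),
                   SNP2 = c("AB", "AB", NA, "AA"))

# Recode every variable with the same level order, so that a given
# label always maps to the same integer; NA stays NA.
lev <- c("AA", "AB", "BB")
codes <- sapply(geno, function(x) as.integer(factor(x, levels = lev)))

# Transpose: rows become variables, columns become observations.
mat <- t(codes)
mat
```

Fixing the level order up front keeps the coding comparable across variables even if some variable does not show every level.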

cl

a numeric vector of length ncol(data) containing the class labels for the observations represented by the columns of data. The class labels must be coded by the integers between 1 and n_{cl}, where n_{cl} is the number of classes. If missing, the value of the statistic for Pearson's \chi^2-test of independence will be computed for each pair of rows of data. Otherwise, the value of Pearson's \chi^2-statistic for testing if the distribution of the variable differs between the groups specified by cl will be determined for each row of data.
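To make the per-row statistic concrete, here is a minimal base R sketch that computes Pearson's statistic by hand for a single simulated variable and a two-class label vector. It illustrates the textbook formula only and is not taken from the package's implementation; the names obs, expd and stat are chosen for this example.

```r
set.seed(1)
x  <- sample(3, 200, TRUE)     # one categorical variable (one row of data)
cl <- rep(1:2, each = 100)     # class labels, one per observation

obs  <- table(x, cl)           # observed contingency table (levels x classes)
expd <- outer(rowSums(obs), colSums(obs)) / sum(obs)  # expected counts
stat <- sum((obs - expd)^2 / expd)                    # Pearson's statistic

# chisq.test from base R returns the same value.
ref <- chisq.test(x, cl, correct = FALSE)$statistic
all.equal(stat, unname(ref))
```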

compPval

should the p-value (based on the approximation to a \chi^2-distribution) also be computed?

asMatrix

should the pairwise test scores be returned as a matrix? Ignored if cl is specified. If TRUE, a matrix with m rows and m columns is returned that contains the values of Pearson's \chi^2-statistic in its lower triangle, where m is the number of variables. If FALSE, a vector of length m * (m - 1) / 2 is returned, where the value for testing the ith and jth variable (with i < j) is given by the j + m * (i - 1) - i * (i + 1) / 2 element of this vector.
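The mapping between the two return formats can be checked with a short base R sketch. It assumes that the vector lists the lower triangle column by column, as x[lower.tri(x)] does in base R; under that assumption the pair (i, j) with i < j sits at position j - i + m * (i - 1) - i * (i - 1) / 2. The helper pairIndex is hypothetical and not a scrime function.

```r
m <- 5

# Hypothetical helper: position of the pair (i, j), i < j, in a vector
# that stores the lower triangle of an m x m matrix column by column.
pairIndex <- function(i, j, m) j - i + m * (i - 1) - i * (i - 1) / 2

# Label every cell with "column-row" so that positions are easy to read.
lab <- outer(1:m, 1:m, function(r, c) paste(c, r, sep = "-"))
vec <- lab[lower.tri(lab)]   # length m * (m - 1) / 2

vec[pairIndex(2, 4, m)]      # entry for variables 2 and 4
```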

Value

If compPval = FALSE, a vector (or matrix if cl is not specified and asMatrix = TRUE) composed of the values of Pearson's \chi^2-statistic. Otherwise, a list consisting of

stats

a vector (or matrix) containing the values of Pearson's \chi^2-statistic.

df

a vector (or matrix) comprising the degrees of freedom of the asymptotic \chi^2-distribution.

rawp

a vector (or matrix) containing the (unadjusted) p-values.
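How the three components relate can be sketched in base R for a single table. The degrees of freedom below follow the usual (levels - 1) * (classes - 1) rule for a contingency table, which is assumed here rather than taken from the package source; the data are simulated.

```r
set.seed(2)
x  <- sample(3, 200, TRUE)     # one variable with three levels
cl <- rep(1:2, each = 100)     # two classes

obs  <- table(x, cl)
test <- chisq.test(obs, correct = FALSE)

stat <- unname(test$statistic)                # Pearson's statistic
df   <- (nrow(obs) - 1) * (ncol(obs) - 1)     # (levels - 1) * (classes - 1)
rawp <- pchisq(stat, df, lower.tail = FALSE)  # unadjusted p-value

c(stat = stat, df = df, rawp = rawp)
```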

Note

Contrary to chisq.test, currently no continuity correction is done for 2 x 2 tables.
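The effect of the continuity correction can be seen in base R on a small, made-up 2 x 2 table. When comparing results with chisq.test on 2 x 2 tables, setting correct = FALSE should give the uncorrected statistic described here.

```r
# Hypothetical 2 x 2 table of counts.
tab <- matrix(c(30, 20, 20, 30), nrow = 2)

# Without Yates' continuity correction.
uncorrected <- chisq.test(tab, correct = FALSE)$statistic

# With the correction, which is the default for 2 x 2 tables.
corrected <- chisq.test(tab)$statistic

c(uncorrected = unname(uncorrected), corrected = unname(corrected))
```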

Author(s)

Holger Schwender, holger.schwender@udo.edu

References

Schwender, H. (2007). A Note on the Simultaneous Computation of Thousands of Pearson's \chi^2-Statistics. Technical Report, SFB 475, Department of Statistics, University of Dortmund.

See Also

computeContCells, computeContClass

Examples

## Not run: 
# Generate an example data set consisting of 5 rows (variables)
# and 200 columns (observations) by randomly drawing integers 
# between 1 and 3.

mat <- matrix(sample(3, 1000, TRUE), 5)
rownames(mat) <- paste("SNP", 1:5, sep = "")

# For each pair of rows of mat, test if they are independent.

r1 <- rowChisqStats(mat)

# The values of Pearson's ChiSquare statistic as matrix.

r1$stats

# And the corresponding (unadjusted) p-values.

r1$rawp

# Obtain only the values of the test statistic as vector

rowChisqStats(mat, compPval = FALSE, asMatrix = FALSE)


# Generate an example data set consisting of 10 rows (variables)
# and 200 columns (observations) by randomly drawing integers 
# between 1 and 3, and a vector of class labels of length 200
# indicating that the first 100 observations belong to class 1
# and the other 100 to class 2. 

mat2 <- matrix(sample(3, 2000, TRUE), 10)
cl <- rep(1:2, each = 100)

# For each row of mat2, test if it is associated with cl.

r2 <- rowChisqStats(mat2, cl)
r2$stats

# The results are identical to those of chisq.test.
pv <- stat <- numeric(10)
for(i in 1:10){
    tmp <- chisq.test(mat2[i,], cl)
    pv[i] <- tmp$p.value
    stat[i] <- tmp$statistic
}

all.equal(r2$stats, stat)
all.equal(r2$rawp, pv)


## End(Not run)

[Package scrime version 1.3.5 Index]