props_diff {quest} | R Documentation |
Proportion Difference of Multiple Variables Across Two Independent Groups (Chi-square Tests of Independence)
Description
props_diff tests the proportion difference of multiple variables across two independent groups with chi-square tests of independence. The function also calculates descriptive statistics for each group and various standardized effect sizes (e.g., Cramer's V), and can provide the 2x2 contingency tables. props_diff is simply a wrapper for prop.test plus some extra calculations.
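To illustrate the wrapper relationship, the following base R sketch runs prop.test directly on one dummy variable (am) across two groups (vs) from mtcars. It is not the package's internal code. It only shows that, for two groups, prop.test reports the same statistic as a chi-square test of independence on the 2x2 table.

```r
# Successes (am == 1) and sample sizes within each vs group (0, then 1)
x <- as.vector(tapply(mtcars$am, mtcars$vs, sum))
n <- as.vector(table(mtcars$vs))

# Two-sample test of equal proportions
pt <- prop.test(x = x, n = n)

# Chi-square test of independence on the same 2x2 table
ct <- chisq.test(table(mtcars$vs, mtcars$am))

# Both apply a continuity correction by default and agree on the statistic
all.equal(unname(pt$statistic), unname(ct$statistic))
c(X2 = unname(pt$statistic), df = unname(pt$parameter), p = pt$p.value)
```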
Usage
props_diff(
data,
vrb.nm,
bin.nm,
lvl = levels(as.factor(data[[bin.nm]])),
yates = TRUE,
zero.cell = 0.05,
smooth = TRUE,
ci.level = 0.95,
rtn.table = TRUE,
check = TRUE
)
Arguments
data |
data.frame of data. |
vrb.nm |
character vector specifying the colnames in |
bin.nm |
character vector of length 1 specifying the colname in |
lvl |
character vector with length 2 specifying the unique values for
the two groups. If |
yates |
logical vector of length 1 specifying whether Yates'
continuity correction should be applied for small samples. See
|
zero.cell |
numeric vector of length 1 specifying what value to impute
for zero cell counts in the 2x2 contingency table when computing the
tetrachoric correlations. See |
smooth |
logical vector of length 1 specifying whether a smoothing
algorithm should be applied when estimating the tetrachoric correlations.
See |
ci.level |
numeric vector of length 1 specifying the confidence level.
|
rtn.table |
logical vector of length 1 specifying whether the return object should include the 2x2 contingency table of counts with totals and the 2x2 overall percentages table. If TRUE, then the last two elements of the return object are "count" containing a 3D array of counts and "percent" containing a 3D array of overall percentages. |
check |
logical vector of length 1 specifying whether the input
arguments should be checked for errors. For example, if
|
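As a rough illustration of two of these arguments, the base R sketch below evaluates the default expression for lvl on the example data and compares the chi-square statistic with and without the continuity correction. It assumes that yates corresponds to the correct argument of prop.test. That mapping is an assumption based on props_diff being described as a wrapper, not something confirmed here.

```r
mtcars2 <- mtcars
mtcars2$vs_bin <- ifelse(mtcars2$vs == 1, yes = "yes", no = "no")

# Default for lvl, evaluated with bin.nm = "vs_bin"
lvl <- levels(as.factor(mtcars2[["vs_bin"]]))
lvl

# Successes and sample sizes per group, ordered as in lvl
x <- as.vector(tapply(mtcars2$am, mtcars2$vs_bin, sum)[lvl])
n <- as.vector(table(mtcars2$vs_bin)[lvl])

# Assumption: yates = TRUE / FALSE behaves like correct = TRUE / FALSE
X2_yates <- unname(prop.test(x, n, correct = TRUE)$statistic)
X2_plain <- unname(prop.test(x, n, correct = FALSE)$statistic)
c(yates = X2_yates, plain = X2_plain)
```

The corrected statistic is the smaller of the two, which makes the test more conservative in small samples.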
Value
list of data.frames containing statistical information about the proportion differences (the rownames of each data.frame are vrb.nm): 1) chisqtest = chi-square tests of independence stat info in a data.frame, 2) describes = descriptive statistics stat info in a data.frame, 3) effects = various standardized effect sizes in a data.frame, 4) count = numeric 3D array with dim = [3, 3, length(vrb.nm)] of the 2x2 contingency tables of counts with additional rows and columns for totals (if rtn.table = TRUE), 5) percent = numeric 3D array with dim = [3, 3, length(vrb.nm)] of the 2x2 contingency tables of overall percentages with additional rows and columns for totals (if rtn.table = TRUE).
1) chisqtest = chi-square tests of independence stat info in a data.frame
- est
difference in proportions estimate (i.e., group 2 - group 1)
- se
NA (to remind the user there is no standard error for the test)
- X2
chi-square value
- df
degrees of freedom (will always be 1)
- p
two-sided p-value
- lwr
lower bound of the confidence interval
- upr
upper bound of the confidence interval
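The columns above can be approximated from a single prop.test call. The sketch below is an assumption about how such a row could be assembled, not the package's code. Because prop.test reports the difference as first group minus second group, group 2 is passed first so that the estimate and interval refer to group 2 - group 1.

```r
lvl <- c("0", "1")  # group 1 = "0", group 2 = "1" (values of mtcars$vs)

# Successes (am == 1) and sample sizes, with group 2 listed first
x <- as.vector(tapply(mtcars$am, mtcars$vs, sum)[rev(lvl)])
n <- as.vector(table(mtcars$vs)[rev(lvl)])

pt <- prop.test(x = x, n = n, correct = TRUE, conf.level = 0.95)

# Hypothetical reconstruction of one row of the chisqtest data.frame
chisqtest_row <- data.frame(
  est = unname(pt$estimate[1] - pt$estimate[2]),  # group 2 - group 1
  se  = NA_real_,                                 # no standard error reported
  X2  = unname(pt$statistic),
  df  = unname(pt$parameter),
  p   = pt$p.value,
  lwr = pt$conf.int[1],
  upr = pt$conf.int[2],
  row.names = "am"
)
chisqtest_row
```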
2) describes = descriptive statistics stat info in a data.frame
- prop_'lvl[2]'
proportion of group 2
- prop_'lvl[1]'
proportion of group 1
- sd_'lvl[2]'
standard deviation of group 2
- sd_'lvl[1]'
standard deviation of group 1
- n_'lvl[2]'
sample size of group 2
- n_'lvl[1]'
sample size of group 1
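For a dummy variable, these descriptives can be checked by hand with base R. The sketch below assumes the standard deviation is the usual sample standard deviation of the 0/1 values (as returned by sd). The package may compute it differently.

```r
# Grouping variable with explicit level order: group 1 = "no", group 2 = "yes"
grp <- factor(ifelse(mtcars$vs == 1, "yes", "no"), levels = c("no", "yes"))

prop <- tapply(mtcars$am, grp, mean)    # proportion of 1s in each group
sdev <- tapply(mtcars$am, grp, sd)      # assumed: sample sd of the 0/1 values
n    <- tapply(mtcars$am, grp, length)  # sample size of each group

rbind(prop = prop, sd = sdev, n = n)
```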
3) effects = various standardized effect sizes in a data.frame
- cramer
Cramer's V estimate
- h
Cohen's h estimate
- phi
Phi coefficient estimate
- yule
Yule coefficient estimate
- tetra
Tetrachoric correlation estimate
- OR
odds ratio estimate
- RR
risk ratio estimate calculated as group 2 / group 1. Note that this value will often differ when variables are recoded (as it should).
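Most of these effect sizes have simple closed forms for a 2x2 table. The sketch below uses common textbook formulas on one variable from mtcars. These are assumptions about the definitions and may not match the package's computations exactly. The tetrachoric correlation is omitted because it requires numerical estimation.

```r
# 2x2 table: rows are groups (group 1 = "no", group 2 = "yes"), columns are am
tab <- table(group = ifelse(mtcars$vs == 1, "yes", "no"), am = mtcars$am)
n00 <- as.numeric(tab["no", "0"]);  n01 <- as.numeric(tab["no", "1"])
n10 <- as.numeric(tab["yes", "0"]); n11 <- as.numeric(tab["yes", "1"])

p1 <- n01 / (n00 + n01)  # proportion of 1s in group 1
p2 <- n11 / (n10 + n11)  # proportion of 1s in group 2

phi    <- (n00 * n11 - n01 * n10) / sqrt(prod(rowSums(tab), colSums(tab)))
cramer <- sqrt(unname(chisq.test(tab, correct = FALSE)$statistic) / sum(tab))
h      <- 2 * asin(sqrt(p2)) - 2 * asin(sqrt(p1))  # Cohen's h
OR     <- (n11 / n10) / (n01 / n00)                # odds ratio, group 2 vs group 1
yule   <- (OR - 1) / (OR + 1)                      # Yule's Q
RR     <- p2 / p1                                  # risk ratio, group 2 / group 1

# Recoding the variable (swapping 0 and 1) inverts OR but not RR
OR_recoded <- ((1 - p2) / p2) / ((1 - p1) / p1)
RR_recoded <- (1 - p2) / (1 - p1)

c(cramer = cramer, h = h, phi = phi, yule = yule, OR = OR, RR = RR)
```

In this example the odds ratio after recoding is exactly 1 / OR, while the recoded risk ratio is not 1 / RR, which is why the risk ratio depends on how the variable is coded.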
4) count = numeric 3D array with dim = [3, 3, length(vrb.nm)] of the 2x2 contingency tables of counts with additional rows and columns for totals (if rtn.table = TRUE).
The two unique observed values of data[vrb.nm] (i.e., 0 and 1), plus the total, are the rows, and the two unique observed values of data[[bin.nm]], plus the total, are the columns. The variables themselves are the layers (i.e., 3rd dimension of the array). The dimlabels are "bin" for the rows, "x" for the columns, and "vrb" for the layers. The rownames are 1. "0", 2. "1", 3. "total". The colnames are 1. 'lvl[1]', 2. 'lvl[2]', 3. "total". The laynames are vrb.nm.
5) percent = numeric 3D array with dim = [3, 3, length(vrb.nm)] of the 2x2 contingency tables of overall percentages with additional rows and columns for totals (if rtn.table = TRUE).
The two unique observed values of data[vrb.nm] (i.e., 0 and 1), plus the total, are the rows, and the two unique observed values of data[[bin.nm]], plus the total, are the columns. The variables themselves are the layers (i.e., 3rd dimension of the array). The dimlabels are "bin" for the rows, "x" for the columns, and "vrb" for the layers. The rownames are 1. "0", 2. "1", 3. "total". The colnames are 1. 'lvl[1]', 2. 'lvl[2]', 3. "total". The laynames are vrb.nm.
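The layout of the count and percent arrays can be mimicked with base R. The sketch below is a reconstruction under assumptions: it omits the dimlabels, and it scales the overall percentages to 0-100, which may differ from the scale the package uses.

```r
mtcars2 <- mtcars
mtcars2$vs_bin   <- ifelse(mtcars2$vs == 1, yes = "yes", no = "no")
mtcars2$gear_dum <- ifelse(mtcars2$gear > 3, yes = 1L, no = 0L)
mtcars2$carb_dum <- ifelse(mtcars2$carb > 3, yes = 1L, no = 0L)
vrb_nm <- c("am", "gear_dum", "carb_dum")
lvl <- levels(as.factor(mtcars2$vs_bin))

# One 3x3 layer per variable: rows "0", "1", "total"; columns lvl[1], lvl[2], "total"
count <- array(NA_real_, dim = c(3, 3, length(vrb_nm)),
               dimnames = list(c("0", "1", "total"), c(lvl, "total"), vrb_nm))
for (nm in vrb_nm) {
  tab <- table(factor(mtcars2[[nm]], levels = 0:1),
               factor(mtcars2$vs_bin, levels = lvl))
  count[, , nm] <- addmargins(tab)  # appends the row and column totals
}

# Overall percentages: every cell divided by the total sample size
# (assumed 0-100 scale; there are no missing values in these columns)
percent <- 100 * count / nrow(mtcars2)

count[, , "am"]
percent[, , "am"]
```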
See Also
prop.test the workhorse for props_diff,
prop_diff for a single dummy variable,
phi for another phi coefficient function,
Yule for another Yule coefficient function,
tetrachoric for another tetrachoric coefficient function
Examples
# rtn.table = TRUE (default)
# multiple variables
mtcars2 <- mtcars
mtcars2$"vs_bin" <- ifelse(mtcars$"vs" == 1, yes = "yes", no = "no")
mtcars2$"gear_dum" <- ifelse(mtcars2$"gear" > 3, yes = 1L, no = 0L)
mtcars2$"carb_dum" <- ifelse(mtcars2$"carb" > 3, yes = 1L, no = 0L)
vrb_nm <- c("am","gear_dum","carb_dum") # dummy variables
lapply(X = vrb_nm, FUN = function(nm) {
   tmp <- c("vs_bin", nm)
   table(mtcars2[tmp])
})
props_diff(data = mtcars2, vrb.nm = c("am","gear_dum","carb_dum"), bin.nm = "vs_bin")
# single variable
props_diff(mtcars2, vrb.nm = "am", bin.nm = "vs_bin")
# rtn.table = FALSE (no "count" or "percent" list elements)
# multiple variables
props_diff(data = mtcars2, vrb.nm = c("am","gear_dum","carb_dum"), bin.nm = "vs",
   rtn.table = FALSE)
# single variable
props_diff(mtcars, vrb.nm = "am", bin.nm = "vs",
   rtn.table = FALSE)