prop_compare {quest} | R Documentation |
Proportion Comparisons for a Single Variable across 3+ Independent Groups (Chi-square Test of Independence)
Description
prop_compare tests for proportion differences across 3+ independent groups with a chi-square test of independence. The function also calculates the descriptive statistics for each group, Cramer's V and its confidence interval as a standardized effect size, and can provide the X by 2 contingency tables. prop_compare is simply a wrapper for prop.test plus some extra calculations.
Usage
prop_compare(
x,
nom,
lvl = levels(as.factor(nom)),
yates = TRUE,
ci.level = 0.95,
rtn.table = TRUE,
check = TRUE
)
Arguments
x |
numeric vector that only has values of 0 or 1 (or missing values), otherwise known as a dummy variable. |
nom |
atomic vector that takes on three or more unordered values (or missing values), otherwise known as a nominal variable. |
lvl |
character vector with length 3+ specifying the unique values for
the groups. If |
yates |
logical vector of length 1 specifying whether Yates'
continuity correction should be applied for small samples. See
|
ci.level |
numeric vector of length 1 specifying the confidence level.
|
rtn.table |
logical vector of length 1 specifying whether the return object should include the X by 2 contingency table of counts with totals and the X by 2 overall percentages table. If TRUE, then the last two elements of the return object are "count" containing a matrix of counts and "percent" containing a matrix of overall percentages. |
check |
logical vector of length 1 specifying whether the input
arguments should be checked for errors. For example, if |
Details
The confidence interval for Cramer's V is calculated with Fisher's r to z transformation, as Cramer's V is a kind of multiple correlation coefficient. Cramer's V is transformed to Fisher's z units, a symmetric confidence interval for Fisher's z is calculated, and then the lower and upper bounds are back-transformed to Cramer's V units.
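The three steps above can be sketched in base R. This is only an illustration, not the package's internal code: the helper name cramer_ci is hypothetical, and the standard error of 1/sqrt(n - 3) on the Fisher's z scale is an assumption borrowed from the usual correlation case.

```r
# Hypothetical helper (not part of quest): Fisher's r to z confidence
# interval for Cramer's V from an X by 2 contingency table.
cramer_ci <- function(x, nom, ci.level = 0.95) {
  tab <- table(nom, x)                       # X by 2 table of counts
  n <- sum(tab)                              # total sample size
  X2 <- unname(suppressWarnings(chisq.test(tab, correct = FALSE))$statistic)
  v <- sqrt(X2 / (n * (min(dim(tab)) - 1)))  # Cramer's V
  z <- atanh(v)                              # step 1: V to Fisher's z units
  # step 2: symmetric interval on the z scale (ASSUMED se = 1 / sqrt(n - 3))
  half <- qnorm(1 - (1 - ci.level) / 2) / sqrt(n - 3)
  # step 3: back-transform the bounds to Cramer's V units
  c(cramer = v, lwr = tanh(z - half), upr = tanh(z + half))
}

out <- cramer_ci(x = mtcars$am, nom = mtcars$cyl)
out
```

Because the interval is symmetric only on the z scale, the back-transformed bounds are generally not equidistant from the Cramer's V estimate.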
Value
list of numeric vectors containing statistical information about the proportion comparisons: 1) nhst = chi-square test of independence stat info in a numeric vector, 2) desc = descriptive statistics stat info in a numeric vector, 3) std = standardized effect size and its confidence interval in a numeric vector, 4) count = numeric matrix with dim = [X+1, 3] of the X by 2 contingency table of counts with an additional row and column for totals (if rtn.table = TRUE), 5) percent = numeric matrix with dim = [X+1, 3] of the X by 2 contingency table of overall percentages with an additional row and column for totals (if rtn.table = TRUE).
1) nhst = chi-square test of independence stat info in a numeric vector
- est
average absolute proportion difference (i.e., |group j - group i|)
- se
NA (to remind the user there is no standard error for the test)
- X2
chi-square value
- df
degrees of freedom (of the nominal variable)
- p
two-sided p-value
2) desc = descriptive statistics stat info in a numeric vector (note there could be more than 3 groups - groups i, j, and k are just provided as an example):
- prop_'lvl[k]'
proportion of group k
- prop_'lvl[j]'
proportion of group j
- prop_'lvl[i]'
proportion of group i
- sd_'lvl[k]'
standard deviation of group k
- sd_'lvl[j]'
standard deviation of group j
- sd_'lvl[i]'
standard deviation of group i
- n_'lvl[k]'
sample size of group k
- n_'lvl[j]'
sample size of group j
- n_'lvl[i]'
sample size of group i
3) std = standardized effect size and its confidence interval in a numeric vector
- cramer
Cramer's V estimate
- lwr
lower bound of Cramer's V confidence interval
- upr
upper bound of Cramer's V confidence interval
4) count = numeric matrix with dim = [X+1, 3] of the X by 2 contingency table of counts with an additional row and column for totals (if rtn.table = TRUE).
The 3+ unique observed values of nom - plus the total - are the rows and the two unique observed values of x (i.e., 0 and 1) - plus the total - are the columns. The dimlabels are "nom" for the rows and "x" for the columns. The rownames are 1. 'lvl[i]', 2. 'lvl[j]', 3. 'lvl[k]', 4. "total". The colnames are 1. "0", 2. "1", 3. "total".
5) percent = numeric matrix with dim = [X+1, 3] of the X by 2 contingency table of overall percentages with an additional row and column for totals (if rtn.table = TRUE).
The 3+ unique observed values of nom - plus the total - are the rows and the two unique observed values of x (i.e., 0 and 1) - plus the total - are the columns. The dimlabels are "nom" for the rows and "x" for the columns. The rownames are 1. 'lvl[i]', 2. 'lvl[j]', 3. 'lvl[k]', 4. "total". The colnames are 1. "0", 2. "1", 3. "total".
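The layout of the count and percent tables can be approximated with base R, as in the minimal sketch below. It is not the package's own code: addmargins labels the totals "Sum" rather than "total", and the 0 to 100 scale for the percentages is an assumption.

```r
# Base R approximation of the X by 2 tables with totals.
x <- mtcars$am                     # dummy variable (0/1)
nom <- mtcars$cyl                  # nominal variable with 3 groups
tab <- table(nom = nom, x = x)     # dimlabels "nom" (rows) and "x" (columns)

count <- addmargins(tab)                       # adds a total row and column
percent <- addmargins(prop.table(tab)) * 100   # overall percentages (ASSUMED 0-100 scale)

dim(count)   # X + 1 rows and 3 columns
count
percent
```

The percentages are "overall" in the sense that every cell is divided by the grand total, so the bottom-right cell of percent is the full 100%.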
See Also
prop.test the workhorse for prop_compare,
props_compare for multiple dummy variables,
prop_diff for only 2 independent groups (aka binary variable).
Examples
tmp <- replicate(n = 10, expr = mtcars, simplify = FALSE)
mtcars2 <- str2str::ld2d(tmp)
mtcars2$"cyl_fct" <- car::recode(mtcars2$"cyl",
recodes = "4='four'; 6='six'; 8='eight'", as.factor = TRUE)
prop_compare(x = mtcars2$"am", nom = mtcars2$"cyl_fct")
prop_compare(x = mtcars2$"am", nom = mtcars2$"cyl_fct",
lvl = c("four","six","eight")) # specify order of levels in return object
# more than 3 groups
prop_compare(x = ifelse(airquality$"Wind" >= 10, yes = 1, no = 0), nom = airquality$"Month")
prop_compare(x = ifelse(airquality$"Wind" >= 10, yes = 1, no = 0), nom = airquality$"Month",
rtn.table = FALSE) # no contingency tables