cat_all_fn {reappraised} | R Documentation |
Compares observed and expected distribution of all categorical (binomial) variables
Description
Creates plots of observed-to-expected numbers and ratios for binomial variables and/or compares reported and calculated p-values for those variables.
Reference: Bolland MJ, Gamble GD, Avenell A, Cooper DJ, Grey A. Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns. J Clin Epidemiol. 2023;154:117-124
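As a rough illustration of what an expected distribution for a binomial variable can look like, the sketch below uses base R only. It is not the package's implementation: the trial size, the pooled proportion, and the assumption that both arms are independent draws with the same proportion are all invented for the example.

```r
# Hypothetical two-arm trial: 50 participants per arm, and 30 of the 100
# participants overall have the characteristic (numbers invented).
n_arm    <- 50
pooled_p <- 30 / 100

# Probability of each possible count (0..n_arm) in one arm under a binomial model
counts <- 0:n_arm
probs  <- dbinom(counts, size = n_arm, prob = pooled_p)

# Expected count in one arm (equals n_arm * pooled_p)
expected_count <- sum(counts * probs)

# Distribution of the between-arm difference in counts,
# assuming the two arms are independent (an assumption of this sketch)
joint_probs <- outer(probs, probs)          # P(arm 1 = i, arm 2 = j)
diff_vals   <- outer(counts, counts, "-")   # arm 1 count minus arm 2 count
diff_dist   <- tapply(as.vector(joint_probs), as.vector(diff_vals), sum)

# Probability that the two arms report exactly the same count
p_equal <- diff_dist[["0"]]
```

Observed counts and between-arm differences from a set of trials could then be tabulated against distributions like `probs` and `diff_dist`.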
Usage
cat_all_fn(
df = cat_all_data,
comp.pvals = "no",
fisher.sim = "y",
fish.n.sims = 10000,
binom = "no",
two_levels = "no",
del.disparate = "yes",
excl.level = "yes",
seed = 0,
title = "",
verbose = TRUE
)
Arguments
df |
data frame generated from load_clean function |
comp.pvals |
"yes" or "no" indicator whether reported and calculated p-values should be compared |
fisher.sim |
"yes" or "no" indicator whether to allow Fisher's exact test to simulate p-values for tables larger than 2 × 2 |
fish.n.sims |
number of simulations to use in Fisher test, default 10,000 |
binom |
"yes" or "no" indicator whether observed to expected distributions of binomial variables should be calculated |
two_levels |
"yes" or "no" indicator whether variables with more than 2 levels should be collapsed to 2 levels |
del.disparate |
"yes" or "no" indicator; if "yes", data in which the absolute difference between group sizes is >20% are deleted |
excl.level |
"yes" or "no" indicator whether one level of a variable should be deleted. The deleted level is chosen at random using the seed parameter. |
seed |
seed for the random number generator; the default, 0, uses the current date and time. Specify a seed to make results repeatable. |
title |
title name for plots (optional) |
verbose |
TRUE or FALSE indicator of whether the progress bar and comments are shown and whether the flextable, plot, or both are printed |
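The fisher.sim and fish.n.sims arguments presumably correspond to the simulate.p.value and B arguments of base R's fisher.test; that mapping is an assumption, not something this page states. The sketch below shows the base R behaviour on an invented 3 × 2 table, with a fixed seed so that the simulated p-value is repeatable.

```r
# Hypothetical 3 x 2 table: one three-level variable in a two-arm trial
# (counts invented for illustration)
tab <- matrix(c(12, 15,
                20, 18,
                 8,  7),
              nrow = 3, byrow = TRUE,
              dimnames = list(level = c("A", "B", "C"),
                              arm   = c("arm1", "arm2")))

# Monte Carlo p-value based on 10,000 simulated tables
set.seed(10)  # fixed seed makes the simulated p-value repeatable
sim <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)

# Exact p-value for comparison (feasible here because the table is small)
exact <- fisher.test(tab)

c(simulated = sim$p.value, exact = exact$p.value)
```

Simulation trades a small amount of Monte Carlo error for speed, which matters mainly when tables are large or many tests are run.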
Details
Returns a list containing the objects described below and, if verbose = TRUE, prints the flextable cat_all_diff_calc_rep_ft and/or the graph cat_all_graph, depending on the options chosen.
Value
list containing objects as described
if p-value comparison used:
cat_all_pvals = data frame of data for comparison of reported and calculated p-values
cat_all_diff_calc_rep_ft = flextable of comparison of reported and calculated p-values
cat_all_diff_calc_rep_data = data frame used to make flextable
cat_all_diff_thresh_ft = flextable of comparison of reported and calculated p-values when only threshold given
cat_all_diff_thresh_data = data frame used to make flextable for p-value thresholds
if comparison of observed to expected distributions used:
cat_all_graph = plot of observed to expected numbers and differences between groups, top panels are the absolute numbers, bottom panels are the differences between trial arms in two arm studies
cat_all_graph_pc = plot of observed to expected numbers expressed as percentages and differences between groups, top panels are the percentages, bottom panels are the differences between trial arms in two arm studies
cat_all_data_abs = data frame of data for absolute numbers
cat_all_data_df = data frame of data for difference between groups in two arm studies
cat_all_dataset_abs = data frame of dataset used for all trials
cat_all_dataset_df = data frame of dataset used for two arm trials
cat_all_all_graphs = list containing:
abs = plot for absolute numbers only
df = plot for difference between groups in two arm studies only
pc = plot for percentages only
all_pc = composite plot of percentages and absolute numbers
individual_graphs = list of 6 individual plots making up the composite figures
Examples
# load example data
cat_all_data <- load_clean(import = "no", file.cat = "SI_cat_all", cat_all = "yes",
                           format.cat = "wide")$cat_all_data
# run function comparing p-values only (takes only a few seconds)
cat_all_fn(comp.pvals = "yes")$cat_all_diff_calc_rep_ft
# run function comparing distribution of binomial variables only
# to speed example up limit to 12 2-arm trials with 20 variables
# (takes close to 5 secs)
cat_all_data <- cat_all_data[1:41, c(1:8, 10:11, 13:15)]
cat_all_fn(binom = "yes", two_levels = "yes", del.disparate = "yes",
           excl.level = "yes", seed = 10)$cat_all_graph
# to import an excel spreadsheet (modify using local path,
# file and sheet name, range, and format):
# get path for example files
path <- system.file("extdata", "reappraised_examples.xlsx", package = "reappraised",
mustWork = TRUE)
# delete file name from path
path <- sub("/[^/]+$", "", path)
# load data
cat_all_data <- load_clean(import = "yes", cat_all = "yes", dir = path,
                           file.name.cat = "reappraised_examples.xlsx",
                           sheet.name.cat = "SI_cat_all",
                           range.name.cat = "A:N", format.cat = "wide")$cat_all_data
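The step above that strips the file name from the path could also be written with base R's dirname(). The sketch below uses an invented path to check that both approaches give the same directory; the path components are hypothetical.

```r
# Hypothetical path built with forward slashes, as file.path() produces
example_path <- file.path("some", "dir", "reappraised_examples.xlsx")

# Regular-expression approach used in the example above
dir_regex <- sub("/[^/]+$", "", example_path)

# Base R alternative
dir_base <- dirname(example_path)

identical(dir_regex, dir_base)
```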