cat_all_fn {reappraised}    R Documentation

Compares observed and expected distributions of all categorical (binomial) variables

Description

Creates plots of observed-to-expected numbers and ratios for the binomial variables, and/or compares reported and calculated p-values for the variables.

Reference: Bolland MJ, Gamble GD, Avenell A, Cooper DJ, Grey A. Distributions of baseline categorical variables were different from the expected distributions in randomized trials with integrity concerns. J Clin Epidemiol. 2023;154:117-124.
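
To make the idea of an "expected distribution" concrete, the sketch below computes one possible expected distribution for the between-group difference in counts of a binomial variable in a single two-arm trial. It assumes the count in each arm is binomial with the pooled proportion; this is an illustrative assumption and not necessarily the calculation cat_all_fn performs. The trial numbers and all object names are hypothetical.

```r
# Hypothetical two-arm trial: arm sizes and counts with the characteristic
n1 <- 50; n2 <- 48
x1 <- 21; x2 <- 17

# Pooled proportion with the characteristic (assumed common to both arms)
p_pooled <- (x1 + x2) / (n1 + n2)

# Joint probability of every (count in arm 1, count in arm 2) pair,
# assuming independent binomial counts with the pooled proportion
joint <- outer(dbinom(0:n1, n1, p_pooled), dbinom(0:n2, n2, p_pooled))

# Between-group difference for each pair, then sum probabilities by difference
diffs <- outer(0:n1, 0:n2, "-")
expected <- tapply(as.vector(joint), as.vector(diffs), sum)

# Expected probability of the difference actually observed in this trial
observed_diff <- x1 - x2
expected[as.character(observed_diff)]
```

Summing such expected probabilities over many variables and trials gives expected counts that can be set against the observed counts, which is the kind of comparison the plots summarise.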

Usage

cat_all_fn(
  df = cat_all_data,
  comp.pvals = "no",
  fisher.sim = "y",
  fish.n.sims = 10000,
  binom = "no",
  two_levels = "no",
  del.disparate = "yes",
  excl.level = "yes",
  seed = 0,
  title = "",
  verbose = TRUE
)

Arguments

df

data frame generated by the load_clean function

comp.pvals

"yes" or "no" indicator whether reported and calculated p-values should be compared

fisher.sim

"yes" or "no" indicator whether to allow the Fisher test to simulate p-values for tables larger than 2x2

fish.n.sims

number of simulations to use in the Fisher test, default 10,000

binom

"yes" or "no" indicator whether observed to expected distributions of binomial variables should be calculated

two_levels

"yes" or "no" indicator whether variables with more than 2 levels should be collapsed to 2 levels

del.disparate

"yes" or "no" indicator; if "yes", data in which the absolute difference between group sizes is >20% are deleted

excl.level

"yes" or "no" indicator whether one level of a variable should be deleted. The deleted level is chosen randomly using the seed parameter.

seed

seed for the random number generator; the default, 0, uses the current date and time. Specify a seed to make results repeatable.

title

title name for plots (optional)

verbose

TRUE or FALSE; indicates whether the progress bar and comments are shown and whether the flextable, the plot, or both are printed
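
The fisher.sim, fish.n.sims and seed arguments can be understood through base R's fisher.test, which offers Monte Carlo p-values via simulate.p.value and B. The sketch below is a minimal illustration under the assumption that the package relies on a call of this kind; the table values and object names are invented for the example.

```r
# Hypothetical 3 x 2 table: rows are levels of a baseline variable,
# columns are trial arms (values invented for illustration)
tab <- matrix(c(12, 15,
                20, 18,
                 8,  7),
              ncol = 2, byrow = TRUE,
              dimnames = list(level = c("a", "b", "c"),
                              arm = c("arm1", "arm2")))

# Fixing the seed makes the simulated p-value repeatable
set.seed(10)
p_sim <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)$p.value

set.seed(10)
p_sim_again <- fisher.test(tab, simulate.p.value = TRUE, B = 10000)$p.value

# Exact p-value for comparison (feasible here because the table is small)
p_exact <- fisher.test(tab)$p.value

c(simulated = p_sim, repeated = p_sim_again, exact = p_exact)
```

With the same seed the two simulated values are identical, and with 10,000 simulations the simulated p-value should lie close to the exact one. Simulation mainly matters for larger tables where the exact calculation becomes slow or fails.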

Details

Returns a list containing the objects described below and, if verbose = TRUE, prints the flextable cat_all_diff_calc_rep_ft and/or the graph cat_all_graph, depending on the options chosen.

Value

list containing objects as described

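if the p-value comparison is used, the list includes the flextable cat_all_diff_calc_rep_ft

if the comparison of distributions of categorical variables is used, the list includes the graph cat_all_graph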

Examples

# load example data
cat_all_data <- load_clean(import = "no", file.cat = "SI_cat_all", cat_all = "yes",
                           format.cat = "wide")$cat_all_data


# run function comparing p-values only (takes only a few seconds)
cat_all_fn(comp.pvals = "yes")$cat_all_diff_calc_rep_ft

# run function comparing distribution of binomial variables only

# to speed example up limit to 12 2-arm trials with 20 variables
# (takes close to 5 secs)

cat_all_data <- cat_all_data[1:41, c(1:8, 10:11, 13:15)]

cat_all_fn(binom = "yes", two_levels = "yes", del.disparate = "yes",
           excl.level = "yes", seed = 10)$cat_all_graph


# to import an excel spreadsheet (modify using local path,
# file and sheet name, range, and format):

# get path for example files
path <- system.file("extdata", "reappraised_examples.xlsx", package = "reappraised",
                   mustWork = TRUE)
# delete file name from path
path <- sub("/[^/]+$", "", path)

# load data
cat_all_data <- load_clean(import = "yes", cat_all = "yes", dir = path,
   file.name.cat = "reappraised_examples.xlsx", sheet.name.cat = "SI_cat_all",
   range.name.cat = "A:N", format.cat = "wide")$cat_all_data


[Package reappraised version 0.1.1 Index]