pval_cat_fn {reappraised} | R Documentation |
Compares observed and expected distribution of p-values for categorical variables
Description
Creates plots of the calculated p-value distribution and its AUC (area under the curve)
Usage
pval_cat_fn(
df = pval_cat_data,
seed = 0,
sims = -1,
btsp = 500,
title = "",
stat = "chi_midp",
stat.override = "no",
fisher.sim = "y",
fish.n.sims = 10000,
method = "mix",
verbose = TRUE
)
Arguments
df |
data frame generated from load_clean function |
seed |
seed for random number generation. The default, 0, uses the current date and time; specify a seed to make results repeatable. |
sims |
number of simulations. The default, -1, lets the function choose based on the number of variables. |
btsp |
number of bootstrap repeats used to generate the 95% confidence interval around the AUC |
title |
optional title for plots |
stat |
statistical test to use: 'chisq', 'fisher', 'midp' or 'midp.epitools' (from the epitools package), or 'midp.sas' (as calculated in SAS). Combinations apply a second test when chisq is not appropriate because expected cell counts are < 5: 'chi_fish', 'chi_midp' or 'chi_midp.epi', or 'chi_midp.sas' |
stat.override |
if 'yes', the test specified in stat is used instead of the stat values in the data frame |
fisher.sim |
"yes" or "no": whether the Fisher test may simulate p-values for tables larger than 2x2 (the default shown in Usage is "y") |
fish.n.sims |
number of simulations to use in Fisher test, default 10,000 |
method |
'sm', 'mix', or 'ind'. 'ind' runs the test on individual data; 'sm' summarises the data and then runs the test on the summary; 'mix' uses 'ind' for fisher and 'sm' for other tests. Run time varies with study size, test, and number of simulations, so experiment with small settings before running large simulations. |
verbose |
TRUE or FALSE: whether to show the progress bar and comments and print the plot |
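The combined options for stat (for example 'chi_fish') describe a fallback: use the chi-square test unless expected cell counts are below 5, then switch to a second test. A minimal base R sketch of that idea follows. The helper chisq_or_fisher is hypothetical and is not part of reappraised; the choice of correct = FALSE and the rule for when to simulate are assumptions made for illustration, and the package's own logic may differ.

```r
# Hypothetical helper, not part of reappraised: chi-square test with a
# Fisher fallback when any expected cell count is below 5.
chisq_or_fisher <- function(tab, n_sims = 10000) {
  # chisq.test warns when expected counts are small; silence that here
  # because the fallback below handles the case explicitly.
  # correct = FALSE (no continuity correction) is an assumption.
  chi <- suppressWarnings(stats::chisq.test(tab, correct = FALSE))
  if (all(chi$expected >= 5)) {
    return(list(test = "chisq", p = chi$p.value))
  }
  # For tables larger than 2x2, estimate the p-value by Monte Carlo
  # simulation (compare fisher.sim and fish.n.sims above).
  simulate <- any(dim(tab) > 2)
  fis <- stats::fisher.test(tab, simulate.p.value = simulate, B = n_sims)
  list(test = "fisher", p = fis$p.value)
}

set.seed(10)
large <- matrix(c(30, 25, 28, 33), nrow = 2)  # all expected counts >= 5
small <- matrix(c(3, 1, 2, 6), nrow = 2)      # some expected counts < 5
res_large <- chisq_or_fisher(large)
res_small <- chisq_or_fisher(small)
```

Here res_large$test is "chisq" and res_small$test is "fisher", which mirrors the switch that the combined stat options describe.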
Details
See also Bolland MJ, Gamble GD, Avenell A, Grey A, Lumley T. Baseline P value distributions in randomized trials were uniform for continuous but not categorical variables. J Clin Epidemiol 2019;112:67-76.
Returns a list containing 3 objects and (if verbose = TRUE) prints the plot pval_cat_calculated_pvalues
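To illustrate what an AUC with a bootstrap confidence interval can look like, the sketch below takes the area under the empirical cumulative distribution of p-values, which for values in [0, 1] equals 1 - mean(p) and is 0.5 for uniformly distributed p-values. This definition of AUC is an assumption made for illustration; the function's actual calculation may differ. The helper ecdf_auc and the simulated p-values are hypothetical.

```r
# Hypothetical helper: area under the empirical CDF of p-values on [0, 1].
# For values in [0, 1] this area equals 1 - mean(p); uniform p-values
# give an expected area of 0.5.
ecdf_auc <- function(p) 1 - mean(p)

set.seed(10)
p <- runif(200)   # stand-in for calculated baseline p-values
auc <- ecdf_auc(p)

# Percentile bootstrap: resample the p-values with replacement btsp times
# and take the 2.5% and 97.5% quantiles of the resampled AUCs.
btsp <- 500
boot_auc <- replicate(btsp, ecdf_auc(sample(p, replace = TRUE)))
ci <- stats::quantile(boot_auc, c(0.025, 0.975))
```

An interval that excludes 0.5 would suggest the p-values depart from the uniform distribution under this assumed definition.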
Value
list containing the 3 objects described below
pval_cat_calculated_pvalues = plots of calculated p-value distribution and AUC
pval_cat_reported_pvalues = plots of reported p-value distribution and AUC (if p-values were reported)
all_results = list containing:
  pval_cat_baseline_pvalues_data = data frame of all results used in calculations
  pval_cat_reported_pvalues = plot of reported p-value distribution
  pval_cat_auc_reported_pvalues = AUC of reported p-values
  pval_cat_calculated_pvalues = plot of calculated p-value distribution
  pval_cat_auc_calculated_pvalues = AUC of calculated p-values
Examples
# load example data
pval_cat_data <- load_clean(import = "no", file.cat = "SI_cat_all", pval_cat = "yes",
                            format.cont = "wide")$pval_cat_data
# run function (takes a few seconds)
pval_cat_fn(seed = 10, sims = 50, btsp = 100)$pval_cat_calculated_pvalues
# to import an Excel spreadsheet (modify using local path,
# file and sheet name, range, and format):
# get path for example files
path <- system.file("extdata", "reappraised_examples.xlsx", package = "reappraised",
                    mustWork = TRUE)
# keep only the directory (drop the file name from the path)
path <- dirname(path)
# load data
pval_cat_data <- load_clean(import = "yes", pval_cat = "yes", dir = path,
                            file.name.cat = "reappraised_examples.xlsx", sheet.name.cat = "SI_cat_all",
                            range.name.cat = "A:n", format.cat = "wide")$pval_cat_data