anova_fn {reappraised} | R Documentation |
Compares differences between baseline means using Carlisle's Monte Carlo ANOVA method
Description
Creates plots of the distribution of p-values for differences in baseline means, calculated using Carlisle's Monte Carlo ANOVA method.
Usage
anova_fn(
df = anova_data,
method = "alt",
seed = 0,
sims = -1,
btsp = 500,
title = "",
verbose = TRUE
)
Arguments
df |
data frame generated by the load_clean() function |
method |
"orig" is adapted from the original code; "alt" avoids using loops in the code (see Details) |
seed |
the seed to use for random number generation; the default, 0, uses the current date and time. Specify a seed to make the results repeatable. |
sims |
number of simulations; the default, -1, lets the function select a value based on the number of variables and the sample size |
btsp |
number of bootstrap repeats used to generate the 95% confidence interval around the AUC |
title |
optional title for plots |
verbose |
TRUE or FALSE; indicates whether the progress bar and comments are shown and whether the plot is printed |
Details
Method is from Carlisle JB, Loadsman JA. Evidence for non-random sampling in randomised, controlled trials by Yuhji Saitoh. Anaesthesia. 2017;72:17-27.
The R code is in the appendix to the paper. This function is adapted from that code.
The function has two methods. The published code selects each variable from each study, then generates
simulations for that variable using a row-wise approach with several loops. The adaptation of that code is method = "orig".
method = "alt" generates all the simulations at once. I initially thought this was considerably faster, but in practice the time savings are small.
The results from the two approaches will not be identical, even if the same random number seed is used,
because they use the generated random numbers in different orders. However, the p-values generated differ by less than 0.1. Usually the
differences are close to 0.01 (although this depends on the number of simulations: more simulations = smaller differences).
The code that generates the p-value for each variable from the simulated means is essentially the same.
Returns a list containing 3 objects and (if verbose = TRUE) prints the plot anova_ecdf
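As a rough illustration of why the two approaches consume the same random numbers in different orders, the sketch below simulates group means for a single variable one simulation at a time and then all at once. It is a simplified stand-in, not the package's code or the full published method: the summary statistics, the normal approximation for the simulated means, and the absolute-difference test statistic are all assumptions made for the example.

```r
# Hypothetical summary statistics for one baseline variable in a two-arm trial
m    <- c(50.2, 51.0)  # reported group means (assumed values)
s    <- c(10, 10)      # reported group SDs (assumed values)
n    <- c(40, 40)      # group sizes (assumed values)
sims <- 5000           # number of simulations

se       <- s / sqrt(n)            # standard error of each group mean
pooled   <- sum(m * n) / sum(n)    # common mean under the null hypothesis
obs_diff <- abs(m[1] - m[2])       # observed difference between group means

# Row-wise approach: draw both group means for one simulation, then move on
set.seed(10)
diff_loop <- numeric(sims)
for (i in seq_len(sims)) {
  sim_means    <- rnorm(2, mean = pooled, sd = se)
  diff_loop[i] <- abs(sim_means[1] - sim_means[2])
}
p_loop <- mean(diff_loop >= obs_diff)

# All-at-once approach: draw every simulated mean for group 1, then group 2
set.seed(10)
sim_mat  <- matrix(rnorm(2 * sims, mean = pooled, sd = rep(se, each = sims)),
                   ncol = 2)
diff_vec <- abs(sim_mat[, 1] - sim_mat[, 2])
p_vec    <- mean(diff_vec >= obs_diff)

# Same seed, same stream of random numbers, but paired up differently,
# so the simulated differences and the resulting p-values are not identical
c(loop = p_loop, all_at_once = p_vec, difference = abs(p_loop - p_vec))
```

Increasing sims in this sketch should shrink the gap between the two p-values, which mirrors the behaviour described above.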
Value
list containing 3 objects as described
anova_ecdf = plot of cumulative distribution of calculated p-values compared to the expected uniform distribution
anova_pvalues = plots of distribution of calculated p-values and AUC, as for pval_cont_fn()
anova_all_results = list containing
anova_data = data frame of baseline data, with calculated p-values
anova_pvals = plot of distribution of calculated p-values from anova_pvalues
anova_auc = plot of AUC of calculated p-values from anova_pvalues
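For readers unfamiliar with this kind of plot, the following base R sketch shows one way to compare the empirical cumulative distribution of a set of p-values with the uniform distribution. The p-values here are hypothetical random draws, and the plot is not the package's own anova_ecdf output.

```r
# Hypothetical p-values; under the null hypothesis they should be roughly uniform
set.seed(1)
p <- runif(200)

# Empirical cumulative distribution function of the p-values
Fn <- ecdf(p)

# Plot the ECDF with the expected uniform distribution as a dashed diagonal
plot(Fn, main = "Cumulative distribution of p-values",
     xlab = "p-value", ylab = "Cumulative proportion")
abline(0, 1, lty = 2)

# Largest vertical gap between the observed ECDF and the uniform line
max_gap <- max(abs(Fn(p) - p))
max_gap
```

A curve that stays close to the diagonal is consistent with uniformly distributed p-values; large departures are what the plot is designed to reveal.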
Examples
# load example data
anova_data <- load_clean(import = "no", file.cont = "SI_pvals_cont", anova = "yes",
  format.cont = "wide")$anova_data
# run function (takes only a few seconds)
anova_fn(seed = 10, sims = 100, btsp = 100)$anova_ecdf
# to import an excel spreadsheet (modify using local path,
# file and sheet name, range, and format):
# get path for example files
path <- system.file("extdata", "reappraised_examples.xlsx", package = "reappraised",
  mustWork = TRUE)
# delete file name from path
path <- sub("/[^/]+$", "", path)
# load data
anova_data <- load_clean(import = "yes", anova = "yes", dir = path,
  file.name.cont = "reappraised_examples.xlsx", sheet.name.cont = "SI_pvals_cont",
  range.name.cont = "A:O", format.cont = "wide")$anova_data