R: Difference of Two Means and Area Under the Curve

t_neat {neatStats}

R Documentation

Difference of Two Means and Area Under the Curve

Description

Welch's t-test results, including Cohen's d with confidence interval (CI), Bayes factor (BF), and area under the receiver operating characteristic curve (AUC). For the nonparametric version, Wilcoxon test results are given instead: the Mann–Whitney U test (also known as the "Wilcoxon rank-sum test") for independent samples and the Wilcoxon signed-rank test for paired samples, including the nonparametric "location difference estimate" (see stats::wilcox.test) and the corresponding rank-based BFs as per van Doorn et al. (2020).
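As a rough illustration of the nonparametric tests named above, the following base R sketch calls stats::wilcox.test directly on the vectors from the Examples section. It only shows where the "location difference estimate" comes from; it is not the internal code of t_neat, and the choice of `exact = FALSE` (to avoid warnings about ties) is an assumption made for this example.

```r
# Example data (same vectors as in the Examples section)
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Independent samples: Mann-Whitney U (Wilcoxon rank-sum) test
# conf.int = TRUE is needed to obtain the location estimate
rank_sum = wilcox.test(v1, v2, conf.int = TRUE, exact = FALSE)

# Paired samples: Wilcoxon signed-rank test
signed_rank = wilcox.test(v1, v2, paired = TRUE, conf.int = TRUE, exact = FALSE)

rank_sum$estimate     # nonparametric location difference estimate
signed_rank$estimate  # location estimate of the pairwise differences
```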

Usage

t_neat(
  var1,
  var2,
  pair = FALSE,
  nonparametric = FALSE,
  greater = NULL,
  norm_tests = "latent",
  norm_plots = FALSE,
  ci = NULL,
  bf_added = FALSE,
  bf_rscale = sqrt(0.5),
  bf_sample = 1000,
  auc_added = FALSE,
  cutoff = NULL,
  r_added = TRUE,
  for_table = FALSE,
  test_title = NULL,
  round_descr = 2,
  round_auc = 3,
  auc_greater = "1",
  cv_rep = FALSE,
  cv_fold = 10,
  hush = FALSE,
  plots = FALSE,
  rug_size = 4,
  aspect_ratio = 1,
  y_label = "density estimate",
  x_label = "\nvalues",
  factor_name = NULL,
  var_names = c("1", "2"),
  reverse = FALSE
)

Arguments

`var1`	Numeric vector; numbers of the first variable.
`var2`	Numeric vector; numbers of the second variable.
`pair`	Logical. If `TRUE`, all tests (t, BF, AUC) are conducted for paired samples. If `FALSE` (default) for independent samples.
`nonparametric`	Logical (`FALSE` by default). If `TRUE`, uses the nonparametric (rank-based, "Wilcoxon") tests instead of t-tests (including rank-based BFs; see Notes).
`greater`	`NULL` or string (or number); optionally specifies one-sided tests (t and BF): either "1" (`var1` mean expected to be greater than `var2` mean) or "2" (`var2` mean expected to be greater than `var1` mean). If `NULL` (default), the test is two-sided.
`norm_tests`	Normality tests. Any or all of the following character input is accepted (as a single string or a character vector; case-insensitive): `"W"` (Shapiro-Wilk), `"K2"` (D'Agostino), `"A2"` (Anderson-Darling), `"JB"` (Jarque-Bera); see Notes. Two other options are `"all"` (same as `TRUE`; to choose all four previous tests at the same time) or `"latent"` (default value; prints all tests only if `nonparametric` is set to `FALSE` and any of the four tests gives a p value below .05). Each normality test is performed for the difference values between the two variables in case of paired samples, or for each of the two variables for unpaired samples. Set to `"none"` to disable (i.e., not to perform any normality tests).
`norm_plots`	If `TRUE`, displays density, histogram, and Q-Q plots (and scatter plots for paired tests) for each of the two variables (and for the differences between pairwise observations, in case of paired samples).
`ci`	Numeric; confidence level for returned CIs for Cohen's d and AUC.
`bf_added`	Logical (`FALSE` by default). If `TRUE`, Bayes factor is calculated and displayed.
`bf_rscale`	The scale of the prior distribution (`0.707` by default).
`bf_sample`	Number of samples used to estimate Bayes factor (`1000` by default). More samples (e.g. `10000`) take longer but give a more stable BF.
`auc_added`	Logical (`FALSE` by default). If `TRUE`, AUC is calculated and displayed, along with the true positive rate (TPR, i.e., sensitivity) and the true negative rate (TNR, i.e., specificity) at an optimal threshold value that provides maximal TPR and TNR. These values may be cross-validated: see `cv_rep`. (Note that what is designated as "positive" or "negative" depends on the scenario: this function always assumes `var1` as positive and `var2` as negative. If your scenario or preference differs, you can simply switch the names or values when reporting the results.)
`cutoff`	Numeric. Custom cutoff value for calculating TPR and TNR, also depicted in the plot. If multiple values are given, the first is used for calculations, but all are depicted in the plot.
`r_added`	Logical. If `TRUE` (default), Pearson correlation is calculated and displayed in case of paired comparison.
`for_table`	Logical. If `TRUE`, omits the confidence level display from the printed text.
`test_title`	`NULL` or string. If not `NULL`, simply displayed in printing preceding the statistics. (Useful e.g. to distinguish several different comparisons inside a `function` or a `for` loop.)
`round_descr`	Number of digits to round the descriptive statistics (means and SDs) to.
`round_auc`	Number of digits to round the AUC and its CI to.
`auc_greater`	String (or number); specifies which variable is expected to have greater values for 'cases' as opposed to 'controls': "1" (default; `var1` expected to be greater for 'cases' than `var2`) or "2" (`var2` expected to be greater for 'cases' than `var1`). Not to be confused with one-sided tests; see Details.
`cv_rep`	`FALSE` (default), `TRUE`, or numeric. If `TRUE` or numeric, a cross-validation is performed for the calculation of TPRs and TNRs. A numeric value specifies the number of repetitions; `TRUE` defaults to `100` repetitions. In each repetition, the data is divided into `k` random parts ("folds"; see `cv_fold`). The optimal threshold is then obtained k times, each time from a training set of k-1 folds (`var1` and `var2` truncated to equal length, if needed, in each case within each repetition), and the TPR and TNR are calculated from the remaining test fold (a different one each time).
`cv_fold`	Numeric. The number of folds into which the data is divided for cross-validation (default: 10).
`hush`	Logical. If `TRUE`, prevents printing any details to console.
`plots`	Logical (or `NULL`). If `TRUE`, creates a combined density plot (i.e., `Gaussian kernel density estimates`) from the two variables. Includes dashed vertical lines to indicate the means of each of the two variables. If `nonparametric` is set to `TRUE`, medians are used for these dashed lines instead of means. When `auc_added` is `TRUE` (and the AUC is at least .5), the best threshold value for classification (maximal differentiation accuracy using Youden's index) is added to the plot as a solid vertical line. (In case of multiple best thresholds with identical overall accuracy, all are added.) If `NULL`, same as if `TRUE` except that a histogram is added to the background.
`rug_size`	Numeric (`4` by default): size of the rug ticks below the density plot. Set to `0` (zero) to omit rug plotting.
`aspect_ratio`	Aspect ratio of the plots: `1` (`1`/`1`) by default. (Set to `NULL` for dynamic aspect ratio.)
`y_label`	String or `NULL`; the label for the `y` axis. (Default: `"density estimate"`.)
`x_label`	String or `NULL`; the label for the `x` axis. (Default: `"\nvalues"`.)
`factor_name`	String or `NULL`; factor legend title. (Default: `NULL`.)
`var_names`	A vector of two strings; the variable names to be displayed in the legend. (Default: `c("1", "2")`.)
`reverse`	Logical. If `TRUE`, reverses the order of variable names displayed in the legend.
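
The cross-validation controlled by `cv_rep` and `cv_fold` can be sketched in base R roughly as follows. This is a simplified illustration under stated assumptions, not the package's implementation: the helper names `best_threshold` and `cv_once` are hypothetical, the data are simulated, the threshold is chosen by maximizing Youden's index over midpoints between observed values, and the truncation of `var1` and `var2` to equal length is skipped because the simulated groups already have equal size.

```r
set.seed(1)
cases = rnorm(40, mean = 1)     # plays the role of var1 ("positive")
controls = rnorm(40, mean = 0)  # plays the role of var2 ("negative")

# Hypothetical helper: threshold that maximizes Youden's index
best_threshold = function(cases, controls) {
  vals = sort(unique(c(cases, controls)))
  thresholds = (head(vals, -1) + tail(vals, -1)) / 2  # midpoints
  youden = sapply(thresholds, function(th) {
    mean(cases > th) + mean(controls <= th) - 1
  })
  thresholds[which.max(youden)]
}

# Hypothetical helper: one repetition of k-fold cross-validation
cv_once = function(cases, controls, k = 10) {
  # random assignment of each observation to one of k folds
  fold_cases = sample(rep(seq_len(k), length.out = length(cases)))
  fold_controls = sample(rep(seq_len(k), length.out = length(controls)))
  rates = sapply(seq_len(k), function(i) {
    # threshold from the k-1 training folds
    th = best_threshold(cases[fold_cases != i], controls[fold_controls != i])
    # TPR and TNR from the held-out test fold
    c(TPR = mean(cases[fold_cases == i] > th),
      TNR = mean(controls[fold_controls == i] <= th))
  })
  rowMeans(rates)  # mean TPR and TNR for this repetition
}

# 100 repetitions (the number used when cv_rep = TRUE), 10 folds each
cv_rates = replicate(100, cv_once(cases, controls, k = 10))
rowMeans(cv_rates)  # cross-validated TPR and TNR
```

Because the threshold is always evaluated on data that were not used to choose it, the resulting rates are typically somewhat lower than those obtained from the full sample.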

Details

The Bayes factor (BF) supporting the null hypothesis is denoted as BF01, while that supporting the alternative hypothesis is denoted as BF10. When the BF is smaller than 1 (i.e., supports the null hypothesis), the reciprocal is calculated (hence, BF10 = BF, but BF01 = 1/BF). When the BF is greater than or equal to 10000, scientific (exponential) form is reported for readability. (The original full BF number is available in the returned named vector as bf.)
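
The reporting rule above can be sketched as a small base R function. The helper `bf_label` is hypothetical (it is not part of neatStats), and the exact number formatting, as well as applying the 10000 rule to the displayed value, are assumptions made for illustration.

```r
# Hypothetical helper: label a raw BF (in favor of the alternative)
bf_label = function(bf10) {
  if (bf10 < 1) {
    value = 1 / bf10  # reciprocal: evidence for the null hypothesis
    label = "BF01"
  } else {
    value = bf10
    label = "BF10"
  }
  # scientific (exponential) form for values of 10000 or more
  shown = if (value >= 10000) {
    format(value, scientific = TRUE, digits = 3)
  } else {
    format(round(value, 2), nsmall = 2)
  }
  paste(label, "=", shown)
}

bf_label(0.25)    # "BF01 = 4.00"
bf_label(4)       # "BF10 = 4.00"
bf_label(123456)  # "BF10 = 1.23e+05"
```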

For simplicity, Cohen's d is reported for nonparametric tests too: you may however want to consider reporting alternative effect sizes in this case.

The original pROC::auc function, by default, always returns an AUC greater than (or equal to) .5, assuming that the prediction based on values in the expected direction works correctly at least at chance level. This, however, may be confusing. Consider an example where we measure the heights of persons in a specific small sample and expect that greater height predicts masculine gender. The results are, say, 169, 175, 167, 164 (cm) for one gender, and 176, 182, 179, 165 for the other. If the expectation is correct (the second, greater values are for males), the AUC is .812. However, if in this particular population females are actually taller than males, the AUC is in fact .188. To keep things clear, the t_neat function always makes an assumption about which variable is expected to be greater for correct classification ("1" by default, i.e., var1; to be specified as auc_greater = "2" for var2 to be expected as greater). For this example, if the first (smaller) values are given as var1 for females, and the second (larger) values are given as var2 for males, we have to specify auc_greater = "2" to indicate the expectation of larger values for males. (Or, easier, just add the expected larger values as var1.)
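
The two AUC values in the height example can be checked by hand in base R, using the interpretation of the AUC as the proportion of case-control pairs in which the "case" value is the larger one (ties counted as half). The helper `auc_manual` is a hypothetical name introduced only for this sketch; it does not use pROC.

```r
heights_a = c(169, 175, 167, 164)  # first gender in the example
heights_b = c(176, 182, 179, 165)  # second gender in the example

# Hypothetical helper: AUC as the proportion of (case, control) pairs
# where the case value exceeds the control value
auc_manual = function(cases, controls) {
  diffs = outer(cases, controls, "-")
  mean((diffs > 0) + 0.5 * (diffs == 0))
}

auc_manual(heights_b, heights_a)  # 0.8125: larger values expected for group b
auc_manual(heights_a, heights_b)  # 0.1875: larger values expected for group a
```

The two values always sum to 1, which is why the expected direction has to be fixed in advance for the AUC to be interpretable.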

Value

Prints t-test statistics (including Cohen's d with CI, BF, and AUC, as specified via the corresponding parameters) in APA style. Furthermore, when assigned, returns a list that contains a named vector 'stats' with the following elements: t (t value), p (p value), d (Cohen's d), bf (Bayes factor), auc (AUC), accuracy (overall accuracy using the optimal classification threshold), and youden (Youden's index: specificity + sensitivity - 1). The latter three are NULL when auc_added is FALSE. When auc_added is TRUE, there are also two or three additional elements of the list. One is 'roc_obj', which is a roc object, to be used e.g. with the roc_neat function. Another is 'best_thresholds', which contains the best threshold value(s) for classification, along with the corresponding specificity and sensitivity. The third, 'cv_results', contains the results, if any, of the cross-validation of TPRs and TNRs (means per repetition). Finally, if plots is TRUE (or NULL), the plot is displayed as well as returned as a ggplot object, named t_plot.

Note

Welch's t-test is calculated via stats::t.test.
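
For orientation, this is roughly what the underlying base R calls look like for the vectors from the Examples section. How t_neat maps its own arguments (such as `pair` and `greater`) onto the arguments of stats::t.test is an assumption here, not something taken from the package source.

```r
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Independent samples: var.equal = FALSE is the default, which gives Welch's test
welch = t.test(v1, v2)

# Paired samples (presumably what pair = TRUE corresponds to)
paired = t.test(v1, v2, paired = TRUE)

# One-sided test expecting the first mean to be greater
# (presumably what greater = "1" corresponds to)
one_sided = t.test(v1, v2, alternative = "greater")

welch$statistic  # t value
welch$parameter  # Welch-corrected degrees of freedom
welch$p.value    # two-sided p value
```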

Normality tests are all calculated via fBasics::NormalityTests, selected based on the recommendation of Lakens (2015), quoting Yap and Sim (2011, p. 2153): "If the distribution is symmetric with low kurtosis values (i.e. symmetric short-tailed distribution), then the D'Agostino and Shapiro-Wilkes tests have good power. For symmetric distribution with high sample kurtosis (symmetric long-tailed), the researcher can use the JB, Shapiro-Wilkes, or Anderson-Darling test." See https://github.com/Lakens/perfect-t-test for more details.
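
The difference between paired and unpaired normality checks (differences versus each variable separately) can be illustrated with the Shapiro-Wilk test, the only one of the four tests available in base R. This sketch uses stats::shapiro.test rather than the fBasics functions named above, so it shows the logic only.

```r
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Unpaired samples: each variable is tested separately
p_v1 = shapiro.test(v1)$p.value
p_v2 = shapiro.test(v2)$p.value

# Paired samples: the pairwise differences are tested
p_diff = shapiro.test(v1 - v2)$p.value

# A p value below .05 suggests a deviation from normality
c(var1 = p_v1, var2 = p_v2, difference = p_diff)
```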

Cohen's d and its confidence interval are calculated, using the t value, via MBESS::ci.smd for independent samples (as standardized mean difference) and via MBESS::ci.sm for paired samples (as standardized mean).
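
The point estimates can be reproduced in base R with the standard conversions from a t value to a standardized effect size. This is a sketch of those textbook formulas only; whether t_neat and MBESS apply exactly these steps internally is an assumption, and the confidence intervals (which rely on noncentral t distributions) are not reproduced here.

```r
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)
n1 = length(v1)
n2 = length(v2)

# Independent samples: standardized mean difference from the t value
t_ind = unname(t.test(v1, v2)$statistic)
d_ind = t_ind * sqrt(1 / n1 + 1 / n2)

# Paired samples: standardized mean (of the differences) from the t value
t_pair = unname(t.test(v1, v2, paired = TRUE)$statistic)
d_pair = t_pair / sqrt(n1)

c(independent = d_ind, paired = d_pair)
```

For paired samples this equals the mean of the differences divided by their standard deviation. For independent samples of equal size it equals the mean difference divided by the square root of the average of the two variances.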

The parametric Bayes factor is calculated via BayesFactor::ttestBF. The nonparametric (rank-based) Bayes factor is a contribution by Johnny van Doorn; the original source code is available via https://osf.io/gny35/.

The correlation and its CI are calculated via stats::cor.test, and are always two-sided, always with a 95 percent CI. For more, use corr_neat.
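
A minimal base R illustration of that call, using the vectors from the Examples section (the defaults of stats::cor.test already give a two-sided Pearson test with a 95 percent CI):

```r
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

# Pearson correlation between the paired observations
r_test = cor.test(v1, v2)

r_test$estimate  # correlation coefficient r
r_test$conf.int  # 95 percent confidence interval
r_test$p.value   # two-sided p value
```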

The AUC and its CI are calculated via pROC::auc, and the accuracy at optimal threshold via pROC::coords (x = "best"); both using the object pROC::roc.
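
To make the notion of an optimal threshold concrete, the following base R sketch finds the threshold that maximizes Youden's index (sensitivity + specificity - 1) for the height data from the Details section. It does not call pROC; taking midpoints between neighboring observed values as candidate thresholds is an assumption made for this illustration.

```r
cases = c(176, 182, 179, 165)     # values expected to be larger
controls = c(169, 175, 167, 164)  # values expected to be smaller

# Candidate thresholds: midpoints between neighboring observed values
vals = sort(unique(c(cases, controls)))
thresholds = (head(vals, -1) + tail(vals, -1)) / 2

# Sensitivity (TPR) and specificity (TNR) at each candidate threshold
sens = sapply(thresholds, function(th) mean(cases > th))
spec = sapply(thresholds, function(th) mean(controls <= th))
youden = sens + spec - 1

# Best threshold(s): all candidates with maximal Youden's index
best = which(youden == max(youden))
data.frame(
  threshold = thresholds[best],
  sensitivity = sens[best],
  specificity = spec[best],
  youden = youden[best]
)
# threshold 175.5, sensitivity 0.75, specificity 1, youden 0.75
```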

References

Delacre, M., Lakens, D., & Leys, C. (2017). Why psychologists should by default use Welch's t-test instead of Student's t-test. International Review of Social Psychology, 30(1). doi:10.5334/irsp.82

Kelley, K. (2007). Methods for the behavioral, educational, and social sciences: An R package. Behavior Research Methods, 39(4), 979-984. doi:10.3758/BF03192993

Lakens, D. (2015). The perfect t-test (version 1.0.0). Retrieved from https://github.com/Lakens/perfect-t-test. doi:10.5281/zenodo.17603

Robin, X., Turck, N., Hainard, A., Tiberti, N., Lisacek, F., Sanchez, J. C., & Muller, M. (2011). pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC bioinformatics, 12(1), 77. doi:10.1186/1471-2105-12-77

van Doorn, J., Ly, A., Marsman, M., & Wagenmakers, E.-J. (2020). Bayesian rank-based hypothesis testing for the rank sum test, the signed rank test, and Spearman’s rho. Journal of Applied Statistics, 1–23. doi:10.1080/02664763.2019.1709053

Yap, B. W., & Sim, C. H. (2011). Comparisons of various types of normality tests. Journal of Statistical Computation and Simulation, 81(12), 2141–2155. doi:10.1080/00949655.2010.520163

Examples

# assign two variables (numeric vectors)
v1 = c(191, 115, 129, 43, 523, -4, 34, 28, 33, -1, 54)
v2 = c(4, -2, 23, 13, 32, 16, 3, 29, 37, -4, 65)

t_neat(v1, v2) # prints results as independent samples
t_neat(v1, v2, pair = TRUE) # as paired samples (r added by default)
t_neat(v1, v2, pair = TRUE, greater = "1") # one-sided
t_neat(v1, v2, pair = TRUE, auc_added = TRUE) # AUC included

# print results and assign returned list
# (bf_added = TRUE is needed here, since the BF is not calculated by default)
results = t_neat(v1, v2, pair = TRUE, bf_added = TRUE)

results$stats['bf'] # get precise BF value

[Package neatStats version 1.13.3 Index]