R: Repeated-Measures Test (Wilcoxon Signed-Ranks Test)

dfba_wilcoxon {DFBA}

R Documentation

Repeated-Measures Test (Wilcoxon Signed-Ranks Test)

Description

Given two continuous, paired variates Y1 and Y2, computes the sample T_pos and T_neg statistics for the Wilcoxon signed-rank test and provides a Bayesian analysis for the population sign-bias parameter phi_w, which is the population proportion of positive differences.

Usage

dfba_wilcoxon(
  Y1,
  Y2,
  a0 = 1,
  b0 = 1,
  prob_interval = 0.95,
  samples = 30000,
  method = NULL,
  hide_progress = FALSE
)

Arguments

`Y1`	Numeric vector for one continuous variate
`Y2`	Numeric vector for values paired with the `Y1` variate
`a0`	The first shape parameter for the prior beta distribution for `phi_w`. Must be positive and finite.
`b0`	The second shape parameter for the prior beta distribution for `phi_w`. Must be positive and finite.
`prob_interval`	Desired probability for interval estimates of the sign bias parameter `phi_w` (default is 0.95)
`samples`	When `method = "small"`, the number of desired Monte Carlo samples per candidate value for `phi_w` (default is 30000 per candidate phi)
`method`	(Optional) The method option is either `"small"` or `"large"`. The `"small"` algorithm is based on a discrete Monte Carlo solution for cases where n is typically less than 20. The `"large"` algorithm is based on a beta approximation model for the posterior distribution for the `phi_w` parameter. This approximation is reasonable when n > 19. Regardless of n, the user can stipulate either method. When the `method` argument is omitted, the program selects the appropriate procedure.
`hide_progress`	(Optional) If `TRUE`, hide percent progress while Monte Carlo sampling is running when `method = "small"` (default is `FALSE`).

Details

The Wilcoxon signed-rank test is the frequentist nonparametric counterpart to the paired t-test. The procedure is based on the rank of the difference scores d = Y1 - Y2. The ranking is initially done on the absolute value of the nonzero d values, and each rank is then multiplied by the sign of the difference. Differences equal to zero are dropped. Since the procedure is based on only ranks of the differences, it is robust with respect to outliers in either the Y1 or Y2 measures. The procedure does not depend on the assumption of a normal distribution for the two continuous variates.

The sample T_pos statistic is the sum of the ranks that have a positive sign, whereas T_neg is the positive sum of the ranks that have a negative value. Given n nonzero d scores, T_pos + T_neg = n(n + 1)/2. Tied ranks are possible, especially when there are Y1 and Y2 values that have low precision. In such cases, the Wilcoxon statistics are rounded to the nearest integer.
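The computation of the two sample statistics can be sketched in base R. This is a minimal illustration, not the package's own code: the data are the small-sample vectors from the Examples section, the variable names are chosen here for illustration, and the rounding step is an assumption about how tied (averaged) ranks are handled.

```r
# Paired data taken from the small-sample example in the Examples section
Y1 <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67,
        1.19, 0.86)
Y2 <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33,
        0.91, 0.37)

d <- Y1 - Y2
d <- d[d != 0]          # differences equal to zero are dropped
n <- length(d)          # number of nonzero differences

r <- rank(abs(d))       # ranks of |d|; tied values receive averaged ranks

# Rounding to the nearest integer is an assumption about tie handling
T_pos <- round(sum(r[d > 0]))   # sum of the ranks with a positive sign
T_neg <- round(sum(r[d < 0]))   # positive sum of the ranks with a negative sign
```

Because only the rank of each difference enters the sums, the extreme value 13.14 in `Y2` contributes no more than the largest rank, n, to T_neg.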

The Bayesian analysis is based on a parameter phi_w, which is the population proportion of positive d scores. The default prior for phi_w is a flat beta distribution with shape parameters a0 = b0 = 1, but the user can stipulate a preferred beta prior by assigning values for a0 and b0. The prob_interval input, which has a default value of .95, is the probability used for interval estimates of the phi_w parameter; the user can alter this value.

The Bayesian analysis has two cases: one for a small number of pairs and one for a large number of pairs. The method = "small" algorithm uses a discrete approximation with 200 candidate values for phi_w, running from .0025 to .9975 in steps of .005. Each candidate value for phi_w has a prior and a posterior probability. The posterior probability is based on Monte Carlo sampling to approximate the likelihood of obtaining the observed Wilcoxon statistics. That is, for each candidate value for phi_w, thousands of Monte Carlo samples are generated for the signs on the numbers (1, 2, ..., n), where each number is multiplied by its sign. The proportion of the samples that result in the observed Wilcoxon statistics is an estimate of the likelihood value for that candidate phi_w. The likelihood values, combined with the prior, yield a discrete posterior distribution for phi_w. The number of Monte Carlo samples per candidate phi_w is set by the samples argument, which defaults to 30000 but can be altered by the user.
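The discrete Monte Carlo idea can be sketched as follows. This is a rough illustration under stated assumptions, not the package's implementation: `mc_posterior()` is a hypothetical helper, the discrete prior is assumed to be the beta probability mass in an interval of width .005 around each candidate value, and the observed statistics passed in at the end are illustrative.

```r
# Hypothetical helper illustrating the discrete Monte Carlo posterior;
# it is NOT a function from the DFBA package.
mc_posterior <- function(T_pos_obs, n, a0 = 1, b0 = 1, samples = 2000) {
  # 200 candidate values: .0025, .0075, ..., .9975
  phiv <- (seq_len(200) - 0.5) * 0.005

  # Assumed discrete prior: beta mass in the interval around each candidate
  prior <- pbeta(phiv + 0.0025, a0, b0) - pbeta(phiv - 0.0025, a0, b0)

  ranks <- seq_len(n)
  lik <- vapply(phiv, function(phi) {
    # Each column holds one simulated set of n signs (TRUE = positive)
    signs <- matrix(runif(n * samples) < phi, nrow = n)
    T_sim <- colSums(signs * ranks)   # simulated T_pos for every sample
    mean(T_sim == T_pos_obs)          # proportion matching the observed value
  }, numeric(1))

  # Normalize prior * likelihood. If no simulated sample matches the
  # observed statistic, the sum is zero and more samples are needed.
  post <- prior * lik
  list(phiv = phiv, phipost = post / sum(post))
}

set.seed(1)
res <- mc_posterior(T_pos_obs = 65, n = 12)   # illustrative inputs
post_mean <- sum(res$phiv * res$phipost)      # discrete posterior mean
```

The sketch uses far fewer samples per candidate than the documented default of 30000 so that it runs quickly; fewer samples give a noisier likelihood estimate.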

Chechile (2018) empirically found that for large n there was a beta distribution that approximated the quantiles of the discrete, small sample approach. This approximation is reasonably accurate for n > 24, and is used when method = "large".

If the method argument is omitted, the function employs the method that is appropriate given the sample size. Note: the method = "small" algorithm is slower than the algorithm for method = "large"; for cases where n > 24, method = "small" and method = "large" will produce similar estimates but the former method requires increased processing time.

Value

A list containing the following components:

`T_pos`	Sum of the positive ranks in the pairwise comparisons
`T_neg`	Sum of the negative ranks in the pairwise comparisons
`n`	Number of nonzero differences `d = Y1 - Y2`
`prob_interval`	User-defined probability for interval estimates for phi_w
`samples`	The number of Monte Carlo samples per candidate phi_w for `method = "small"` (default is 30000)
`method`	A character string that is either `"small"` or `"large"` for the algorithm used (default is NULL)
`a0`	The first shape parameter for the beta prior distribution (default is 1)
`b0`	The second shape parameter for the beta distribution prior (default is 1)
`phiv`	The 200 candidate values for phi_w for `method = "small"`
`phipost`	The discrete posterior distribution for phi_w when `method = "small"`
`priorprH1`	The prior probability that phi_w > .5
`prH1`	The posterior probability for phi_w > .5
`BF10`	Bayes factor for the relative increase in the posterior odds for the alternative hypothesis that phi_w > .5 over the null model for phi_w <= .5
`post_mean`	The posterior mean for phi_w
`cumulative_phi`	The posterior cumulative distribution for phi_w when `method = "small"`
`hdi_lower`	The lower limit for the posterior highest-density interval estimate for phi_w
`hdi_upper`	The upper limit for the posterior highest-density interval estimate for phi_w
`a_post`	The first shape parameter for a beta distribution model for phi_w when `method = "large"`
`b_post`	The second shape parameter for a beta distribution model for phi_w when `method = "large"`
`post_median`	The posterior median for phi_w when `method = "large"`
`eti_lower`	The equal-tail lower limit for phi_w
`eti_upper`	The equal-tail upper limit for phi_w
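
For the beta model, several of the listed quantities follow directly from the shape parameters through base R's beta distribution functions. The sketch below is illustrative only: the values of `a_post` and `b_post` are hypothetical rather than output from the function, and the Bayes factor is computed as the ratio of posterior odds to prior odds, which is assumed here to match the description of `BF10`.

```r
a0 <- 1                # flat prior (the documented default)
b0 <- 1
a_post <- 10           # hypothetical posterior shape parameters
b_post <- 4
prob_interval <- 0.95

# Prior and posterior probability that phi_w > .5
priorprH1 <- 1 - pbeta(0.5, a0, b0)
prH1 <- 1 - pbeta(0.5, a_post, b_post)

# Bayes factor assumed to be posterior odds divided by prior odds
BF10 <- (prH1 / (1 - prH1)) / (priorprH1 / (1 - priorprH1))

# Point estimates from the beta model
post_mean <- a_post / (a_post + b_post)
post_median <- qbeta(0.5, a_post, b_post)

# Equal-tail interval: the same probability is cut from each tail
tail_prob <- (1 - prob_interval) / 2
eti_lower <- qbeta(tail_prob, a_post, b_post)
eti_upper <- qbeta(1 - tail_prob, a_post, b_post)
```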

References

Chechile, R.A. (2020). Bayesian Statistics for Experimental Scientists: A General Introduction to Distribution-Free Methods. Cambridge: MIT Press.

Chechile, R.A. (2018). A Bayesian analysis for the Wilcoxon signed-rank statistic. Communications in Statistics - Theory and Methods, https://doi.org/10.1080/03610926.2017.1388402

Examples


## Examples with a small number of pairs

conditionA <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67,
                1.19, 0.86)
conditionB <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33,
                0.91, 0.37)

dfba_wilcoxon(Y1 = conditionA,
              Y2 = conditionB,
              samples = 250,
              hide_progress = TRUE)

## Examples with a large number of pairs

E <- c(6.45, 5.65, 4.34, 5.92, 2.84, 13.06, 6.61, 5.47, 4.49, 6.39, 6.63,
       3.55, 3.76, 5.61, 7.45, 6.41, 10.16, 6.26, 8.46, 2.29, 3.16, 5.68,
       4.13, 2.94, 4.87, 4.44, 3.13, 8.87)

C <- c(2.89, 4.19, 3.22, 6.50, 3.10, 4.19, 5.13, 3.77, 2.71, 2.58, 7.59,
       2.68, 4.98, 2.35, 5.15, 8.46, 3.77, 8.83, 4.06, 2.50, 5.48, 2.80,
       8.89, 3.19, 9.36, 4.58, 2.94, 4.75)

BW <- dfba_wilcoxon(Y1 = E,
                    Y2 = C)
BW
plot(BW)

[Package DFBA version 0.1.0 Index]