dfba_wilcoxon {DFBA}    R Documentation

Repeated-Measures Test (Wilcoxon Signed-Ranks Test)

Description

Given two continuous, paired variates Y1 and Y2, computes the sample T_pos and T_neg statistics for the Wilcoxon signed-rank test and provides a Bayesian analysis for the population sign-bias parameter phi_w, which is the population proportion of positive differences.

Usage

dfba_wilcoxon(
  Y1,
  Y2,
  a0 = 1,
  b0 = 1,
  prob_interval = 0.95,
  samples = 30000,
  method = NULL,
  hide_progress = FALSE
)

Arguments

Y1

Numeric vector for one continuous variate

Y2

Numeric vector for values paired with Y1 variate

a0

The first shape parameter for the prior beta distribution for phi_w. Must be positive and finite.

b0

The second shape parameter for the prior beta distribution for phi_w. Must be positive and finite.

prob_interval

Desired probability for interval estimates of the sign bias parameter phi_w (default is 0.95)

samples

When method = "small", the number of desired Monte Carlo samples per candidate value for phi_w (default is 30000 per candidate phi)

method

(Optional) Either "small" or "large". The "small" algorithm is based on a discrete Monte Carlo solution intended for a small number of pairs. The "large" algorithm is based on a beta approximation model for the posterior distribution of the phi_w parameter; this approximation is reasonably accurate when n > 24. The user can stipulate either method regardless of n. When the method argument is omitted (the default, NULL), the program selects the appropriate procedure for the sample size.

hide_progress

(Optional) If TRUE, hides the percent-progress display while Monte Carlo sampling is running when method = "small" (default is FALSE).

Details

The Wilcoxon signed-rank test is the frequentist nonparametric counterpart to the paired t-test. The procedure is based on the rank of the difference scores d = Y1 - Y2. The ranking is initially done on the absolute value of the nonzero d values, and each rank is then multiplied by the sign of the difference. Differences equal to zero are dropped. Since the procedure is based on only ranks of the differences, it is robust with respect to outliers in either the Y1 or Y2 measures. The procedure does not depend on the assumption of a normal distribution for the two continuous variates.
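The signed-rank construction and its robustness to outliers can be illustrated with a short base-R sketch. The helper signed_ranks() is a hypothetical name introduced here for illustration; it is not part of the DFBA package, and the data are a subset of the pairs used in the Examples section.

```r
# Paired scores; the 8th pair contains an extreme Y2 value.
Y1 <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52)
Y2 <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14)

# Hypothetical helper (not a DFBA function): signed ranks of d = y1 - y2.
signed_ranks <- function(y1, y2) {
  d <- y1 - y2
  d <- d[d != 0]            # differences equal to zero are dropped
  rank(abs(d)) * sign(d)    # rank |d|, then reattach the sign of d
}

sr <- signed_ranks(Y1, Y2)

# Making the outlier far more extreme leaves the signed ranks unchanged,
# because only the ordering of |d| matters.
Y2_extreme <- replace(Y2, 8, 1000)
sr_extreme <- signed_ranks(Y1, Y2_extreme)

identical(sr, sr_extreme)
```

Because the extreme pair already has the largest absolute difference, inflating it further cannot change any rank, which is the sense in which the procedure is robust to outliers.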

The sample T_pos statistic is the sum of the ranks that have a positive sign, whereas T_neg is the sum of the absolute values of the ranks that have a negative sign. Given n nonzero d scores, T_pos + T_neg = n(n + 1)/2. Tied ranks are possible, especially when the Y1 and Y2 values have low precision. In such cases, the Wilcoxon statistics are rounded to the nearest integer.
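As a rough check on these definitions, the two statistics can be computed directly in base R. This is a minimal sketch that mirrors the description above rather than the package's internal code; the data are the small-sample pairs from the Examples section, which contain no tied ranks, so no rounding is applied.

```r
Y1 <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67,
        1.19, 0.86)
Y2 <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33,
        0.91, 0.37)

d <- Y1 - Y2
d <- d[d != 0]             # drop zero differences
n <- length(d)             # number of nonzero differences

r <- rank(abs(d))          # ranks of the absolute differences

T_pos <- sum(r[d > 0])     # sum of ranks with a positive sign
T_neg <- sum(r[d < 0])     # sum of ranks with a negative sign

c(T_pos = T_pos, T_neg = T_neg, n = n)

# The two statistics always account for all n ranks.
T_pos + T_neg == n * (n + 1) / 2
```

For these data only two differences are negative (the smallest and the largest in absolute value), so T_neg = 1 + 12 = 13 and T_pos = 65.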

The Bayesian analysis is based on a parameter phi_w, which is the population proportion of positive d scores. The default prior for phi_w is a flat beta distribution with shape parameters a0 = b0 = 1, but the user can stipulate a preferred beta prior by assigning values for a0 and b0. The prob_interval input, which has a default value of .95, is the probability used for interval estimates of the phi_w parameter; the user can alter this value if they prefer.
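To see what a particular choice of a0 and b0 implies before any data are observed, the prior can be inspected with the base-R beta distribution functions. The sketch below is illustrative only; the second prior (a0 = 2, b0 = 1) is an arbitrary example and not a recommendation.

```r
# Prior probability that phi_w > .5 under a beta(a0, b0) prior.
prior_prH1 <- function(a0, b0) pbeta(0.5, a0, b0, lower.tail = FALSE)

prior_flat        <- prior_prH1(1, 1)   # flat default prior
prior_informative <- prior_prH1(2, 1)   # arbitrary example that favors phi_w > .5

c(flat = prior_flat, informative = prior_informative)

# Prior 95% equal-tail limits for the flat prior.
qbeta(c(0.025, 0.975), 1, 1)
```

The flat prior assigns probability .5 to phi_w > .5, whereas the beta(2, 1) prior assigns .75, so the latter starts the analysis already leaning toward a positive sign bias.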

There are two cases for the Bayesian analysis: one for a small number of pairs and another for a large number of pairs. The method = "small" algorithm uses a discrete approximation with 200 candidate values for phi_w, which run from .0025 to .9975 in steps of .005. For each candidate value of phi_w, there is a prior and a posterior probability. The posterior probability is based on Monte Carlo sampling to approximate the likelihood of obtaining the observed Wilcoxon statistics. That is, for each candidate value of phi_w, thousands of Monte Carlo samples are generated for the signs on the numbers (1, 2, ..., n), where each number is multiplied by its sign. The proportion of the samples that result in the observed Wilcoxon statistics is an estimate of the likelihood for that candidate phi_w. The likelihood values, combined with the prior, result in a discrete posterior distribution for phi_w. The number of Monte Carlo samples per candidate phi_w is set by the samples argument, which has a default value of 30000 but can be altered by the user.
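The small-sample procedure described above can be sketched in a few lines of base R. This is an illustrative approximation and not the package's implementation: in particular, the way the beta prior is discretized over the 200 candidates (bin mass of width .005) is an assumption, and the values n = 12 and T_pos = 65 are those of the small-sample data in the Examples section.

```r
set.seed(1)

n       <- 12      # number of nonzero differences
T_pos   <- 65      # observed sum of positive ranks
a0      <- 1       # prior shape parameters (flat prior)
b0      <- 1
samples <- 2000    # kept small for speed; the function default is 30000

# 200 candidate values: .0025, .0075, ..., .9975
phiv <- (seq_len(200) - 0.5) * 0.005

# Monte Carlo estimate of P(T_pos = observed | phi) for each candidate.
likelihood <- vapply(phiv, function(phi) {
  # Each row is one sample of n signs; 1 marks a positive sign.
  pos   <- matrix(rbinom(samples * n, 1, phi), nrow = samples)
  t_sim <- as.vector(pos %*% seq_len(n))   # simulated T_pos values
  mean(t_sim == T_pos)
}, numeric(1))

# Assumed discretization of the beta prior: mass within each .005-wide bin.
prior <- pbeta(phiv + 0.0025, a0, b0) - pbeta(phiv - 0.0025, a0, b0)

# Discrete posterior over the candidate values.
posterior <- prior * likelihood
posterior <- posterior / sum(posterior)

post_mean <- sum(phiv * posterior)           # posterior mean of phi_w
prH1      <- sum(posterior[phiv > 0.5])      # posterior P(phi_w > .5)

c(post_mean = post_mean, prH1 = prH1)
```

Because the likelihood is itself a Monte Carlo estimate, results vary slightly from run to run unless a seed is set, and a larger value of samples gives a smoother posterior at the cost of processing time.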

Chechile (2018) empirically found that for large n there was a beta distribution that approximated the quantiles of the discrete, small sample approach. This approximation is reasonably accurate for n > 24, and is used when method = "large".

If the method argument is omitted, the function employs the method that is appropriate given the sample size. Note: the method = "small" algorithm is slower than the algorithm for method = "large"; for cases where n > 24, method = "small" and method = "large" will produce similar estimates but the former method requires increased processing time.

Value

A list containing the following components:

T_pos

Sum of the positive ranks in the pairwise comparisons

T_neg

Sum of the ranks for the negative differences in the pairwise comparisons (reported as a positive value)

n

Number of nonzero differences d = Y1 - Y2

prob_interval

User-defined probability for interval estimates for phi_w

samples

The number of Monte Carlo samples per candidate phi_w for method = "small" (default is 30000)

method

A character string that is either "small" or "large" for the algorithm used (selected automatically when the method argument is left at its default of NULL)

a0

The first shape parameter for the beta prior distribution (default is 1)

b0

The second shape parameter for the beta distribution prior (default is 1)

phiv

The 200 candidate values for phi_w for method = "small"

phipost

The discrete posterior distribution for phi_w when method = "small"

priorprH1

The prior probability that phi_w > .5

prH1

The posterior probability for phi_w > .5

BF10

Bayes factor for the relative increase in the posterior odds for the alternative hypothesis that phi_w > .5 over the null model for phi_w <= .5

post_mean

The posterior mean for phi_w

cumulative_phi

The posterior cumulative distribution for phi_w when method = "small"

hdi_lower

The lower limit for the posterior highest-density interval estimate for phi_w

hdi_upper

The upper limit for the posterior highest-density interval estimate for phi_w

a_post

The first shape parameter for a beta distribution model for phi_w when method = "large"

b_post

The second shape parameter for a beta distribution model for phi_w when method = "large"

post_median

The posterior median for phi_w when method = "large"

eti_lower

The equal-tail lower limit for phi_w

eti_upper

The equal-tail upper limit for phi_w
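The relationships among several of these components can be sketched with base-R functions. The numeric values below are hypothetical stand-ins rather than output from dfba_wilcoxon, and the Bayes factor formula (posterior odds divided by prior odds) is one reading of the BF10 description above.

```r
# Hypothetical values standing in for components of a returned list.
priorprH1 <- 0.5
prH1      <- 0.9

# Bayes factor as the ratio of posterior odds to prior odds for phi_w > .5.
BF10 <- (prH1 / (1 - prH1)) / (priorprH1 / (1 - priorprH1))

# Hypothetical posterior beta shape parameters, as for method = "large".
a_post        <- 15
b_post        <- 7
prob_interval <- 0.95
tail_prob     <- (1 - prob_interval) / 2

post_mean   <- a_post / (a_post + b_post)                     # beta mean
post_median <- qbeta(0.5, a_post, b_post)                     # beta median
eti         <- qbeta(c(tail_prob, 1 - tail_prob), a_post, b_post)
prH1_beta   <- pbeta(0.5, a_post, b_post, lower.tail = FALSE) # P(phi_w > .5)

list(BF10 = BF10, post_mean = post_mean, post_median = post_median,
     eti_lower = eti[1], eti_upper = eti[2], prH1 = prH1_beta)
```

With a flat prior the prior odds equal 1, so the Bayes factor reduces to the posterior odds.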

References

Chechile, R. A. (2020). Bayesian Statistics for Experimental Scientists: A General Introduction to Distribution-Free Methods. Cambridge: MIT Press.

Chechile, R. A. (2018). A Bayesian analysis for the Wilcoxon signed-rank statistic. Communications in Statistics - Theory and Methods. https://doi.org/10.1080/03610926.2017.1388402

Examples


## Example with a small number of pairs

conditionA <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67,
                1.19, 0.86)
conditionB <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33,
                0.91, 0.37)

# samples is reduced from the default of 30000 so that the example runs quickly
dfba_wilcoxon(Y1 = conditionA,
              Y2 = conditionB,
              samples = 250,
              hide_progress = TRUE)

## Example with a large number of pairs

E <- c(6.45, 5.65, 4.34, 5.92, 2.84, 13.06, 6.61, 5.47, 4.49, 6.39, 6.63,
       3.55, 3.76, 5.61, 7.45, 6.41, 10.16, 6.26, 8.46, 2.29, 3.16, 5.68,
       4.13, 2.94, 4.87, 4.44, 3.13, 8.87)

C <- c(2.89, 4.19, 3.22, 6.50, 3.10, 4.19, 5.13, 3.77, 2.71, 2.58, 7.59,
       2.68, 4.98, 2.35, 5.15, 8.46, 3.77, 8.83, 4.06, 2.50, 5.48, 2.80,
       8.89, 3.19, 9.36, 4.58, 2.94, 4.75)

# 28 pairs; the method argument is omitted, so the function selects the algorithm
BW <- dfba_wilcoxon(Y1 = E,
                    Y2 = C)
BW
plot(BW)



[Package DFBA version 0.1.0 Index]