dfba_wilcoxon {DFBA} R Documentation

## Repeated-Measures Test (Wilcoxon Signed-Ranks Test)

### Description

Given two continuous, paired variates Y1 and Y2, computes the sample T_pos and T_neg statistics for the Wilcoxon signed-rank test and provides a Bayesian analysis for the population sign-bias parameter phi_w, which is the population proportion of positive differences.

### Usage

dfba_wilcoxon(
Y1,
Y2,
a0 = 1,
b0 = 1,
prob_interval = 0.95,
samples = 30000,
method = NULL,
hide_progress = FALSE
)


### Arguments

| Argument | Description |
|---|---|
| Y1 | Numeric vector for one continuous variate. |
| Y2 | Numeric vector for values paired with the Y1 variate. |
| a0 | The first shape parameter for the prior beta distribution for phi_w. Must be positive and finite. |
| b0 | The second shape parameter for the prior beta distribution for phi_w. Must be positive and finite. |
| prob_interval | Desired probability for interval estimates of the sign-bias parameter phi_w (default is 0.95). |
| samples | When method = "small", the number of desired Monte Carlo samples per candidate value for phi_w (default is 30000 per candidate phi). |
| method | (Optional) Either "small" or "large". The "small" algorithm is based on a discrete Monte Carlo solution for cases where n is typically less than 20. The "large" algorithm is based on a beta approximation model for the posterior distribution of the phi_w parameter. This approximation is reasonable when n > 19. Regardless of n, the user can stipulate either method. When the method argument is omitted, the program selects the appropriate procedure. |
| hide_progress | (Optional) If TRUE, hide percent progress while Monte Carlo sampling is running when method = "small" (default is FALSE). |

### Details

The Wilcoxon signed-rank test is the frequentist nonparametric counterpart to the paired t-test. The procedure is based on the rank of the difference scores d = Y1 - Y2. The ranking is initially done on the absolute value of the nonzero d values, and each rank is then multiplied by the sign of the difference. Differences equal to zero are dropped. Since the procedure is based on only ranks of the differences, it is robust with respect to outliers in either the Y1 or Y2 measures. The procedure does not depend on the assumption of a normal distribution for the two continuous variates.
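The ranking step described above can be sketched in base R. This is only an illustration of the signed-rank idea, not the package's internal code; the variable names are chosen here for the example, and the data are the small-sample vectors from the Examples section. The second half changes one extreme Y2 value to show that the signed ranks do not move.

```r
# Paired scores (the small-sample data from the Examples section)
Y1 <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67, 1.19, 0.86)
Y2 <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33, 0.91, 0.37)

# Difference scores, with zero differences dropped
d <- Y1 - Y2
d <- d[d != 0]

# Rank the absolute differences, then attach the sign of each difference
signed_rank <- sign(d) * rank(abs(d))

# Illustrative check of robustness: make the outlier in Y2 far more extreme
Y2_extreme <- Y2
Y2_extreme[8] <- 1313.14
d_extreme <- Y1 - Y2_extreme
signed_rank_extreme <- sign(d_extreme) * rank(abs(d_extreme))

# The signed ranks are unchanged because only the ordering matters
identical(signed_rank, signed_rank_extreme)
```

The pair with the outlying Y2 value receives the largest rank with a negative sign in both cases, so its influence is bounded by n no matter how extreme the raw score is.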

The sample T_pos statistic is the sum of the ranks that have a positive sign, whereas T_neg is the positive sum of the ranks that have a negative value. Given n nonzero d scores, T_pos + T_neg = n(n + 1)/2. Tied ranks are possible, especially when there are Y1 and Y2 values that have low precision. In such cases, the Wilcoxon statistics are rounded to the nearest integer.
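As a minimal sketch of these definitions (again in base R, with illustrative variable names rather than the package's own), the two statistics and the identity T_pos + T_neg = n(n + 1)/2 can be checked on the small-sample data from the Examples section:

```r
# Small-sample data from the Examples section
Y1 <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67, 1.19, 0.86)
Y2 <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33, 0.91, 0.37)

d <- Y1 - Y2
d <- d[d != 0]          # drop zero differences
n <- length(d)          # number of nonzero differences
r <- rank(abs(d))       # ranks of the absolute differences

# Sum of ranks attached to positive and to negative differences;
# rounding only matters when tied ranks produce half-integers
T_pos <- round(sum(r[d > 0]))
T_neg <- round(sum(r[d < 0]))

T_pos                   # 65
T_neg                   # 13
T_pos + T_neg == n * (n + 1) / 2
```

Only two of the twelve differences are negative here (the smallest and the largest in absolute value), which gives T_neg = 1 + 12 = 13.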

The Bayesian analysis is based on a parameter phi_w, which is the population proportion of positive d scores. The default prior for phi_w is a flat beta distribution with shape parameters a0 = b0 = 1, but the user can stipulate a preferred beta prior by assigning values for a0 and b0. The prob_interval input, which has a default value of 0.95, is the probability used for interval estimates of the phi_w parameter; the user can alter this value.
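To see what a choice of a0 and b0 implies, the prior probability that phi_w > .5 can be computed directly from the beta distribution with base R's pbeta(). The sketch below compares the default flat prior with a hypothetical informative prior (a0 = 3, b0 = 1, chosen here purely for illustration).

```r
# Prior probability that phi_w exceeds .5 under a beta(a0, b0) prior
prior_prob_above_half <- function(a0, b0) {
  pbeta(0.5, shape1 = a0, shape2 = b0, lower.tail = FALSE)
}

# Default flat prior: equal prior mass on either side of .5
flat_prior <- prior_prob_above_half(a0 = 1, b0 = 1)

# Hypothetical informative prior that favors positive differences
informative_prior <- prior_prob_above_half(a0 = 3, b0 = 1)

flat_prior          # 0.5
informative_prior   # 0.875
```

A prior that already places most of its mass above .5 raises the prior odds, so the same data would yield a smaller Bayes factor for phi_w > .5 than under the flat prior.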

There are two cases for the Bayesian analysis: one for a small number of pairs and another for a large number of pairs. The method = "small" algorithm uses a discrete approximation with 200 candidate values for phi_w, running from .0025 to .9975 in steps of .005. Each candidate value for phi_w has a prior and a posterior probability. The posterior probability is based on Monte Carlo sampling to approximate the likelihood of obtaining the observed Wilcoxon statistics. That is, for each candidate value for phi_w, thousands of Monte Carlo samples are generated for the signs on the numbers (1, 2, ..., n), where each number is multiplied by its sampled sign. The proportion of the samples that reproduce the observed Wilcoxon statistics is an estimate of the likelihood for that candidate phi_w. The likelihood values, together with the prior, yield a discrete posterior distribution for phi_w. The number of Monte Carlo samples per candidate phi_w is set by the input quantity samples. The default value for samples is 30000, but the user can alter this quantity.
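The discrete Monte Carlo idea can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the package's implementation: the prior mass for each candidate is taken as the beta probability of the .005-wide bin centered on it, signs are drawn as independent 0/1 indicators with probability phi, and the number of samples is kept small so the sketch runs quickly. The observed values n = 12 and T_pos = 65 correspond to the small-sample data in the Examples section.

```r
set.seed(1)

n <- 12              # number of nonzero differences
T_pos_obs <- 65      # observed sum of positive ranks
a0 <- 1; b0 <- 1     # flat beta prior
samples <- 2000      # Monte Carlo samples per candidate (kept small here)

# 200 candidate values for phi_w: .0025, .0075, ..., .9975
phiv <- seq(0.0025, 0.9975, length.out = 200)

# Assumption: prior mass of the bin of width .005 centered on each candidate
prior <- pbeta(phiv + 0.0025, a0, b0) - pbeta(phiv - 0.0025, a0, b0)

# For each candidate, estimate the likelihood as the proportion of
# simulated sign patterns whose positive-rank sum equals T_pos_obs
likelihood <- vapply(phiv, function(phi) {
  # n x samples matrix of indicators: 1 means the rank gets a positive sign
  positive <- matrix(rbinom(n * samples, size = 1, prob = phi), nrow = n)
  T_pos_sim <- colSums(positive * seq_len(n))
  mean(T_pos_sim == T_pos_obs)
}, numeric(1))

# Discrete posterior over the candidates
posterior <- prior * likelihood / sum(prior * likelihood)

post_mean <- sum(phiv * posterior)      # posterior mean for phi_w
prH1 <- sum(posterior[phiv > 0.5])      # posterior probability that phi_w > .5
```

Because the likelihood is estimated by simulation, repeated runs with different seeds give slightly different posteriors; increasing samples reduces that variability at the cost of run time.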

Chechile (2018) empirically found that for large n there was a beta distribution that approximated the quantiles of the discrete, small sample approach. This approximation is reasonably accurate for n > 24, and is used when method = "large".

If the method argument is omitted, the function employs the method that is appropriate given the sample size. Note: the method = "small" algorithm is slower than the algorithm for method = "large"; for cases where n > 24, method = "small" and method = "large" will produce similar estimates but the former method requires increased processing time.

### Value

A list containing the following components:

| Component | Description |
|---|---|
| T_pos | Sum of the positive ranks in the pairwise comparisons. |
| T_neg | Sum of the negative ranks in the pairwise comparisons. |
| n | Number of nonzero differences d = Y1 - Y2. |
| prob_interval | User-defined probability for interval estimates for phi_w. |
| samples | The number of Monte Carlo samples per candidate phi_w for method = "small" (default is 30000). |
| method | A character string that is either "small" or "large" for the algorithm used (default is NULL). |
| a0 | The first shape parameter for the beta prior distribution (default is 1). |
| b0 | The second shape parameter for the beta prior distribution (default is 1). |
| a_post | The first shape parameter for the posterior beta distribution model for phi_w when method = "large". |
| b_post | The second shape parameter for the posterior beta distribution model for phi_w when method = "large". |
| phiv | The 200 candidate values for phi_w for method = "small". |
| phipost | The discrete posterior distribution for phi_w when method = "small". |
| priorprH1 | The prior probability that phi_w > .5. |
| prH1 | The posterior probability that phi_w > .5. |
| BF10 | Bayes factor for the relative increase in the posterior odds for the alternative hypothesis that phi_w > .5 over the null model for phi_w <= .5. |
| post_mean | The posterior mean for phi_w. |
| cumulative_phi | The posterior cumulative distribution for phi_w when method = "small". |
| hdi_lower | The lower limit for the posterior highest-density interval estimate for phi_w. |
| hdi_upper | The upper limit for the posterior highest-density interval estimate for phi_w. |
| post_median | The posterior median for phi_w when method = "large". |
| eti_lower | The equal-tail lower limit for phi_w. |
| eti_upper | The equal-tail upper limit for phi_w. |
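Several of these components are standard summaries of a beta distribution, so their relationships can be illustrated with base R alone. The sketch below assumes hypothetical posterior shape parameters (a_post = 18.5, b_post = 8.5, invented for illustration and not produced by the package) and reads BF10 as the ratio of posterior odds to prior odds for phi_w > .5, in line with the description above.

```r
a0 <- 1; b0 <- 1                 # flat prior
a_post <- 18.5; b_post <- 8.5    # hypothetical posterior shape parameters
prob_interval <- 0.95

# Prior and posterior probabilities that phi_w > .5
priorprH1 <- pbeta(0.5, a0, b0, lower.tail = FALSE)
prH1 <- pbeta(0.5, a_post, b_post, lower.tail = FALSE)

# Bayes factor as the ratio of posterior odds to prior odds
BF10 <- (prH1 / (1 - prH1)) / (priorprH1 / (1 - priorprH1))

# Point estimates from the beta posterior
post_mean <- a_post / (a_post + b_post)
post_median <- qbeta(0.5, a_post, b_post)

# Equal-tail interval: cut (1 - prob_interval) / 2 from each tail
tail_prob <- (1 - prob_interval) / 2
eti_lower <- qbeta(tail_prob, a_post, b_post)
eti_upper <- qbeta(1 - tail_prob, a_post, b_post)
```

With a flat prior the prior odds equal 1, so BF10 reduces to the posterior odds prH1 / (1 - prH1).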

### References

Chechile, R. A. (2020). Bayesian Statistics for Experimental Scientists: A General Introduction to Distribution-Free Methods. Cambridge: MIT Press.

Chechile, R. A. (2018). A Bayesian analysis for the Wilcoxon signed-rank statistic. Communications in Statistics - Theory and Methods. https://doi.org/10.1080/03610926.2017.1388402

### Examples


## Examples with a small number of pairs

conditionA <- c(1.49, 0.64, 0.96, 2.34, 0.78, 1.29, 0.72, 1.52, 0.62, 1.67,
1.19, 0.86)
conditionB <- c(0.53, 0.55, 0.58, 0.97, 0.60, 0.22, 0.05, 13.14, 0.63, 0.33,
0.91, 0.37)

dfba_wilcoxon(Y1 = conditionA,
Y2 = conditionB,
samples = 250,
hide_progress = TRUE)

# Examples with large sample size

E <- c(6.45, 5.65, 4.34, 5.92, 2.84, 13.06, 6.61, 5.47, 4.49, 6.39, 6.63,
3.55, 3.76, 5.61, 7.45, 6.41, 10.16, 6.26, 8.46, 2.29, 3.16, 5.68,
4.13, 2.94, 4.87, 4.44, 3.13, 8.87)

C <- c(2.89, 4.19, 3.22, 6.50, 3.10, 4.19, 5.13, 3.77, 2.71, 2.58, 7.59,
2.68, 4.98, 2.35, 5.15, 8.46, 3.77, 8.83, 4.06, 2.50, 5.48, 2.80,
8.89, 3.19, 9.36, 4.58, 2.94, 4.75)

BW <- dfba_wilcoxon(Y1 = E,
Y2 = C)
BW
plot(BW)



[Package DFBA version 0.1.0 Index]