senU {DOS}    R Documentation

Sensitivity Analysis for a New U Statistic


Sensitivity analysis for the new U statistic of Rosenbaum (2011). For m=m1=m2, this is the test of Stephenson (1981). The ranks proposed by Stephenson closely approximate the optimal ranks proposed by Conover and Salsburg (1988) for detecting a treatment that has a large effect on a small subpopulation and no effect on most of the population; see Rosenbaum (2007). The example reproduces some results from Chapter 16 of Design of Observational Studies (2010).


senU(d, gamma = 1, m = 2, m1 = 2, m2 = 2, conf.int = FALSE,
     alpha = 0.05, alternative = "greater", exact = NULL)



d: A vector of treated-minus-control matched pair differences in outcomes.


gamma: The value of the sensitivity parameter, with gamma >= 1.


m: See m2.


m1: See m2.


m2: If (m,m1,m2) are three integers such that 1 <= m1 <= m2 <= m, then the triple (m,m1,m2) defines a U statistic. If (m,m1,m2) = (1,1,1), then the U statistic is the sign test statistic. If (m,m1,m2) = (2,2,2), then it is the U statistic that closely approximates Wilcoxon's signed rank test. If m=m1=m2, then the U statistic is the test of Stephenson (1981). The general U statistic is discussed in Rosenbaum (2011).

conf.int: If conf.int=TRUE, a 1-alpha confidence interval and an interval of point estimates are returned in addition to the P-value testing no treatment effect.


alpha: Determines the coverage rate, 1-alpha, of the confidence interval. With probability at least 1-alpha, the confidence interval will cover the treatment effect, provided the bias in treatment assignment is at most gamma.


alternative: If alternative = "greater" or alternative = "less", then one-sided tests and intervals are returned. If alternative = "twosided", then both one-sided tests are performed, and the smaller P-value is doubled to yield a two-sided P-value. If alternative = "twosided", the confidence interval is the intersection of two one-sided 1-alpha/2 confidence intervals.


exact: If exact is NULL, then exact is set to TRUE if length(d) <= 50, and is set to FALSE if length(d) > 50. The ranks used by the U statistic involve combinatorial coefficients that grow rapidly with increasing sample size. If exact=TRUE, these ranks are computed exactly using expression (8) in Rosenbaum (2011). If exact=FALSE, the ranks are computed by an asymptotic approximation that does not involve large combinatorial coefficients, specifically expression (9) in Rosenbaum (2011).
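
The growth of the combinatorial coefficients is easy to see in the special case m=m1=m2 of Stephenson (1981). The sketch below assumes the unnormalized Stephenson score choose(a - 1, m - 1) for the pair whose |d| has rank a. The helper stephenson_scores is hypothetical and written only for illustration; it is not a function in DOS, and senU may scale its ranks differently.

```r
# Hypothetical helper, not part of DOS: unnormalized Stephenson (1981) scores.
# The pair whose |d| has rank a is the largest member of choose(a - 1, m - 1)
# subsets of size m, and that count is used as its score.
stephenson_scores <- function(n, m) {
  a <- seq_len(n)            # ranks 1, ..., n of |d| (no ties assumed)
  choose(a - 1, m - 1)
}

stephenson_scores(6, 1)          # all ones, which gives the sign test
stephenson_scores(6, 2)          # 0, 1, ..., 5: Wilcoxon's ranks shifted down by one
max(stephenson_scores(50, 8))    # choose(49, 7) = 85900584
max(stephenson_scores(500, 8))   # choose(499, 7), already above 1e15
```

With m = 8 the largest score rises from under 1e8 at 50 pairs to over 1e15 at 500 pairs, which is the kind of growth that motivates an approximation for larger samples.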


The senWilcox function uses a large sample Normal approximation to the distribution of Wilcoxon's signed rank statistic. When gamma=1, it should agree with the wilcox.test() function in the stats package with exact=FALSE and correct=FALSE. The example reproduces the example of the large-sample approximation in Section 3.5 of Design of Observational Studies. Note that the confidence intervals in Table 3.3 of that book are exact, not approximate, so they are slightly different.
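
The gamma=1 check described above can be sketched in base R without DOS. The code computes the large-sample Normal approximation to Wilcoxon's signed rank statistic by hand and compares it with wilcox.test(). The data vector d is made up for illustration and has no ties or zeros. The by-hand formula is the textbook approximation; that it matches what senWilcox does at gamma=1 is an assumption based on the statement above.

```r
# Illustrative data only: made-up pair differences with no ties and no zeros.
d <- c(1.2, -0.4, 2.5, 0.9, -1.1, 3.3, 0.7, 1.8, -0.2, 2.1)
n <- length(d)

# Signed rank statistic: sum of the ranks of |d| over pairs with d > 0.
tstat <- sum(rank(abs(d))[d > 0])

# Null expectation and variance under randomization (gamma = 1), no ties.
ex <- n * (n + 1) / 4
vr <- n * (n + 1) * (2 * n + 1) / 24

# One-sided P-value from the Normal approximation, no continuity correction.
p_by_hand <- pnorm((tstat - ex) / sqrt(vr), lower.tail = FALSE)

# The same quantity from the stats package.
p_stats <- wilcox.test(d, alternative = "greater",
                       exact = FALSE, correct = FALSE)$p.value

all.equal(p_by_hand, p_stats)
```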



The upper bound on the P-value testing no effect in the presence of a bias in treatment assignment of at most gamma. If the bias in treatment assignment is at most gamma, and if there is no treatment effect, then there is at most an alpha chance that the P-value is less than alpha, this being true for all 0<alpha<1.


If conf.int=TRUE, the interval of point estimates of an additive treatment effect in the presence of a bias in treatment assignment of at most gamma. If gamma=1, then you are assuming ignorable treatment assignment or equivalently no unmeasured confounding, so the interval collapses to a point, and that point is the usual Hodges-Lehmann point estimate.

If conf.int=TRUE, a 1-alpha confidence interval for an additive treatment effect in the presence of a bias in treatment assignment of at most gamma. If gamma=1, then this is the usual confidence interval obtained by inverting the Wilcoxon test, and it would be appropriate in a paired randomized experiment.
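
For Wilcoxon's signed rank statistic, the Hodges-Lehmann point estimate mentioned above is the median of the Walsh averages (d[i] + d[j])/2 over i <= j. The sketch below uses made-up data and a hypothetical helper, hl_estimate, that is not part of DOS. It illustrates the Wilcoxon case at gamma=1 only, not the general U statistic.

```r
# Hypothetical helper: Hodges-Lehmann estimate tied to Wilcoxon's signed rank test.
hl_estimate <- function(d) {
  w <- outer(d, d, "+") / 2                # all pairwise averages
  median(w[upper.tri(w, diag = TRUE)])     # keep each pair i <= j once
}

# Illustrative data only: made-up pair differences with no ties and no zeros.
d <- c(1.2, -0.4, 2.5, 0.9, -1.1, 3.3, 0.7, 1.8, -0.2, 2.1)
hl_estimate(d)

# For data without ties or zeros, wilcox.test reports the same (pseudo)median.
unname(wilcox.test(d, conf.int = TRUE)$estimate)
```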


The test of Stephenson (1981) uses ranks similar to those of Conover and Salsburg (1988), which were designed to have high power when most people are unaffected by treatment but a small subpopulation is strongly affected; see Rosenbaum (2007). This is the situation discussed in Chapter 16 of Design of Observational Studies (2010). Even for pair differences d that are Normal with expectation tau and constant variance, the Wilcoxon test tends to exaggerate the degree of sensitivity to unmeasured bias. Compare the Wilcoxon test and the U statistic with (m,m1,m2) = (5,4,5) in the Normal situation. In a randomized experiment (gamma=1), the two tests have the same Pitman efficiency. However, as the number of pairs increases with tau=0.5, the Wilcoxon test has limiting sensitivity to bias of gamma=3.2, while (m,m1,m2) = (5,4,5) has limiting sensitivity 3.9 and (m,m1,m2) = (8,7,8) has limiting sensitivity 5.1. See Rosenbaum (2011) for specifics.
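
The simplest member of the family, (m,m1,m2) = (1,1,1), shows how a bound on the P-value depends on gamma. If the bias in treatment assignment is at most gamma and there is no treatment effect, each nonzero pair difference is positive with probability at most gamma/(1+gamma), so the upper bound on the one-sided P-value of the sign test is a binomial tail probability. The sketch below is an exact binomial version with a hypothetical helper name and made-up data. Dropping zero differences is an assumption of the sketch, and senU itself may use a large-sample approximation and return a slightly different number.

```r
# Hypothetical helper, not part of DOS: exact upper bound on the one-sided
# sign test P-value when the bias in treatment assignment is at most gamma.
sign_test_bound <- function(d, gamma = 1) {
  stopifnot(gamma >= 1)
  d <- d[d != 0]                 # assumption: zero differences are dropped
  n <- length(d)
  npos <- sum(d > 0)             # sign test statistic
  # P(Binomial(n, gamma / (1 + gamma)) >= npos)
  pbinom(npos - 1, n, gamma / (1 + gamma), lower.tail = FALSE)
}

# Illustrative data only: 7 of 10 made-up pair differences are positive.
d <- c(1.2, -0.4, 2.5, 0.9, -1.1, 3.3, 0.7, 1.8, -0.2, 2.1)
sign_test_bound(d, gamma = 1)    # the usual one-sided sign test P-value
sign_test_bound(d, gamma = 2)    # a larger bound, since gamma is larger
```

The largest gamma at which the bound stays below a chosen level is one way to summarize how sensitive a finding is to unmeasured bias.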


Paul R. Rosenbaum


Conover, W. J. and Salsburg, D. S. (1988). Locally most powerful tests for detecting treatment effects when only a subset of patients can be expected to "respond" to treatment. Biometrics, 189-196.

Hodges Jr, J. L. and Lehmann, E. L. (1963). Estimates of location based on rank tests. The Annals of Mathematical Statistics, 598-611.

Rosenbaum, P. R. (1993). Hodges-Lehmann point estimates of treatment effect in observational studies. Journal of the American Statistical Association, 88(424), 1250-1253.

Rosenbaum, P. R. (2007). Confidence intervals for uncommon but dramatic responses to treatment. Biometrics, 63(4), 1164-1171. <doi:10.1111/j.1541-0420.2007.00783.x>

Rosenbaum, P. R. (2010). Design of Observational Studies. New York: Springer. The method and example are discussed in Chapter 16.

Rosenbaum, P. R. (2011). A new U statistic with superior design sensitivity in matched observational studies. Biometrics, 67(3), 1017-1027. <doi:10.1111/j.1541-0420.2010.01535.x>

Schoket, B., Phillips, D. H., Hewer, A. and Vincze, I. (1991). 32P-postlabelling detection of aromatic DNA adducts in peripheral blood lymphocytes from aluminium production plant workers. Mutation Research/Genetic Toxicology, 260(1), 89-98.

Stephenson, W. R. (1981). A general class of one-sample nonparametric test statistics based on subsamples. Journal of the American Statistical Association, 76(376), 960-966.



# With the defaults, m=2, m1=2, m2=2, the U-statistic is very
# similar to Wilcoxon's signed rank statistic

# With m=1, m1=1, m2=1, the U-statistic is the sign test

# With m=m1=m2, this is the test of Stephenson (1981) whose ranks are similar to
# those of Conover and Salsburg (1988); see Rosenbaum (2007).

# The calculations that follow reproduce the sensitivity analysis for the
# data of Schoket et al. (1991) in Chapter 16 of Design of Observational Studies (2010).




# Reproduces parts of Table 2 in Rosenbaum (2011)
library(DOS)   # provides senU, senWilcox and the lead data
data(lead)

# m=2, m1=2, m2=2 is the U-statistic that closely
# resembles Wilcoxon's signed rank test.  Note
# that the results are almost the same.
senU(lead$dif,gamma=5,m=2,m1=2,m2=2)
senWilcox(lead$dif,gamma=5) # In Table 2

[Package DOS version 1.0.0 Index]