twoSampleLinearRankTestCensored {EnvStats}  R Documentation 
Twosample linear rank test to detect a difference (usually a shift) between two distributions based on censored data.
twoSampleLinearRankTestCensored(x, x.censored, y, y.censored,
censoring.side = "left", location.shift.null = 0, scale.shift.null = 1,
alternative = "two.sided", test = "logrank", variance = "hypergeometric",
surv.est = "prentice", shift.type = "location")
x 
numeric vector of values for the first sample.
Missing ( 
x.censored 
numeric or logical vector indicating which values of 
y 
numeric vector of values for the second sample.
Missing ( 
y.censored 
numeric or logical vector indicating which values of 
censoring.side 
character string indicating on which side the censoring occurs for the data in

location.shift.null 
numeric scalar indicating the hypothesized value of 
scale.shift.null 
numeric scalar indicating the hypothesized value of 
alternative 
character string indicating the kind of alternative hypothesis. The possible values
are 
test 
character string indicating which linear rank test to use. The possible values are:

variance 
character string indicating which kind of variance to compute for the test. The
possible values are: 
surv.est 
character string indicating what method to use to estimate the survival function.
The possible values are 
shift.type 
character string indicating which kind of shift is being tested. The possible values
are 
The function twoSampleLinearRankTestCensored
allows you to compare two
samples containing censored observations using a linear rank test to determine
whether the two samples came from the same distribution. The help file for
twoSampleLinearRankTest
explains linear rank tests for complete data
(i.e., no censored observations are present), and here we assume you are
familiar with that material The sections below explain how linear
rank tests can be extended to the case of censored data.
Notation
Several authors have proposed extensions of the MWW test to the case of censored
data, mainly in the context of survival analysis (e.g., Breslow, 1970; Cox, 1972;
Gehan, 1965; Mantel, 1966; Peto and Peto, 1972; Prentice, 1978). Prentice (1978)
showed how all of these proposed tests are extensions of a linear rank test to the
case of censored observations.
Survival analysis usually deals with rightcensored data, whereas environmental data is rarely rightcensored but often leftcensored (some observations are reported as less than some detection limit). Fortunately, all of the methods developed for rightcensored data can be applied to leftcensored data as well. (See the subsection LeftCensored Data below.)
In order to explain Prentice's (1978) generalization of linear rank tests to censored
data, we will use the following notation that closely follows Prentice (1978),
Prentice and Marek (1979), and Latta (1981).
Let X
denote a random variable representing measurements from group 1 with
cumulative distribution function (cdf):
F_1(t) = Pr(X \le t) \;\;\;\;\;\; (1)
and let x_1, x_2, \ldots, x_m
denote m
independent observations from this
distribution. Let Y
denote a random variable from group 2 with cdf:
F_2(t) = Pr(Y \le t) \;\;\;\;\;\; (2)
and let y_1, y_2, \ldots, y_n
denote n
independent observations from this
distribution. Set N = m + n
, the total number of observations.
Assume the data are rightcensored so that some observations are only recorded as
greater than some censoring level, with possibly several different censoring levels.
Let t_1, t_2, \ldots, t_k
denote the k
ordered, unique, uncensored
observations for the combined samples (in the context of survival data, t
usually stands
for “time of death”). For i = 1, 2, \ldots, k
, let d_{1i}
denote the number of observations from sample 1 (the X
observations) that are
equal to t_i
, and let d_{2i}
denote the observations from sample 2 (the
Y
observations) equal to this value. Set
d_i = d_{1i} + d_{2i} \;\;\;\;\;\; (3)
the total number of observations equal to t_i
. If there are no tied
uncensored observations, then t_i = 1
for i = 1, 2, \ldots, k
,
otherwise it is greater than 1 for at least on value of i
.
For i = 1, 2, \ldots, k
, let e_{1i}
denote the number of censored
observations from sample 1 (the X
observations) with censoring levels that
fall into the interval [t_i, t_{i+1})
where t_{k+1} = \infty
by
definition, and let e_{2i}
denote the number of censored observations from
sample 2 (the Y
observations) with censoring levels that fall into this
interval. Set
e_i = e_{1i} + e_{2i} \;\;\;\;\;\; (4)
the total number of censoring levels that fall into this interval.
Finally, set n_{1i}
equal to the number of observations from sample 1
(uncensored and censored) known to be greater than or equal to t_i
, i.e.,
that lie in the interval [t_i, \infty)
,
set n_{2i}
equal to the number of observations from sample 2
(uncensored and censored) that lie in this interval, and set
n_i = n_{1i} + n_{2i} \;\;\;\;\;\; (5)
In survival analysis jargon, n_{1i}
denotes the number of people from
sample 1 who are “at risk” at time t_i
, that is, these people are
known to still be alive at this time. Similarly, n_{2i}
denotes the number
of people from sample 2 who are at risk at time t_i
, and n_i
denotes
the total number of people at risk at time t_i
.
Score Statistics for Multiply Censored Data
Prentice's (1978) generalization of the twosample score (linear rank) statistic is
given by:
\nu = \sum_{i=1}^k (d_{1i} c_i + e_{1i} C_i) \;\;\;\;\;\; (6)
where c_i
and C_i
denote the scores associated with the uncensored and
censored observations, respectively. As for complete data, the form of the scores
depends upon the assumed underlying distribution. Table 1 below shows scores for
various assumed distributions as presented in Prentice (1978) and Latta (1981)
(also see Table 5 of Millard and Deverel, 1988, p.2091).
The last column shows what these tests reduce to in the case of complete data
(no censored observations).
Table 1. Scores Associated with Various Censored Data Rank Tests
Uncensored  Censored  Uncensored  
Distribution  Score (c_i )  Score (C_i )  Test Name  Analogue 
Logistic  2\hat{F}_i  1  \hat{F}_i  PetoPeto  Wilcoxon Rank Sum 
"  i  n_i  i  Gehan or Breslow  " 
"  i  \sqrt{n_i}  i  TaroneWare  " 
Normal,  \Phi^{1}(\hat{F}_i)  \phi(c_i)/\hat{S}_i  Normal Scores 1  Normal Scores 
Lognormal  
"  "  \frac{n_i C_{i1}  c_i}{n_i1}  Normal Scores 2  " 
Double  sign(\hat{F}_i  0.5)  \frac{\hat{F}_i}{1\hat{F}_i}, if \hat{F_i} < 0.5  Generalized  Mood's Median 
Exponential  1, if \hat{F}_i \ge 0.5  Sign  
Exponential,  log(\tilde{S}_i)  1  log(\tilde{S}_i)  Logrank  Savage Scores 
Extreme Value 
In Table 1 above, \Phi
denotes the cumulative distribution function of the
standard normal distribution, \phi
denotes the probability density function
of the standard normal distribution, and sign
denotes the sign
function. Also, the quantities \hat{F}_i
and \hat{S}_i
denote the estimates of the cumulative distribution function (cdf) and survival
function, respectively, at time t_i
for the combined sample. The estimated
cdf is related to the estimated survival function by:
\hat{F}_i = 1  \hat{S}_i \;\;\;\;\;\; (7)
The quantity \tilde{S}_i
denotes the Altshuler (1970) estimate of the
survival function at time t_i
for the combined sample (see below).
The argument surv.est
determines what method to use estimate the survival
function. When surv.est="prentice"
(the default), the survival function is
estimated as:
\hat{S}_{i,P} = \prod_{j=1}^{i} \frac{n_j  d_j + 1}{n_j + 1} \;\;\;\;\;\; (8)
(Prentice, 1978). When surv.est="kaplanmeier"
, the survival function is
estimated as:
\hat{S}_{i,KM} = \prod_{j=1}^{i} \frac{n_j  d_j}{n_j} \;\;\;\;\;\; (9)
(Kaplan and Meier, 1958), and when surv.est="petopeto"
, the survival
function is estimated as:
\hat{S}_{i,PP} = \frac{1}{2} (\hat{S}_{i,KM} + \hat{S}_{i1, KM}) \;\;\;\;\;\; (10)
where \hat{S}_{0,KM} = 0
(Peto and Peto, 1972). All three of these estimators
of the survival function should produce very similar results. When
surv.est="altshuler"
, the survival function is estimated as:
\tilde{S}_i = exp(\sum_{j=1}^i \frac{d_j}{n_j}) \;\;\;\;\;\; (11)
(Altshuler, 1970). The scores for the logrank test use this estimator of survival.
Lee and Wang (2003, p. 116) present a slightly different version of the PetoPeto
test. They use the PetoPeto estimate of the survival function for c_i
, but
use the KaplanMeier estimate of the survival function for C_i
.
The scores for the “Normal Scores 1” test shown in Table 1 above are based on
the approximation (30) of Prentice (1978). The scores for the
“Normal Scores 2” test are based on equation (7) of Prentice and Marek (1979).
For the “Normal Scores 2” test, the following rules are used to construct the
scores for the censored observations: C_0 = 0
, and
C_k = 0
if n_k = 1
.
The Distribution of the Score Statistic
Under the null hypothesis that the two distributions are the same, the expected
value of the score statistic \nu
in Equation (6) is 0. The variance of
\nu
can be computed in at least three different ways. If the censoring
mechanism is the same for both groups, the permutation variance is
appropriate (variance="permutation"
) and its estimate is given by:
\hat{\sigma}_{\nu}^2 = \frac{mn}{N(N1)} \sum_{i=1}^k (d_i c_i^2 + e_i C_i^2) \;\;\;\;\;\; (12)
Often, however, it is not clear whether this assumption is valid, and both Prentice (1978) and Prentice and Marek (1979) caution against using the permuation variance (Prentice and Marek, 1979, state it can lead to inflated estimates of variance).
If the censoring mechanisms for the two groups are not necessarily the same, a more
general estimator of the variance is based on a conditional permutation approach. In
this case, the statistic \nu
in Equation (6) is rewritten as:
\nu = \sum_{i=1}^k w_i [d_{1i}  d_i \frac{n_{1i}}{n_i}] \;\;\;\;\;\; (13)
where
w_i = c_i  C_i \;\;\;\;\;\; (14)
c_i
and C_i
are given above in Table 1,
and the conditional permutation or hypergeometric estimate
(variance="hypergeometric"
) is given by:
\hat{\sigma}_{\nu}^2 = \sum_{i=1}^k d_i w_i^2 (\frac{n_{1i}}{n_i}) (1  \frac{n_{1i}}{n_i}) (\frac{n_i  d_i}{n_i  1}) \;\;\;\;\;\; (15)
(Prentice and Marek, 1979; Latta, 1981; Millard and Deverel, 1988). Note that Equation (13) can be thought of as the sum of weighed values of observed minus expected observations.
Prentice (1978) derived an asymptotic estimator of the variance of the score
statistic \nu
given in Equation (6) above based on the log likelihood of the
rank vector (variance="asymptotic"
). This estimator is the same as the
hypergeometric variance estimator for the logrank and Gehan tests (assuming no
tied uncensored observations), but for the PetoPeto test, this estimator is
given by:
\hat{\sigma}_{\nu}^2 = \sum_{i=1}^k \{ \hat{S}_i (1  a_i) b_i  (a_i  \hat{S}_i) b_i [\hat{S}_i b_i + 2 \sum_{j=i+1}^k \hat{S}_j b_j ]\} \;\;\;\;\;\; (16)
where
a_i = \prod_{j=1}^i \frac{n_j + 1}{n_j + 2} \;\;\;\;\;\; (17)
b_i = 2 d_{1i} + e_{1i} \;\;\;\;\;\; (18)
(Prentice, 1978; Latta, 1981; Millard and Deverel, 1988). Note that equation (14)
of Millard and Deverel (1988) contains a typographical error.
The Treatment of Ties
If the hypergeometric estimator of variance is being used, no modifications need to
be made for ties; Equations (13)(15) already account for ties. For the case of the
permutation or asymptotic variance estimators, Equations (6), (12), and (16) all
assume no ties in the uncensored observations. If ties exist in the uncensored
observations, Prentice (1978) suggests computing the scores shown in Table 1
above as if there were no ties, and then assigning average scores to the
tied observations. (This modification also applies to the quantities
a_i
and \hat{S}_i
in Equation (16) above.) For this algorithm, the
statistic in Equation (6) is not in general the same as the one in Equation (13).
Computing a Test Statistic
Under the null hypothesis that the two distributions are the same, the statistic
z = \frac{\nu}{\hat{\sigma}_{\nu}} \;\;\;\;\;\; (19)
is approximately distributed as a standard normal random variable for “large”
sample sizes. This statistic will tend to be large if the observations in
group 1 (the X
observations) tend to be larger than the observations in
group 2 (the Y
observations).
LeftCensored Data
Most of the time, if censored observations occur in environmental data, they are
leftcensored (e.g., observations are reported as less than one or more detection
limits). For the twosample test of differences between groups, the methods that
apply to rightcensored data are easily adapted to leftcensored data: simply
multiply the observations by 1, compute the zstatistic shown in Equation
(20), then reverse the sign of this statistic before computing the pvalue.
a list of class "htestCensored"
containing the results of the hypothesis test.
See the help file for htestCensored.object
for details.
All of the tests computed by twoSampleLinearRankTestCensored
(logrank, TaroneWare, Gehan, PetoPeto, normal scores, and generalized sign)
are based on a
statistic that is essentially the sum over all uncensored time points of the
weighted difference between the observed and expected number of observations at each
time point (see Equation (15) above). The tests differ in how they weight the
differences between the observed and expected number of observations.
Prentice and Marek (1979) point out that the Gehan test uses weights that depend on the censoring rates within each group and can lead to nonsignificant outcomes in the case of heavy censoring when in fact a very large difference between the two groups exists.
Latta (1981) performed a Monte Carlo simulation to study the power of the Gehan,
logrank, and PetoPeto tests using all three different estimators of variance
(permutation, hypergeometric, and asymptotic). He used lognormal, Weibull, and
exponential distributions to generate the observations, and studied two different
cases of censoring: uniform censoring for both samples vs. no censoring in the first
sample and uniform censoring in the second sample. Latta (1981) used sample sizes
of 10 and 50 (both the equal and unequal cases were studied). Latta (1981) found
that all three tests maintained the nominal Type I error level (\alpha
level)
in the case of equal sample sizes and equal censoring. Also, the PetoPeto test
based on the asymptotic variance appeared to maintain the nominal \alpha
level
in all situations, but the other tests were slightly biased in the case of unequal
sample sizes and/or unequal censoring. In particular, tests based on the
hypergeometric variance are slightly biased for unequal sample sizes. Latta (1981)
concludes that if there is no censoring or light censoring, any of the tests may be
used (but the hypergeometric variance should not be used if the sample sizes are
very different). In the case of heavy censoring where sample sizes are far apart
and/or the censoring is very different between samples, the PetoPeto test based on
the asymptotic variance should be used.
Millard and Deverel (1988) also performed a Monte Carlo simulation similar to
Latta's (1981) study. They only used the lognormal distribution to generate
observations, but also looked at the normal scores test and two adhoc modifications
of the MWW test. They found the “Normal Scores 2” test shown in Table 1
above to be the best behaved test in terms of maintaining the nominal
\alpha
level, but the other tests behaved almost as well. As Latta (1981)
found, when sample sizes and censoring are very different between the two groups,
the nominal \alpha
level of most of the tests is slightly biased. In the
cases where the nominal \alpha
level was maintained, the PetoPeto test based
on the asymptotic variance appeared to be as powerful or more powerful than the
normal scores tests.
Neither of the Monte Carlo studies performed by Latta (1981) and Millard and Deverel (1988) looked at the behavior of the twosample linear rank tests in the presence of several tied uncensored observations (because both studies generated observations from continuous distributions). Note that the results shown in Table 9 of Millard and Deverel (1988, p.2097) are not all correct because they did not allow for tied uncensored values. The last example in the EXAMPLES section below shows the correct values that should appear in that table.
Heller and Venkatraman (1996) performed a Monte Carlo simulation study to compare the behaviors of the PetoPeto test (using the Prentice, 1978, estimator of survival; they call this the PrenticeWilcoxon test) and logrank test under varying censoring conditions with sample sizes of 20 and 50 per group based on using the following methods to compute pvalues: the asymptotic standard normal approximation, a permutation test approach (this is NOT the same as the permutation variance), and a bootstrap approach. Observed times were generated from Weibull and lognormal survival time distributions with independent uniform censoring. They found that for the PetoPeto test, "the asymptotic test procedure was the most accurate; resampling procedures did not improve upon its accuracy." For the logrank test, with sample sizes of 20 per group, the usual test based on the asymptotic standard normal approximation tended to have a very slightly higher Type I error rate than assumed (however, for an assumed Type I error rate of 0.05, the largest Type I error rate observed was less than 0.065), whereas the permuation and bootstrap tests performed better; with sample sizes of 50 per group there was no difference in test performance.
Fleming and Harrington (1981) introduced a family of tests (sometimes called Grho
tests) that contain the logrank and PetoPeto tests as special cases. A single
parameter \rho
(rho) controls the weights given to the uncensored and
censored observations. Positive values of \rho
produce tests more sensitive
to early differences in the survival function, that is, differences in the cdf at
small values. Negative values of \rho
produce tests more sensitive to late
differences in the survival function, that is, differences in the cdf at large
values.
The function survdiff
in the R package
survival implements the Grho family of tests suggested by Flemming and
Harrington (1981). Calling survdiff
with rho=0
(the default) yields
the logrank test. Calling survdiff
with rho=1
yields the PetoPeto
test based on the KaplanMeier estimate of survival. The function survdiff
always uses the hypergeometric estimate of variance and the KaplanMeier estimate of
survival, but it uses the “leftcontinuous” version of the KaplanMeier
estimate. The leftcontinuous KM estimate of survival is defined as
follows: at each death (unique uncensored observation), the estimated survival is
equal to the estimated survival based on the ordinary KM estimate at the prior
death time (or 1 for the first death).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Altshuler, B. (1970). Theory for the Measurement of Competing Risks in Animal Experiments. Mathematical Biosciences 6, 1–11.
Breslow, N.E. (1970). A Generalized KruskalWallis Test for Comparing K Samples Subject to Unequal Patterns of Censorship. Biometrika 57, 579–594.
Conover, W.J. (1980). Practical Nonparametric Statistics. Second Edition. John Wiley and Sons, New York, Chapter 4.
Cox, D.R. (1972). Regression Models and Life Tables (with Discussion). Journal of the Royal Statistical Society of London, Series B 34, 187–220.
Divine, G., H.J. Norton, R. Hunt, and J. Dinemann. (2013). A Review of Analysis and Sample Size Calculation Considerations for Wilcoxon Tests. Anesthesia & Analgesia 117, 699–710.
Fleming, T.R., and D.P. Harrington. (1981). A Class of Hypothesis Tests for One and Two Sample Censored Survival Data. Communications in Statistics – Theory and Methods A10(8), 763–794.
Fleming, T.R., and D.P. Harrington. (1991). Counting Processes & Survival Analysis. John Wiley and Sons, New York, Chapter 7.
Gehan, E.A. (1965). A Generalized Wilcoxon Test for Comparing Arbitrarily SinglyCensored Samples. Biometrika 52, 203–223.
Harrington, D.P., and T.R. Fleming. (1982). A Class of Rank Test Procedures for Censored Survival Data. Biometrika 69(3), 553–566.
Heller, G., and E. S. Venkatraman. (1996). Resampling Procedures to Compare Two Survival Distributions in the Presence of RightCensored Data. Biometrics 52, 1204–1213.
Hettmansperger, T.P. (1984). Statistical Inference Based on Ranks. John Wiley and Sons, New York, 323pp.
Hollander, M., and D.A. Wolfe. (1999). Nonparametric Statistical Methods, Second Edition. John Wiley and Sons, New York.
Kaplan, E.L., and P. Meier. (1958). Nonparametric Estimation From Incomplete Observations. Journal of the American Statistical Association 53, 457–481.
Latta, R.B. (1981). A Monte Carlo Study of Some TwoSample Rank Tests with Censored Data. Journal of the American Statistical Association 76(375), 713–719.
Mantel, N. (1966). Evaluation of Survival Data and Two New Rank Order Statistics Arising in its Consideration. Cancer Chemotherapy Reports 50, 163170.
Millard, S.P., and S.J. Deverel. (1988). Nonparametric Statistical Methods for Comparing Two Sites Based on Data With Multiple Nondetect Limits. Water Resources Research, 24(12), 2087–2098.
Millard, S.P., and N.K. Neerchal. (2001). Environmental Statistics with SPLUS. CRC Press, Boca Raton, FL, pp.432–435.
Peto, R., and J. Peto. (1972). Asymptotically Efficient Rank Invariant Test Procedures (with Discussion). Journal of the Royal Statistical Society of London, Series A 135, 185–206.
Prentice, R.L. (1978). Linear Rank Tests with Right Censored Data. Biometrika 65, 167–179.
Prentice, R.L. (1985). Linear Rank Tests. In Kotz, S., and N.L. Johnson, eds. Encyclopedia of Statistical Science. John Wiley and Sons, New York. Volume 5, pp.51–58.
Prentice, R.L., and P. Marek. (1979). A Qualitative Discrepancy Between Censored Data Rank Tests. Biometrics 35, 861–867.
Tarone, R.E., and J. Ware. (1977). On DistributionFree Tests for Equality of Survival Distributions. Biometrika 64(1), 156–160.
USEPA. (2009). Statistical Analysis of Groundwater Monitoring Data at RCRA Facilities, Unified Guidance. EPA 530/R09007, March 2009. Office of Resource Conservation and Recovery Program Implementation and Information Division. U.S. Environmental Protection Agency, Washington, D.C.
USEPA. (2010). Errata Sheet  March 2009 Unified Guidance. EPA 530/R09007a, August 9, 2010. Office of Resource Conservation and Recovery, Program Information and Implementation Division. U.S. Environmental Protection Agency, Washington, D.C.
twoSampleLinearRankTest
, survdiff
,
wilcox.test
, htestCensored.object
.
# The last part of the EXAMPLES section in the help file for
# cdfCompareCensored compares the empirical distribution of copper and zinc
# between two sites: Alluvial Fan and BasinTrough (Millard and Deverel, 1988).
# The data for this example are stored in Millard.Deverel.88.df. Perform a
# test to determine if there is a significant difference between these two
# sites (perform a separate test for the copper and the zinc).
Millard.Deverel.88.df
# Cu.orig Cu Cu.censored Zn.orig Zn Zn.censored Zone Location
#1 < 1 1 TRUE <10 10 TRUE Alluvial.Fan 1
#2 < 1 1 TRUE 9 9 FALSE Alluvial.Fan 2
#3 3 3 FALSE NA NA FALSE Alluvial.Fan 3
#.
#.
#.
#116 5 5 FALSE 50 50 FALSE Basin.Trough 48
#117 14 14 FALSE 90 90 FALSE Basin.Trough 49
#118 4 4 FALSE 20 20 FALSE Basin.Trough 50
#
# First look at the copper data
#
Cu.AF < with(Millard.Deverel.88.df,
Cu[Zone == "Alluvial.Fan"])
Cu.AF.cen < with(Millard.Deverel.88.df,
Cu.censored[Zone == "Alluvial.Fan"])
Cu.BT < with(Millard.Deverel.88.df,
Cu[Zone == "Basin.Trough"])
Cu.BT.cen < with(Millard.Deverel.88.df,
Cu.censored[Zone == "Basin.Trough"])
# Note the large number of tied observations in the copper data
#
table(Cu.AF[!Cu.AF.cen])
# 1 2 3 4 5 7 8 9 10 11 12 16 20
# 5 21 6 3 3 3 1 1 1 1 1 1 1
table(Cu.BT[!Cu.BT.cen])
# 1 2 3 4 5 6 8 9 12 14 15 17 23
# 7 4 8 5 1 2 1 2 1 1 1 1 1
# Logrank test with hypergeometric variance:
#
twoSampleLinearRankTestCensored(x = Cu.AF, x.censored = Cu.AF.cen,
y = Cu.BT, y.censored = Cu.BT.cen)
#Results of Hypothesis Test
#Based on Censored Data
#
#
#Null Hypothesis: Fy(t) = Fx(t)
#
#Alternative Hypothesis: Fy(t) != Fx(t) for at least one t
#
#Test Name: TwoSample Linear Rank Test:
# Logrank Test
# with Hypergeometric Variance
#
#Censoring Side: left
#
#Censoring Level(s): x = 1 5 10 20
# y = 1 2 5 10 15
#
#Data: x = Cu.AF
# y = Cu.BT
#
#Censoring Variable: x = Cu.AF.cen
# y = Cu.BT.cen
#
#Number NA/NaN/Inf's Removed: x = 3
# y = 1
#
#Sample Sizes: nx = 65
# ny = 49
#
#Percent Censored: x = 26.2%
# y = 28.6%
#
#Test Statistics: nu = 1.8791355
# var.nu = 13.6533490
# z = 0.5085557
#
#Pvalue: 0.6110637
# Compare the pvalues produced by the Normal Scores 2 test
# using the hypergeomtric vs. permutation variance estimates.
# Note how much larger the estimated variance is based on
# the permuation variance estimate:
#
twoSampleLinearRankTestCensored(x = Cu.AF, x.censored = Cu.AF.cen,
y = Cu.BT, y.censored = Cu.BT.cen,
test = "normal.scores.2")$p.value
#[1] 0.2008913
twoSampleLinearRankTestCensored(x = Cu.AF, x.censored = Cu.AF.cen,
y = Cu.BT, y.censored = Cu.BT.cen,
test = "normal.scores.2", variance = "permutation")$p.value
#[1] [1] 0.657001
#
# Now look at the zinc data
#
Zn.AF < with(Millard.Deverel.88.df,
Zn[Zone == "Alluvial.Fan"])
Zn.AF.cen < with(Millard.Deverel.88.df,
Zn.censored[Zone == "Alluvial.Fan"])
Zn.BT < with(Millard.Deverel.88.df,
Zn[Zone == "Basin.Trough"])
Zn.BT.cen < with(Millard.Deverel.88.df,
Zn.censored[Zone == "Basin.Trough"])
# Note the moderate number of tied observations in the zinc data,
# and the "outlier" of 620 in the Alluvial Fan data.
#
table(Zn.AF[!Zn.AF.cen])
# 5 7 8 9 10 11 12 17 18 19 20 23 29 30 33 40 50 620
# 1 1 1 1 20 2 1 1 1 1 14 1 1 1 1 1 1 1
table(Zn.BT[!Zn.BT.cen])
# 3 4 5 6 8 10 11 12 13 14 15 17 20 25 30 40 50 60 70 90
# 2 2 2 1 1 5 1 2 1 1 1 2 11 1 4 3 2 2 1 1
# Logrank test with hypergeometric variance:
#
twoSampleLinearRankTestCensored(x = Zn.AF, x.censored = Zn.AF.cen,
y = Zn.BT, y.censored = Zn.BT.cen)
#Results of Hypothesis Test
#Based on Censored Data
#
#
#Null Hypothesis: Fy(t) = Fx(t)
#
#Alternative Hypothesis: Fy(t) != Fx(t) for at least one t
#
#Test Name: TwoSample Linear Rank Test:
# Logrank Test
# with Hypergeometric Variance
#
#Censoring Side: left
#
#Censoring Level(s): x = 3 10
# y = 3 10
#
#Data: x = Zn.AF
# y = Zn.BT
#
#Censoring Variable: x = Zn.AF.cen
# y = Zn.BT.cen
#
#Number NA/NaN/Inf's Removed: x = 1
# y = 0
#
#Sample Sizes: nx = 67
# ny = 50
#
#Percent Censored: x = 23.9%
# y = 8.0%
#
#Test Statistics: nu = 6.992999
# var.nu = 17.203227
# z = 1.686004
#
#Pvalue: 0.09179512
#
# Compare the pvalues produced by the Logrank, Gehan, PetoPeto,
# and TaroneWare tests using the hypergeometric variance.
#
twoSampleLinearRankTestCensored(x = Zn.AF, x.censored = Zn.AF.cen,
y = Zn.BT, y.censored = Zn.BT.cen,
test = "logrank")$p.value
#[1] 0.09179512
twoSampleLinearRankTestCensored(x = Zn.AF, x.censored = Zn.AF.cen,
y = Zn.BT, y.censored = Zn.BT.cen,
test = "gehan")$p.value
#[1] 0.0185445
twoSampleLinearRankTestCensored(x = Zn.AF, x.censored = Zn.AF.cen,
y = Zn.BT, y.censored = Zn.BT.cen,
test = "petopeto")$p.value
#[1] 0.009704529
twoSampleLinearRankTestCensored(x = Zn.AF, x.censored = Zn.AF.cen,
y = Zn.BT, y.censored = Zn.BT.cen,
test = "taroneware")$p.value
#[1] 0.03457803
#
# Clean up
#
rm(Cu.AF, Cu.AF.cen, Cu.BT, Cu.BT.cen,
Zn.AF, Zn.AF.cen, Zn.BT, Zn.BT.cen)
#==========
# Example 16.5 on pages 1622 to 16.23 of USEPA (2009) shows how to perform
# the TaroneWare two sample linear rank test based on censored data using
# observations on tetrachloroethylene (PCE) (ppb) collected at one background
# and one compliance well. The data for this example are stored in
# EPA.09.Ex.16.5.PCE.df.
EPA.09.Ex.16.5.PCE.df
# Well.type PCE.Orig.ppb PCE.ppb Censored
#1 Background <4 4.0 TRUE
#2 Background 1.5 1.5 FALSE
#3 Background <2 2.0 TRUE
#4 Background 8.7 8.7 FALSE
#5 Background 5.1 5.1 FALSE
#6 Background <5 5.0 TRUE
#7 Compliance 6.4 6.4 FALSE
#8 Compliance 10.9 10.9 FALSE
#9 Compliance 7 7.0 FALSE
#10 Compliance 14.3 14.3 FALSE
#11 Compliance 1.9 1.9 FALSE
#12 Compliance 10 10.0 FALSE
#13 Compliance 6.8 6.8 FALSE
#14 Compliance <5 5.0 TRUE
with(EPA.09.Ex.16.5.PCE.df,
twoSampleLinearRankTestCensored(
x = PCE.ppb[Well.type == "Compliance"],
x.censored = Censored[Well.type == "Compliance"],
y = PCE.ppb[Well.type == "Background"],
y.censored = Censored[Well.type == "Background"],
test = "taroneware", alternative = "greater"))
#Results of Hypothesis Test
#Based on Censored Data
#
#
#Null Hypothesis: Fy(t) = Fx(t)
#
#Alternative Hypothesis: Fy(t) > Fx(t) for at least one t
#
#Test Name: TwoSample Linear Rank Test:
# TaroneWare Test
# with Hypergeometric Variance
#
#Censoring Side: left
#
#Censoring Level(s): x = 5
# y = 2 4 5
#
#Data: x = PCE.ppb[Well.type == "Compliance"]
# y = PCE.ppb[Well.type == "Background"]
#
#Censoring Variable: x = Censored[Well.type == "Compliance"]
# y = Censored[Well.type == "Background"]
#
#Sample Sizes: nx = 8
# ny = 6
#
#Percent Censored: x = 12.5%
# y = 50.0%
#
#Test Statistics: nu = 8.458912
# var.nu = 20.912407
# z = 1.849748
#
#Pvalue: 0.03217495
# Compare the pvalue for the TaroneWare test with pvalues from
# the logrank, Gehan, and PetoPeto tests
#
with(EPA.09.Ex.16.5.PCE.df,
twoSampleLinearRankTestCensored(
x = PCE.ppb[Well.type == "Compliance"],
x.censored = Censored[Well.type == "Compliance"],
y = PCE.ppb[Well.type == "Background"],
y.censored = Censored[Well.type == "Background"],
test = "taroneware", alternative = "greater"))$p.value
#[1] 0.03217495
with(EPA.09.Ex.16.5.PCE.df,
twoSampleLinearRankTestCensored(
x = PCE.ppb[Well.type == "Compliance"],
x.censored = Censored[Well.type == "Compliance"],
y = PCE.ppb[Well.type == "Background"],
y.censored = Censored[Well.type == "Background"],
test = "logrank", alternative = "greater"))$p.value
#[1] 0.02752793
with(EPA.09.Ex.16.5.PCE.df,
twoSampleLinearRankTestCensored(
x = PCE.ppb[Well.type == "Compliance"],
x.censored = Censored[Well.type == "Compliance"],
y = PCE.ppb[Well.type == "Background"],
y.censored = Censored[Well.type == "Background"],
test = "gehan", alternative = "greater"))$p.value
#[1] 0.03656224
with(EPA.09.Ex.16.5.PCE.df,
twoSampleLinearRankTestCensored(
x = PCE.ppb[Well.type == "Compliance"],
x.censored = Censored[Well.type == "Compliance"],
y = PCE.ppb[Well.type == "Background"],
y.censored = Censored[Well.type == "Background"],
test = "petopeto", alternative = "greater"))$p.value
#[1] 0.03127296
#==========
# The results shown in Table 9 of Millard and Deverel (1988, p.2097) are correct
# only for the hypergeometric variance and the modified MWW tests; the other
# results were computed as if there were no ties. Recompute the correct
# zstatistics and pvalues for the copper and zinc data.
test < c(rep(c("gehan", "logrank", "petopeto"), 2), "petopeto",
"normal.scores.1", "normal.scores.2", "normal.scores.2")
variance < c(rep("permutation", 3), rep("hypergeometric", 3),
"asymptotic", rep("permutation", 2), "hypergeometric")
stats.mat < matrix(as.numeric(NA), ncol = 4, nrow = 10)
for(i in 1:10) {
dum.list < with(Millard.Deverel.88.df,
twoSampleLinearRankTestCensored(
x = Cu[Zone == "Basin.Trough"],
x.censored = Cu.censored[Zone == "Basin.Trough"],
y = Cu[Zone == "Alluvial.Fan"],
y.censored = Cu.censored[Zone == "Alluvial.Fan"],
test = test[i], variance = variance[i]))
stats.mat[i, 1:2] < c(dum.list$statistic["z"], dum.list$p.value)
dum.list < with(Millard.Deverel.88.df,
twoSampleLinearRankTestCensored(
x = Zn[Zone == "Basin.Trough"],
x.censored = Zn.censored[Zone == "Basin.Trough"],
y = Zn[Zone == "Alluvial.Fan"],
y.censored = Zn.censored[Zone == "Alluvial.Fan"],
test = test[i], variance = variance[i]))
stats.mat[i, 3:4] < c(dum.list$statistic["z"], dum.list$p.value)
}
dimnames(stats.mat) < list(paste(test, variance, sep = "."),
c("Cu.Z", "Cu.p.value", "Zn.Z", "Zn.p.value"))
round(stats.mat, 2)
# Cu.Z Cu.p.value Zn.Z Zn.p.value
#gehan.permutation 0.87 0.38 2.49 0.01
#logrank.permutation 0.79 0.43 1.75 0.08
#petopeto.permutation 0.92 0.36 2.42 0.02
#gehan.hypergeometric 0.71 0.48 2.35 0.02
#logrank.hypergeometric 0.51 0.61 1.69 0.09
#petopeto.hypergeometric 1.03 0.30 2.59 0.01
#petopeto.asymptotic 0.90 0.37 2.37 0.02
#normal.scores.1.permutation 0.94 0.34 2.37 0.02
#normal.scores.2.permutation 0.98 0.33 2.39 0.02
#normal.scores.2.hypergeometric 1.28 0.20 2.48 0.01
#
# Clean up
#
rm(test, variance, stats.mat, i, dum.list)