maxstat.test {maxstat} | R Documentation |
Maximally Selected Rank Statistics
Description
Performs a test of independence of a response and one or more covariables using maximally selected rank statistics.
Usage
## S3 method for class 'data.frame'
maxstat.test(formula, data, subset, na.action, ...)
maxstat(y, x=NULL, weights = NULL, smethod=c("Wilcoxon", "Median",
"NormalQuantil","LogRank", "Data"), pmethod=c("none", "Lau92",
"Lau94", "exactGauss", "HL", "condMC", "min"), iscores=(pmethod=="HL"),
minprop = 0.1, maxprop=0.9, alpha = NULL, keepxy=TRUE, ...)
Arguments
y |
numeric vector of data values, dependent variable. |
x |
numeric vector of data values, independent variable. |
weights |
an optional numeric vector of non-negative weights, summing to the number of observations. |
smethod |
kind of statistic to be computed, i.e. defines the scores to be used for computing the statistic. |
pmethod |
kind of p-value approximation to be used. |
iscores |
logical: should the scores be mapped into integers
|
minprop |
at least |
maxprop |
not more than |
alpha |
significance level, the appropriate quantile is computed if
|
keepxy |
logical: return y and x as elements of the returned object. |
formula |
a formula describing the model to be tested of the form
|
data |
a data frame containing the variables in the model formula. |
subset |
an optional vector specifying a subset of observations to be used. |
na.action |
a function which indicates what should happen when the data contain NAs. |
... |
additional parameters to be passed to
|
Details
The predictive power of a variable x for a dependent variable y can be assessed with a maximally selected rank statistic.
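To make the idea concrete, the following base R sketch computes a maximally selected Wilcoxon statistic by hand: for every admissible cutpoint in x, the observations are split into two groups, a standardized two-sample rank statistic is computed, and the maximum over all cutpoints is kept. The helper name max_selected_stat is hypothetical and not part of the package, and restricting cutpoints by the proportion of observations in the first group is an assumption made for illustration.

```r
# Hypothetical helper (not part of maxstat): standardized two-sample rank
# statistic at every admissible cutpoint of x, and its maximum.
max_selected_stat <- function(y, x, minprop = 0.1, maxprop = 0.9) {
  n <- length(y)
  scores <- rank(y)                      # Wilcoxon scores (midranks for ties)
  cuts <- sort(unique(x))
  cuts <- cuts[-length(cuts)]            # largest x would leave group 2 empty
  # proportion of observations falling into the first group (x <= cutpoint)
  prop <- vapply(cuts, function(cp) mean(x <= cp), numeric(1))
  cuts <- cuts[prop >= minprop & prop <= maxprop]
  stats <- vapply(cuts, function(cp) {
    left <- x <= cp
    m <- sum(left)
    expected <- m * mean(scores)
    variance <- m * (n - m) / (n * (n - 1)) * sum((scores - mean(scores))^2)
    abs(sum(scores[left]) - expected) / sqrt(variance)
  }, numeric(1))
  best <- which.max(stats)
  list(statistic = stats[best], estimate = cuts[best])
}

set.seed(29)
x <- sort(runif(20))
y <- c(rnorm(10), rnorm(10, 2))
res <- max_selected_stat(y, x, minprop = 0.25, maxprop = 0.75)
print(res)
```

The returned estimate is the cutpoint at which the two groups differ most. Because the cutpoint is chosen after looking at the data, the maximum does not follow a standard normal distribution, which is why a dedicated p-value approximation is needed.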
smethod determines the kind of statistic to be used. Wilcoxon and Median denote maximally selected Wilcoxon and Median statistics. NormalQuantil and LogRank denote van der Waerden and log-rank scores.
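As a rough illustration of what the different scores look like, the sketch below computes Wilcoxon, median and van der Waerden scores for a small vector in base R. The median-score convention used here (1 for ranks above the middle, 0 otherwise) is one common choice and an assumption, as is the n + 1 denominator for the normal quantiles; the package may differ in detail. Log-rank scores require censoring information and are omitted.

```r
y <- c(2.3, 0.4, 1.7, 3.1, 0.9)
n <- length(y)
r <- rank(y)

# Wilcoxon scores: the ranks themselves
wilcoxon_scores <- r

# Median scores (assumed convention): 1 if the rank lies above the middle
median_scores <- as.numeric(r > (n + 1) / 2)

# van der Waerden scores: standard normal quantiles of the scaled ranks
normal_scores <- qnorm(r / (n + 1))

print(data.frame(y, wilcoxon_scores, median_scores, normal_scores))
```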
pmethod specifies which kind of approximation of the p-value should be used. Lau92 is the limiting distribution by a Brownian bridge (see Lausen and Schumacher, 1992, and pLausen92), Lau94 the approximation based on an improved Bonferroni inequality (see Lausen, Sauerbrei and Schumacher, 1994, and pLausen94).
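For orientation, the Brownian bridge approximation is usually written in the Miller and Siegmund form, P(M > b) ≈ 4 φ(b)/b + φ(b) (b − 1/b) log[maxprop (1 − minprop) / ((1 − maxprop) minprop)], where φ is the standard normal density. The sketch below evaluates this expression in base R. The function name lau92_approx is hypothetical, the clipping to [0, 1] is an added assumption, and the result should be checked against pLausen92 before relying on it.

```r
# Hypothetical helper, not the package's pLausen92: tail approximation for
# the maximum of a standardized Brownian bridge over [minprop, maxprop].
lau92_approx <- function(b, minprop = 0.1, maxprop = 0.9) {
  db <- dnorm(b)
  p <- 4 * db / b +
    db * (b - 1 / b) *
    log(maxprop * (1 - minprop) / ((1 - maxprop) * minprop))
  pmin(pmax(p, 0), 1)   # clip to a valid probability (assumption)
}

p25 <- lau92_approx(2.5)
p30 <- lau92_approx(3)
print(c(p25, p30))
```

A wider selection interval (smaller minprop, larger maxprop) increases the logarithmic term and therefore the approximate p-value for the same observed statistic.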
exactGauss returns the exact p-value for a maximally selected Gauss statistic, see Hothorn and Lausen (2003).
HL is a small sample approximation based on the Streitberg-Röhmel algorithm (see pperm) introduced by Hothorn and Lausen (2003). This requires integer valued scores. For van der Waerden and log-rank scores we try to find integer valued scores having the same shape. This results in slightly different scores (and a different test); the procedure is described in Hothorn (2001) and Hothorn and Lausen (2003).
All the approximations are known to be conservative, so min gives the minimum p-value of all procedures.
condMC simulates the distribution via conditional Monte-Carlo.
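The conditional Monte-Carlo idea can be sketched in base R as a permutation test: the observed scores are held fixed, randomly reassigned to the ordered x values B times, and the maximally selected statistic is recomputed each time. The helper max_stat is hypothetical, assumes no ties in x, and restricts cutpoints by group proportion; it is meant to show the principle rather than reproduce the package's implementation.

```r
# Hypothetical helper: maximum standardized rank statistic over all
# admissible cutpoints, assuming x has no ties.
max_stat <- function(scores, x, minprop = 0.1, maxprop = 0.9) {
  n <- length(scores)
  s <- scores[order(x)]
  m <- seq_len(n - 1)                              # size of the first group
  m <- m[m / n >= minprop & m / n <= maxprop]
  v <- m * (n - m) / (n * (n - 1)) * sum((s - mean(s))^2)
  max(abs(cumsum(s)[m] - m * mean(s)) / sqrt(v))
}

set.seed(29)
x <- sort(runif(20))
y <- c(rnorm(10), rnorm(10, 2))
scores <- rank(y)

observed <- max_stat(scores, x, minprop = 0.25, maxprop = 0.75)

# Permute the scores against x to simulate the conditional null distribution
B <- 999
perm <- replicate(B, max_stat(sample(scores), x, minprop = 0.25, maxprop = 0.75))
p_value <- mean(perm >= observed)
print(c(statistic = observed, p.value = p_value))
```

The simulated p-value depends on the random seed and on B; a larger B reduces the Monte-Carlo error at the cost of run time.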
For survival problems, i.e. using a maximally selected log-rank statistic, the interface is similar to survfit. The dependent variable is a survival object Surv(time, event). The argument event may be a numeric vector of 0 (alive) and 1 (dead) or a vector of logicals with TRUE indicating death.
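The two event codings can be compared directly. The short sketch below builds the same survival object once from a 0/1 vector and once from logicals, using made-up toy values, and checks that the stored status agrees. It assumes the survival package is installed.

```r
library(survival)

# Toy data (made up for illustration)
time <- c(5, 8, 12, 3)
event_numeric <- c(1, 0, 1, 1)               # 0 = alive, 1 = dead
event_logical <- c(TRUE, FALSE, TRUE, TRUE)  # TRUE = dead

s_num <- Surv(time, event_numeric)
s_lgl <- Surv(time, event_logical)
print(s_num)   # censored observations are printed with a trailing "+"

# Both codings lead to the same stored status column
same_status <- all(unclass(s_num)[, "status"] == unclass(s_lgl)[, "status"])
print(same_status)
```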
If more than one covariable is specified in the right hand side of formula (or if x is a matrix or data frame), the variable with the smallest p-value is selected and the p-value for the global test problem of independence of y and every variable on the right hand side is returned (see Lausen et al., 2004).
Value
An object of class maxtest or mmaxtest (if more than one covariable was specified) containing the following components is returned:
statistic |
the value of the test statistic. |
p.value |
the p-value for the test. |
smethod |
the type of test applied. |
pmethod |
the type of p-value approximation applied. |
estimate |
the estimated cutpoint (of |
maxstats |
a list of |
whichmin |
an integer specifying the element of |
p.value |
the p-value of the global test. |
univp.values |
the p-values for each of the variables under test. |
cm |
the correlation matrix the p-value is based on. |
plot.maxtest and print.maxtest can be used for plotting and printing. If keepxy = TRUE, there are elements y and x giving the response and independent variable.
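The components listed above can be extracted from the returned object with the usual $ operator. The following sketch assumes the maxstat package is installed and is skipped otherwise; the simulated data are for illustration only.

```r
if (requireNamespace("maxstat", quietly = TRUE)) {
  set.seed(29)
  x <- sort(runif(20))
  y <- c(rnorm(10), rnorm(10, 2))
  mod <- maxstat::maxstat.test(y ~ x, data = data.frame(x, y),
                               smethod = "Wilcoxon", pmethod = "Lau92",
                               minprop = 0.25, maxprop = 0.75)
  print(mod$statistic)  # value of the test statistic
  print(mod$p.value)    # approximated p-value
  print(mod$estimate)   # estimated cutpoint
}
```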
References
Hothorn, T. and Lausen, B. (2003). On the Exact Distribution of Maximally Selected Rank Statistics. Computational Statistics & Data Analysis, 43, 121–137.
Lausen, B. and Schumacher, M. (1992). Maximally Selected Rank Statistics. Biometrics, 48, 73–85.
Lausen, B., Sauerbrei, W. and Schumacher, M. (1994). Classification and Regression Trees (CART) used for the exploration of prognostic factors measured on different scales. In: P. Dirschedl and R. Ostermann (Eds), Computational Statistics, Heidelberg, Physica-Verlag, 483–496.
Hothorn, T. (2001). On Exact Rank Tests in R. R News, 1, 11–12.
Lausen, B., Hothorn, T., Bretz, F. and Schumacher, M. (2004). Assessment of Optimally Selected Prognostic Factors. Biometrical Journal, 46(3), 364–374.
Examples
set.seed(29)
x <- sort(runif(20))
y <- c(rnorm(10), rnorm(10, 2))
mydata <- data.frame(cbind(x,y))
mod <- maxstat.test(y ~ x, data=mydata, smethod="Wilcoxon", pmethod="HL",
                    minprop=0.25, maxprop=0.75, alpha=0.05)
print(mod)
plot(mod)
# adjusted for more than one prognostic factor.
library("survival")
mstat <- maxstat.test(Surv(time, cens) ~ IPI + MGE, data=DLBCL,
                      smethod="LogRank", pmethod="exactGauss",
                      abseps=0.01)
plot(mstat)
### sphase
set.seed(29)
data("sphase", package = "TH.data")
maxstat.test(Surv(RFS, event) ~ SPF, data=sphase, smethod="LogRank",
             pmethod="Lau94")
maxstat.test(Surv(RFS, event) ~ SPF, data=sphase, smethod="LogRank",
             pmethod="Lau94", iscores=TRUE)
maxstat.test(Surv(RFS, event) ~ SPF, data=sphase, smethod="LogRank",
             pmethod="HL")
maxstat.test(Surv(RFS, event) ~ SPF, data=sphase, smethod="LogRank",
             pmethod="condMC", B = 9999)
plot(maxstat.test(Surv(RFS, event) ~ SPF, data=sphase, smethod="LogRank"))