MaximallySelectedStatisticsTests {coin} | R Documentation

Testing the independence of two sets of variables measured on arbitrary scales against cutpoint alternatives.

## S3 method for class 'formula'
maxstat_test(formula, data, subset = NULL, weights = NULL, ...)
## S3 method for class 'table'
maxstat_test(object, ...)
## S3 method for class 'IndependenceProblem'
maxstat_test(object, teststat = c("maximum", "quadratic"),
             distribution = c("asymptotic", "approximate", "none"),
             minprob = 0.1, maxprob = 1 - minprob, ...)

`formula` |
a formula of the form |

`data` |
an optional data frame containing the variables in the model formula. |

`subset` |
an optional vector specifying a subset of observations to be used. Defaults
to `NULL`.

`weights` |
an optional formula of the form |

`object` |
an object inheriting from classes |

`teststat` |
a character, the type of test statistic to be applied: either a maximum
statistic ( |

`distribution` |
a character, the conditional null distribution of the test statistic can be
approximated by its asymptotic distribution ( |

`minprob` |
a numeric, a fraction between 0 and 0.5 specifying that cutpoints only
greater than the |

`maxprob` |
a numeric, a fraction between 0.5 and 1 specifying that cutpoints only
smaller than the |

`...` |
further arguments to be passed to |
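The effect of `minprob` and `maxprob` on the set of candidate cutpoints can be sketched in base R. This is only an illustration under stated assumptions: `candidate_cutpoints` is a hypothetical helper that is not part of coin, it assumes the two fractions refer to empirical quantiles of a single numeric covariate, and it uses R's default quantile definition with strict inequalities, which may differ from the boundary handling inside coin.

```r
## Hypothetical helper (not part of coin): keep only those observed values
## of x that lie strictly between the minprob and maxprob quantiles.
candidate_cutpoints <- function(x, minprob = 0.1, maxprob = 1 - minprob) {
  stopifnot(minprob > 0, minprob < 0.5, maxprob > 0.5, maxprob < 1)
  ux <- sort(unique(x))
  ## Assumption: default (type 7) empirical quantiles of the covariate
  lower <- quantile(x, probs = minprob, names = FALSE)
  upper <- quantile(x, probs = maxprob, names = FALSE)
  ux[ux > lower & ux < upper]
}

## With the defaults, the extreme 10% at either end are not considered
candidate_cutpoints(1:20)
```

Restricting the search in this way keeps both groups defined by a cutpoint from becoming very small.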

`maxstat_test` provides generalized maximally selected statistics. The
family of maximally selected statistics encompasses a large collection of
procedures used for the estimation of simple cutpoint models including, but
not limited to, maximally selected *chi^2* statistics, maximally
selected Cochran-Armitage statistics, maximally selected rank statistics and
maximally selected statistics for multiple covariates. A general description
of these methods is given by Hothorn and Zeileis (2008).

The null hypothesis of independence, or conditional independence given
`block`, between `y1`, ..., `yq` and `x1`, ..., `xp` is tested against
cutpoint alternatives. All possible partitions into two groups are evaluated
for each unordered covariate `x1`, ..., `xp`, whereas only order-preserving
binary partitions are evaluated for ordered or numeric covariates. The
cutpoint is then a set of levels defining one of the two groups.
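The difference between the two kinds of partitions can be made concrete with a small base R sketch. `binary_partitions` is a hypothetical helper written for this illustration and is not part of coin. It lists the level sets that define one of the two groups: a covariate with k unordered levels admits 2^(k - 1) - 1 binary partitions, whereas an ordered covariate with k distinct levels admits only k - 1.

```r
## Hypothetical helper (not part of coin): enumerate the level sets that
## define the "left" group of every candidate binary partition.
binary_partitions <- function(lev, ordered = FALSE) {
  k <- length(lev)
  stopifnot(k >= 2)
  if (ordered) {
    ## Order-preserving: the left group is lev[1], ..., lev[i]
    return(lapply(seq_len(k - 1), function(i) lev[seq_len(i)]))
  }
  ## Unordered: always keep the first level on the left (so that each
  ## partition is counted once) and let every other level be in or out,
  ## excluding the case where all levels end up on the left.
  n_rest <- k - 1
  lapply(0:(2^n_rest - 2), function(code) {
    in_left <- (code %/% 2^(seq_len(n_rest) - 1)) %% 2 == 1
    c(lev[1], lev[-1][in_left])
  })
}

lev <- c("a", "b", "c", "d")
length(binary_partitions(lev))                  # 2^(4 - 1) - 1 = 7
length(binary_partitions(lev, ordered = TRUE))  # 4 - 1 = 3
```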

If both response and covariate are univariable, say `y1` and `x1`, this
procedure is known as maximally selected *chi^2* statistics (Miller and
Siegmund, 1982) when `y1` is a binary factor and `x1` is a numeric variable,
and as maximally selected rank statistics when `y1` is a rank transformed
numeric variable and `x1` is a numeric variable (Lausen and Schumacher,
1992). Lausen *et al.* (2004) introduced maximally selected statistics for a
univariable numeric response and multiple numeric covariates `x1`, ..., `xp`.
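The idea behind a maximally selected *chi^2* statistic can be illustrated by hand in base R: dichotomize the numeric covariate at every candidate cutpoint, compute the Pearson *chi^2* statistic of the resulting 2 x 2 table and keep the largest one. The data and the helper `chisq_2x2` below are made up for this sketch, no `minprob`/`maxprob` restriction is applied, and the code is not meant to reproduce how coin computes or standardizes its statistic.

```r
## Toy data, invented for illustration only
x <- c(1.2, 2.5, 3.1, 3.8, 4.4, 5.0, 5.9, 6.3, 7.7, 8.1)
y <- factor(c("no", "no", "no", "yes", "no", "yes", "yes", "yes", "yes", "yes"))

## Hypothetical helper: Pearson chi^2 statistic of y against a binary split
chisq_2x2 <- function(y, left) {
  tab <- table(y, factor(left, levels = c(TRUE, FALSE)))
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

## Candidate cutpoints: every observed value except the largest, so that
## both groups are non-empty
cuts <- head(sort(unique(x)), -1)
stats <- vapply(cuts, function(ct) chisq_2x2(y, x <= ct), numeric(1))

## Estimated cutpoint and the maximally selected statistic
best <- cuts[which.max(stats)]
best
max(stats)
```

Because the maximum is taken over many correlated statistics, it cannot be referred to an ordinary *chi^2* distribution, which is why a dedicated null distribution is needed.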

If, say, `y1` and/or `x1` are ordered factors, the default scores,
`1:nlevels(y1)` and `1:nlevels(x1)` respectively, can be altered using the
`scores` argument (see `independence_test`); this argument can also be used
to coerce nominal factors to class `"ordered"`. If both, say, `y1` and `x1`
are ordered factors, a linear-by-linear association test is computed and the
direction of the alternative hypothesis can be specified using the
`alternative` argument. The particular extension to the case of a
univariable ordered response and a univariable numeric covariate was given by
Betensky and Rabinowitz (1999) and is known as maximally selected
Cochran-Armitage statistics.

The conditional null distribution of the test statistic is used to obtain
*p*-values and an asymptotic approximation of the exact distribution is
used by default (`distribution = "asymptotic"`). Alternatively, the
distribution can be approximated via Monte Carlo resampling by setting
`distribution` to `"approximate"`. See `asymptotic` and `approximate` for
details.
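As a minimal sketch of the Monte Carlo alternative, the call below reuses the tree pipit data from the examples and passes the `approximate` function instead of the character string. It assumes a recent version of coin in which the number of replicates is set through `nresample`; the seed and the number of replicates are arbitrary choices made for this illustration.

```r
library("coin")

## Arbitrary seed so that the resampling result is reproducible
set.seed(290875)

## Monte Carlo approximation of the conditional null distribution
mt <- maxstat_test(counts ~ coverstorey, data = treepipit,
                   distribution = approximate(nresample = 1000))

statistic(mt)  # observed maximally selected statistic
pvalue(mt)     # resampling-based p-value
```

A larger number of replicates gives a more precise *p*-value at the cost of longer run time.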

`maxstat_test` returns an object inheriting from class `"IndependenceTest"`.

Starting with coin version 1.1-0, maximum statistics and quadratic forms
can no longer be specified using `teststat = "maxtype"` and
`teststat = "quadtype"`, respectively (as was used in versions prior to
0.4-5).

Betensky, R. A. and Rabinowitz, D. (1999). Maximally selected
*chi^2* statistics for *k x 2* tables.
*Biometrics* **55**(1), 317–320.
doi: 10.1111/j.0006-341X.1999.00317.x

Hothorn, T. and Lausen, B. (2003). On the exact distribution of maximally
selected rank statistics. *Computational Statistics & Data Analysis*
**43**(2), 121–137. doi: 10.1016/S0167-9473(02)00225-6

Hothorn, T. and Zeileis, A. (2008). Generalized maximally selected
statistics. *Biometrics* **64**(4), 1263–1269.
doi: 10.1111/j.1541-0420.2008.00995.x

Lausen, B., Hothorn, T., Bretz, F. and Schumacher, M. (2004). Assessment of
optimal selected prognostic factors. *Biometrical Journal* **46**(3),
364–374. doi: 10.1002/bimj.200310030

Lausen, B. and Schumacher, M. (1992). Maximally selected rank statistics.
*Biometrics* **48**(1), 73–85. doi: 10.2307/2532740

Miller, R. and Siegmund, D. (1982). Maximally selected chi square
statistics. *Biometrics* **38**(4), 1011–1016.
doi: 10.2307/2529881

Müller, J. and Hothorn, T. (2004). Maximally selected
two-sample statistics as a new tool for the identification and assessment of
habitat factors with an application to breeding bird communities in oak
forests. *European Journal of Forest Research* **123**(3), 219–228.
doi: 10.1007/s10342-004-0035-5

## Tree pipit data (Mueller and Hothorn, 2004)
## Asymptotic maximally selected statistics
maxstat_test(counts ~ coverstorey, data = treepipit)

## Asymptotic maximally selected statistics
## Note: all covariates simultaneously
mt <- maxstat_test(counts ~ ., data = treepipit)
mt@estimates$estimate

## Malignant arrhythmias data (Hothorn and Lausen, 2003, Sec. 7.2)
## Asymptotic maximally selected statistics
maxstat_test(Surv(time, event) ~ EF, data = hohnloser,
             ytrafo = function(data)
                 trafo(data, surv_trafo = function(y)
                     logrank_trafo(y, ties.method = "Hothorn-Lausen")))

## Breast cancer data (Hothorn and Lausen, 2003, Sec. 7.3)
## Asymptotic maximally selected statistics
data("sphase", package = "TH.data")
maxstat_test(Surv(RFS, event) ~ SPF, data = sphase,
             ytrafo = function(data)
                 trafo(data, surv_trafo = function(y)
                     logrank_trafo(y, ties.method = "Hothorn-Lausen")))

## Job satisfaction data (Agresti, 2002, p. 288, Tab. 7.8)
## Asymptotic maximally selected statistics
maxstat_test(jobsatisfaction)

## Asymptotic maximally selected statistics
## Note: 'Job.Satisfaction' and 'Income' as ordinal
maxstat_test(jobsatisfaction,
             scores = list("Job.Satisfaction" = 1:4,
                           "Income" = 1:4))

[Package *coin* version 1.4-1 Index]