matchControls {e1071}    R Documentation

Find Matched Control Group

Description

Finds controls that match the cases as closely as possible.

Usage

matchControls(formula, data = list(), subset, contlabel = "con",
               caselabel = NULL, dogrep = TRUE, replace = FALSE)

Arguments

formula

A formula indicating cases, controls and the variables to be matched. Details are described below.

data

an optional data frame containing the variables in the model. By default, the variables are taken from the environment from which matchControls is called.

subset

an optional vector specifying a subset of observations to be used in the matching process.

contlabel

A string giving the label of the control group.

caselabel

A string giving the label of the cases. If NULL (the default), all observations that are not controls are taken as cases.

dogrep

If TRUE, then contlabel and caselabel are matched using grep; otherwise string comparison (exact equality) is used.

replace

If FALSE, then every control is used only once.

Details

The left-hand side of the formula must be a factor determining whether an observation belongs to the case or the control group. By default, all observations where a grep of contlabel matches are used as possible controls, and the rest are taken as cases. If caselabel is given, then only the observations matching caselabel are taken as cases. If dogrep = TRUE, then both contlabel and caselabel can be regular expressions.
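The difference between grep matching and exact comparison can be illustrated with base R alone. The following is a minimal sketch, not the package's code: the factor grp and its levels are made up for illustration, and grepl and == stand in for what dogrep = TRUE and dogrep = FALSE are described as doing.

```r
# Hypothetical group labels (illustration only)
grp <- factor(c("case", "cont", "control", "case2", "other"))
lab <- as.character(grp)

# Analogue of dogrep = TRUE: the label is treated as a regular expression,
# so the default "con" matches both "cont" and "control"
is_control_grep <- grepl("con", lab)

# Analogue of dogrep = FALSE: only labels exactly equal to "con" would match
is_control_exact <- lab == "con"

# A regular expression can also be used to select the cases
is_case_grep <- grepl("^case", lab)

print(data.frame(lab, is_control_grep, is_control_exact, is_case_grep))
```

With exact comparison, the default contlabel = "con" would match none of these labels, so contlabel would have to be set to the full level name.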

The right hand side of the formula gives the variables that should be matched. The matching is done using the daisy distance from the cluster package, i.e., a model frame is built from the formula and used as input for daisy. For each case, the nearest control is selected. If replace = FALSE, each control is used only once.
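The matching step described above can be sketched by hand as follows. This is an illustrative reimplementation under stated assumptions, not the function's actual source: the data frame dat is invented, cases are processed in row order, ties are broken by which.min (first minimum), and controls are used without replacement.

```r
library(cluster)  # provides daisy()

# Hypothetical data: 3 cases and 6 potential controls
dat <- data.frame(
  group = factor(c(rep("case", 3), rep("cont", 6))),
  Age   = c(40, 50, 60, 41, 39, 52, 70, 58, 45),
  Sex   = factor(c("M", "F", "M", "M", "F", "F", "M", "M", "F"))
)

# Pairwise dissimilarities on the matching variables only
d <- as.matrix(daisy(dat[, c("Age", "Sex")]))

cases    <- rownames(dat)[dat$group == "case"]
controls <- rownames(dat)[dat$group == "cont"]

# For each case, pick the nearest remaining control
matched <- character(length(cases))
for (k in seq_along(cases)) {
  dk <- d[cases[k], controls]
  matched[k] <- controls[which.min(dk)]
  # Analogue of replace = FALSE: remove the chosen control from the pool
  controls <- setdiff(controls, matched[k])
}

print(data.frame(case = cases, control = matched))
```

Because the selection is made one case at a time, the result of such a procedure can depend on the order in which cases are processed; it does not necessarily minimise the total distance over all pairs.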

Value

Returns a list with components

cases

Row names of cases.

controls

Row names of matched controls.

factor

A factor with 2 levels indicating cases and controls (the rest is set to NA).
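As a brief illustration of how the returned components might be used, the sketch below extracts the matched subset from a data frame. The data frame dat and its column names are hypothetical; the code assumes that the e1071 and cluster packages are installed and that the returned row names can be used to index dat.

```r
library(e1071)

set.seed(1)  # the simulated data are random
# Hypothetical data frame with 20 cases and 60 potential controls
dat <- data.frame(
  group = factor(c(rep("case", 20), rep("cont", 60))),
  Age   = c(40 + 5 * rnorm(20), 45 + 10 * rnorm(60)),
  Sex   = factor(sample(c("M", "F"), 80, replace = TRUE))
)

m <- matchControls(group ~ Age + Sex, data = dat)

# Rows of the cases together with their matched controls
matched <- dat[c(m$cases, m$controls), ]

# m$factor covers all rows; unmatched observations are NA
table(m$factor, useNA = "ifany")
```

With replace = FALSE, the matched subset contains as many controls as cases, which makes it convenient for side-by-side comparisons of the two groups.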

Author(s)

Friedrich Leisch

Examples

library(e1071)

Age.case <- 40 + 5 * rnorm(50)
Age.cont <- 45 + 10 * rnorm(150)
Age <- c(Age.case, Age.cont)

Sex.case <- sample(c("M", "F"), 50, prob = c(.4, .6), replace = TRUE)
Sex.cont <- sample(c("M", "F"), 150, prob = c(.6, .4), replace = TRUE)
Sex <- as.factor(c(Sex.case, Sex.cont))

casecont <- as.factor(c(rep("case", 50), rep("cont", 150)))

## now look at the group properties:
boxplot(Age ~ casecont)
barplot(table(Sex, casecont), beside = TRUE)

m <- matchControls(casecont ~ Sex + Age)

## properties of the new groups:
boxplot(Age ~ m$factor)
barplot(table(Sex, m$factor))

[Package e1071 version 1.7-14 Index]