costa {DOS} R Documentation

### Description

This data set is from Costa et al. (1993) and describes 21 welders and 26 potential controls, all of them men. The outcome, dpc, is a measure of genetic damage: specifically, DNA-protein cross-links. There are three covariates: age, race and smoking. The example is used to illustrate the concepts of multivariate matching in Chapter 8 of Design of Observational Studies. It is useful because its small size permits close inspection of the details of multivariate matching, but the small sample and limited number of covariates also make it highly atypical of matching in observational studies.
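
Because the data set is small enough to inspect by eye, a quick look at how the three covariates differ between welders and potential controls can help motivate matching. The following is only a minimal sketch in base R, not the analysis from Chapter 8; it assumes the DOS package is installed and that the columns are coded as described under Format.

```r
# Minimal sketch: compare covariates between welders and potential controls
# before any matching. Assumes the DOS package is installed.
if (requireNamespace("DOS", quietly = TRUE)) {
  data("costa", package = "DOS")

  # Group sizes by welder status (Y = welder, N = potential control).
  print(table(costa$welder))

  # Mean age in each group.
  print(tapply(costa$age, costa$welder, mean))

  # Row proportions of race and smoking status within each group.
  print(prop.table(table(costa$welder, costa$race), margin = 1))
  print(prop.table(table(costa$welder, costa$smoker), margin = 1))
}
```

Large differences in these summaries would indicate covariate imbalance of the kind that matching is intended to reduce.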

### Usage

data("costa")

### Format

A data frame with 47 observations on the following 6 variables.

subject

Within group ID number.

age

Age in years.

race

AA=African-American, C=Caucasian

smoker

Y=yes, N=no

welder

Y=yes/treated, N=no/control

dpc

A measure of DNA-protein cross-links; the outcome.
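
As a quick check that a loaded copy agrees with the format described above, one might run something like the following. This is a sketch that assumes the DOS package is installed; the checks simply restate the dimensions and variable names listed in this section.

```r
# Minimal sketch: confirm the documented structure of the data frame.
# Assumes the DOS package is installed.
if (requireNamespace("DOS", quietly = TRUE)) {
  data("costa", package = "DOS")

  # Compact overview of column types and first values.
  str(costa)

  # The documentation lists 47 observations on 6 variables.
  stopifnot(nrow(costa) == 47, ncol(costa) == 6)

  # All documented variable names should be present.
  expected <- c("subject", "age", "race", "smoker", "welder", "dpc")
  stopifnot(all(expected %in% names(costa)))
}
```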

### Source

The data are from Costa et al. (1993) and are used as a small example in Chapter 8 of Design of Observational Studies.

### References

Costa, M., Zhitkovich, A. and Toniolo, P. (1993). DNA-protein cross-links in welders: molecular implications. Cancer Research, 53(3), 460-463.

Rosenbaum, P. R. (2010). Design of Observational Studies. New York: Springer. This example is discussed in Chapter 8.

### Examples

data(costa)
boxplot(costa$dpc~costa$welder,
xlab="Control (N) or Welder (Y)",