R: Survey design

DesignSurvey {capm}

R Documentation

Survey design

Description

A wrapper for the svydesign function from the survey package that defines one of the following survey designs: two-stage cluster, simple (systematic), or stratified. For two-stage cluster designs, weights are calculated assuming sampling with probability proportional to size and with replacement at the first stage, and simple random sampling at the second stage. The finite population correction is specified as the population size at each sampling level.
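
The weighting scheme for two-stage cluster designs can be illustrated with a minimal base R sketch. The data frames `psu_frame` and `sampled`, and all of their column names, are hypothetical and are not part of capm. The formula is the usual inverse-probability weight for this kind of design and is not necessarily the exact code used inside DesignSurvey.

```r
# Hypothetical sampling frame: PSU identifiers and sizes (number of SSU per PSU).
psu_frame <- data.frame(psu_id = c("A", "B", "C", "D"),
                        size = c(100, 300, 400, 200))

# Hypothetical sample: 2 PSU selected, 10 SSU observed in each of them.
sampled <- data.frame(psu_id = c("B", "D"),
                      n_ssu = c(10, 10))

n_psu <- nrow(sampled)               # number of PSU in the sample
total_size <- sum(psu_frame$size)    # total number of SSU in the population
psu_size <- psu_frame$size[match(sampled$psu_id, psu_frame$psu_id)]

# First stage: probability proportional to size, with replacement
# (expected number of selections of each PSU).
p1 <- n_psu * psu_size / total_size

# Second stage: simple random sampling of SSU within each selected PSU.
p2 <- sampled$n_ssu / psu_size

# Sampling weight: inverse of the product of the two stage fractions.
sampled$weight <- 1 / (p1 * p2)
sampled
```

Because the same number of SSU is taken in every selected PSU, the PSU size cancels out and all observations receive the same weight.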

Usage

DesignSurvey(sample = NULL, psu.ssu = NULL, psu.col = NULL,
  ssu.col = NULL, cal.col = NULL, N = NULL, strata = NULL,
  cal.N = NULL, ...)

Arguments

`sample`	`data.frame` with sample observations. For two-stage cluster designs, one of the columns must contain unique identifiers for Primary Sampling Units (PSU) and another column must contain unique identifiers for Secondary Sampling Units (SSU).
`psu.ssu`	`data.frame` with all Primary Sampling Units (PSU). First column contains PSU unique identifiers. Second column contains `numeric` PSU sizes. It is used only for two-stage cluster designs.
`psu.col`	the column of `sample` containing the PSU identifiers. It is used only for two-stage cluster designs.
`ssu.col`	the column of `sample` containing the SSU identifiers. It is used only for two-stage cluster designs.
`cal.col`	the column of `sample` with the variable to calibrate estimates. It must be used together with `cal.N`.
`N`	for simple designs, a `numeric` value representing the total number of sampling units in the population. For stratified designs, a column of `sample` indicating, for each observation, the total number of sampling units in its stratum. `N` is ignored in two-stage cluster designs.
`strata`	for stratified designs, a column of `sample` indicating the stratum membership of each observation.
`cal.N`	population total for the variable used to calibrate the estimates. It must be used together with `cal.col`.
`...`	further arguments passed to the `svydesign` function.

Details

For two-stage cluster designs, a PSU appearing in both `psu.ssu` and `sample` must have the same identifier in each. SSU identifiers must be unique, but an identifier can appear more than once if there is more than one observation per SSU. The `sample` argument must contain only the variables to be estimated plus the variables required to define the design (two-stage cluster or stratified). `cal.col` and `cal.N` are needed only if the estimates will be calibrated. The calibration is based on a population total.
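
The requirements above can be checked before the design is built. The following base R sketch is one possible way to do so. The objects `psu_frame` and `smp`, and their column names, are hypothetical, and the checks are not performed by capm itself as far as this page states.

```r
# Hypothetical PSU frame (same layout as psu.ssu: identifiers, then sizes).
psu_frame <- data.frame(psu_id = c("A", "B", "C"),
                        size = c(120, 80, 200))

# Hypothetical sample: SSU 1 has two observations, SSU 7 has one.
smp <- data.frame(psu_id = c("A", "A", "C"),
                  ssu_id = c(1, 1, 7),
                  dogs = c(2, 0, 1),
                  notes = c("x", "y", "z"))

# PSU identifiers present in the sample but missing from the frame.
unmatched <- setdiff(smp$psu_id, psu_frame$psu_id)

# Number of distinct PSU each SSU identifier is linked to (should be 1).
psu_per_ssu <- tapply(smp$psu_id, smp$ssu_id,
                      function(x) length(unique(x)))

stopifnot(length(unmatched) == 0, all(psu_per_ssu == 1))

# Keep only the design columns and the variables to be estimated.
smp <- smp[, c("psu_id", "ssu_id", "dogs")]
```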

Value

An object of class survey.design.
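
Objects of this class are accepted by the estimation functions of the survey package. The sketch below builds a simple design directly with `svydesign`, as a stand-in for the output of DesignSurvey, and passes it to `svytotal` and `svymean`. The data frame `smp` and its columns are hypothetical.

```r
library(survey)

# Hypothetical simple random sample of 5 households from a population of 50.
smp <- data.frame(dogs = c(0, 1, 2, 1, 1),
                  N = 50)

# Simple design: no clusters, finite population correction given by N.
dsn <- svydesign(ids = ~1, fpc = ~N, data = smp)

inherits(dsn, "survey.design")  # check the class of the design object

svytotal(~dogs, dsn)  # estimated population total and its standard error
svymean(~dogs, dsn)   # estimated population mean and its standard error
```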

References

Lumley, T. (2011). Complex surveys: A guide to analysis using R (Vol. 565). Wiley.

Baquero, O. S., Marconcin, S., Rocha, A., & Garcia, R. D. C. M. (2018). Companion animal demography and population management in Pinhais, Brazil. Preventive Veterinary Medicine.

http://oswaldosantos.github.io/capm

Examples

data("cluster_sample")
data("psu_ssu")

## Calibrated two-stage cluster design
design <- DesignSurvey(na.omit(cluster_sample),
                       psu.ssu = psu_ssu,
                       psu.col = "census_tract_id",
                       ssu.col = "interview_id",
                       cal.col = "number_of_persons",
                       cal.N = 129445)

## Simple design
# If data in cluster_sample were from a simple design:
design <- DesignSurvey(na.omit(cluster_sample),
                       N = sum(psu_ssu$hh),
                       cal.col = "number_of_persons",
                       cal.N = 129445)

## Stratified design
# Simulate strata and assume that the data in cluster_sample came
# from a stratified design
cluster_sample$strat <- sample(c("urban", "rural"),
                               nrow(cluster_sample),
                               prob = c(.95, .05),
                               replace = TRUE)
cluster_sample$strat_size <- round(sum(psu_ssu$hh) * .95)
cluster_sample$strat_size[cluster_sample$strat == "rural"] <-
  round(sum(psu_ssu$hh) * .05)
design <- DesignSurvey(na.omit(cluster_sample),
                       N = "strat_size",
                       strata = "strat",
                       cal.col = "number_of_persons",
                       cal.N = 129445)

[Package capm version 0.14.0 Index]