DesignSurvey {capm} | R Documentation |
Survey design
Description
A wrapper for the svydesign function from the survey package, used to define one of the following survey designs: two-stage cluster, simple (systematic), or stratified. For two-stage cluster designs, weights are calculated assuming sampling with probability proportional to size and with replacement at the first stage, and simple random sampling at the second stage. The finite population correction is specified as the population size at each level of sampling.
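The weighting scheme for the two-stage cluster design can be illustrated with a minimal base R sketch. It assumes the textbook form of the weights: with n PSUs selected with probability proportional to size and with replacement, and a simple random sample of SSUs within each selected PSU, the weight of an SSU is the inverse of the product of the two sampling fractions. The object names (psu_frame, sampled_psu, ssu_sampled) and all numbers are hypothetical, and the computation inside DesignSurvey may differ in detail.

```r
# Hypothetical sampling frame: one row per PSU with its size in SSUs (households).
psu_frame <- data.frame(psu_id = c("A", "B", "C", "D"),
                        hh     = c(100, 200, 300, 400))

# Hypothetical sample: PSUs B and D were selected, 10 SSUs interviewed in each.
sampled_psu <- c("B", "D")
ssu_sampled <- c(B = 10, D = 10)

n_psu    <- length(sampled_psu)   # number of PSUs in the sample
total_hh <- sum(psu_frame$hh)     # number of SSUs in the population
hh_i     <- psu_frame$hh[match(sampled_psu, psu_frame$psu_id)]  # sizes of sampled PSUs

# First stage (PPS with replacement): expected number of selections of each PSU.
f1 <- n_psu * hh_i / total_hh
# Second stage (simple random sampling): sampling fraction within each PSU.
f2 <- ssu_sampled[sampled_psu] / hh_i

# Design weight of every SSU in a sampled PSU: inverse of the overall fraction.
weights <- 1 / (f1 * f2)
weights
```

When the same number of SSUs is sampled in every PSU, as here, the PSU size cancels out and all SSUs receive the same weight (a self-weighting design). The weights then add up to the number of SSUs in the population.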
Usage
DesignSurvey(sample = NULL, psu.ssu = NULL, psu.col = NULL,
ssu.col = NULL, cal.col = NULL, N = NULL, strata = NULL,
cal.N = NULL, ...)
Arguments
sample |
|
psu.ssu |
|
psu.col |
the column of |
ssu.col |
the column of |
cal.col |
the column of |
N |
for simple designs, a |
strata |
for stratified designs, a column of |
cal.N |
population total for the variable to calibrate the estimates. It must be used together with |
... |
further arguments passed to |
Details
For two-stage cluster designs, a PSU appearing in both psu.ssu and sample must have the same identifier. SSU identifiers must be unique, but an identifier can appear more than once if there is more than one observation per SSU. The sample argument must contain only the variables to be estimated plus the variables required to define the design (two-stage cluster or stratified). cal.col and cal.N are needed only if the estimates will be calibrated. The calibration is based on a population total.
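The identifier requirement and the idea of calibrating to a population total can be sketched in base R as follows. This is only an illustration under stated assumptions: the data frames psu_frame and smp, their columns, the starting weights and the population total are all hypothetical, and the simple ratio adjustment shown here is not necessarily the calibration method that DesignSurvey applies internally.

```r
# Hypothetical inputs mimicking the layout described above.
psu_frame <- data.frame(psu_id = c("A", "B", "C"),
                        hh     = c(100, 200, 300))
smp <- data.frame(psu_id  = c("A", "A", "C", "C"),
                  ssu_id  = c(1, 2, 3, 4),
                  persons = c(3, 2, 4, 1),      # variable used for calibration
                  weight  = c(50, 50, 75, 75))  # hypothetical design weights

# Every PSU identifier in the sample must also appear in the PSU frame.
stopifnot(all(smp$psu_id %in% psu_frame$psu_id))

# Ratio calibration: rescale the weights so that the weighted total of the
# calibration variable matches the known population total.
cal_total       <- 1000                             # hypothetical population total
estimated_total <- sum(smp$weight * smp$persons)    # weighted total before calibration
smp$cal_weight  <- smp$weight * cal_total / estimated_total

# The calibrated weights reproduce the population total (up to floating point).
sum(smp$cal_weight * smp$persons)
```

A check like the first stopifnot call can catch mismatched PSU identifiers before the design is built.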
Value
An object of class survey.design.
References
Lumley, T. (2011). Complex surveys: A guide to analysis using R (Vol. 565). Wiley.
Baquero, O. S., Marconcin, S., Rocha, A., & Garcia, R. D. C. M. (2018). Companion animal demography and population management in Pinhais, Brazil. Preventive Veterinary Medicine.
http://oswaldosantos.github.io/capm
Examples
data("cluster_sample")
data("psu_ssu")
## Calibrated two-stage cluster design
design <- DesignSurvey(na.omit(cluster_sample),
                       psu.ssu = psu_ssu,
                       psu.col = "census_tract_id",
                       ssu.col = "interview_id",
                       cal.col = "number_of_persons",
                       cal.N = 129445)
## Simple design
# If data in cluster_sample were from a simple design:
design <- DesignSurvey(na.omit(cluster_sample),
                       N = sum(psu_ssu$hh),
                       cal.N = 129445)
## Stratified design
# Simulate strata and assume that the data in cluster_sample came
# from a stratified design
cluster_sample$strat <- sample(c("urban", "rural"),
                               nrow(cluster_sample),
                               prob = c(.95, .05),
                               replace = TRUE)
cluster_sample$strat_size <- round(sum(psu_ssu$hh) * .95)
cluster_sample$strat_size[cluster_sample$strat == "rural"] <-
  round(sum(psu_ssu$hh) * .05)
design <- DesignSurvey(cluster_sample,
                       N = "strat_size",
                       strata = "strat",
                       cal.N = 129445)