svyweight {svyweight} | R Documentation |
svyweight: Quick and Flexible Rake Weighting
Description
svyweight is a package for quickly and flexibly calculating
rake weights (also known as rim weights). It is
designed to interact with survey.design objects generated via
survey::svydesign(), and to otherwise build on functionality
from Thomas Lumley's 'survey' package.
Rake weighting concepts
Post-stratification weights are commonly used in survey research to ensure that a sample is representative of the population it is drawn from, in cases where some people selected for inclusion in the sample might decline to participate. To calculate post-stratification weights, observed categorical variables in a survey dataset (usually demographic variables) must be matched with "targets" (typically known population demographics from census data). Survey respondents from underrepresented categories are upweighted, while respondents from overrepresented categories are downweighted.
svyweight implements "rake" or "rim" weighting (sometimes known as iterative proportional fitting). This is a widely used method for simultaneously calculating weights on multiple variables when no joint distribution for these variables is known. For example, population data on past vote (from election results) and age (from the census) are generally known. However, as the joint distribution of past vote and age is not generally known, a technique such as rake weighting must be used to apply weights on both variables simultaneously.
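The idea of iterative proportional fitting can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data, the target proportions, and the helper name rake_sketch() are all hypothetical, and the loop simply rescales weights one variable at a time until they stop changing.

```r
# Toy sample of 10 respondents (hypothetical data)
dat <- data.frame(
  age  = c(rep("young", 6), rep("old", 4)),
  vote = c("A", "A", "A", "A", "B", "B", "A", "B", "B", "B")
)

# Known marginal targets as proportions; the joint distribution is unknown
targets <- list(
  age  = c(young = 0.5, old = 0.5),
  vote = c(A = 0.4, B = 0.6)
)

# Hypothetical helper: rake weights to each margin in turn until convergence
rake_sketch <- function(data, targets, max_iter = 100, tol = 1e-8) {
  n <- nrow(data)
  w <- rep(1, n)  # start from equal weights
  for (iter in seq_len(max_iter)) {
    w_old <- w
    for (v in names(targets)) {
      groups  <- as.character(data[[v]])
      current <- tapply(w, groups, sum)  # weighted total per category now
      desired <- targets[[v]] * n        # weighted total per category wanted
      # Scale each respondent's weight by desired / current for their category
      w <- w * as.numeric(desired[groups]) / as.numeric(current[groups])
    }
    # Stop once a full pass over all variables no longer changes the weights
    if (max(abs(w - w_old)) < tol) break
  }
  w
}

w <- rake_sketch(dat, targets)

# Weighted margins, as proportions, for comparison with the targets
tapply(w, dat$age, sum) / sum(w)
tapply(w, dat$vote, sum) / sum(w)
```

Each pass matches one margin exactly and slightly disturbs the others, which is why the procedure has to iterate rather than adjust each variable once.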
Package features
The core function in svyweight is rakesvy()
(and the related rakew8()).
It calculates post-stratification weights
given A) a data frame or a survey.design
object generated by svydesign(),
and B) a set of weighting targets.
The command is designed to make weighting as simple as
possible, with the following features:
Weighting to either counts or percentage targets
Allowing specification of targets as vectors, matrices, or data frames
Accepting targets of 0 (equivalent to dropping cases from analysis)
Allowing targets to be quickly rebased to a specified sample size
Flexibly matching targets to the correct variables in a dataset
Dynamically specifying weight targets based on recodes of variables in observed data
The package does this in part by introducing the w8margin
object class. A w8margin object stores the desired raw count for each
category of a variable, in the format required for actually computing weights.
However, this format is somewhat cumbersome to specify manually. The package includes methods
for converting named vectors, matrices, and data frames to w8margin objects;
rakesvy()
and rakew8()
call these methods automatically.
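To give a sense of why this conversion is tedious by hand, the following base R sketch turns a named vector of proportions into a table of raw counts. The helper targets_to_margin() is hypothetical and is not one of the package's w8margin methods. The output layout (one column named after the variable plus a Freq column) follows what survey::rake() accepts for population margins. Whether w8margin objects use exactly this layout is an assumption.

```r
# Hypothetical helper: named vector of targets -> data frame of raw counts
targets_to_margin <- function(targets, varname, samplesize) {
  stopifnot(is.numeric(targets), !is.null(names(targets)), all(targets >= 0))
  # Rebase the targets so that the counts sum to the requested sample size
  counts <- targets / sum(targets) * samplesize
  margin <- data.frame(names(targets), unname(counts), stringsAsFactors = FALSE)
  names(margin) <- c(varname, "Freq")
  margin
}

# Proportions for a variable called "age", rebased to a sample of 10
age_margin <- targets_to_margin(c(young = 0.5, old = 0.5), "age", samplesize = 10)
age_margin
```

Because the targets are divided by their sum before rebasing, the same helper accepts either proportions or counts as input.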
At present, the core weighting calculations are actually performed via the
'survey' package's survey::rake()
function. This might change
with future releases, although the basic approach to iterative
weighting is not expected to change.
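For reference, a direct call to survey::rake() looks roughly like the sketch below. The toy data and target counts are hypothetical, and the control settings are chosen only so that this very small example converges tightly. The sketch does not show how svyweight calls the function internally.

```r
library(survey)

# Toy sample of 10 respondents (hypothetical data) with equal starting weights
dat <- data.frame(
  age  = factor(c(rep("young", 6), rep("old", 4))),
  vote = factor(c("A", "A", "A", "A", "B", "B", "A", "B", "B", "B")),
  w0   = 1
)

# Simple design with no clustering
design <- svydesign(ids = ~1, weights = ~w0, data = dat)

# Population margins as data frames of raw counts, one per variable
pop_age  <- data.frame(age = c("old", "young"), Freq = c(5, 5))
pop_vote <- data.frame(vote = c("A", "B"), Freq = c(4, 6))

raked <- rake(
  design,
  sample.margins     = list(~age, ~vote),
  population.margins = list(pop_age, pop_vote),
  control            = list(maxit = 50, epsilon = 1e-6, verbose = FALSE)
)

# Weighted margins after raking, for comparison with the targets
svytable(~age, raked)
svytable(~vote, raked)
```

Each margin has to be supplied both as a formula and as a separate table of counts, which is the kind of bookkeeping that rakesvy() and rakew8() are described as simplifying.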
The package is under development. Contributions to the package, or suggestions for additional features, are gratefully accepted via email or GitHub.
Author(s)
Ben Mainwaring (mainwaringb@gmail.com, https://www.linkedin.com/in/mainwaringb)
References
Lumley, Thomas. 2011. Complex Surveys: A Guide to Analysis Using R. New York: Wiley.
See Also
Package GitHub repository: https://github.com/mainwaringb/svyweight