xdesign {survey} | R Documentation |
Crossed effects and other sparse correlations
Description
Defines a design object with multiple dimensions of correlation: observations that share any of the id variables are correlated, or you can supply an adjacency matrix or Matrix to specify which are correlated. Supports crossed designs (e.g. multiple raters of multiple objects) and non-nested observational correlation (e.g. observations sharing primary school or secondary school). Has methods for svymean, svytotal, and svyglm (so far).
Usage
xdesign(id = NULL, strata = NULL, weights = NULL, data, fpc = NULL,
adjacency = NULL, overlap = c("unbiased", "positive"), allow.non.binary = FALSE)
Arguments
id | list of formulas specifying cluster identifiers for each clustering dimension (or |
strata | Not implemented |
weights | model formula specifying (sampling) weights |
data | data frame containing all the variables |
fpc | Not implemented |
adjacency | Adjacency matrix or Matrix indicating which pairs of observations are correlated |
overlap | See details below |
allow.non.binary | If |
Details
Subsetting for these objects actually drops observations; it is not equivalent to just setting weights to zero as for survey designs. So, for example, a subset of a balanced design will not be a balanced design.
The overlap option controls double-counting of some variance terms. Suppose there are two clustering dimensions, ~a and ~b. If we compute variance matrices clustered on a and clustered on b and add them, observations that share both a and b will be counted twice, giving a positively biased estimator. We can subtract off a variance matrix clustered on combinations of a and b to give an unbiased variance estimator. However, the unbiased estimator is not guaranteed to be positive definite. In the references, Miglioretti and Heagerty use the overlap="positive" estimator and Cameron et al. use the overlap="unbiased" estimator.
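In symbols, the two choices can be sketched as follows. The notation is introduced here for illustration only and is not part of the package: \hat{V}_a and \hat{V}_b stand for the variance matrices clustered on a and on b, and \hat{V}_{a \cap b} for the variance matrix clustered on combinations of a and b, as described above.

```latex
\begin{aligned}
\hat{V}_{\text{positive}} &= \hat{V}_a + \hat{V}_b, \\
\hat{V}_{\text{unbiased}} &= \hat{V}_a + \hat{V}_b - \hat{V}_{a \cap b}.
\end{aligned}
```

The subtraction removes the terms that were counted twice. A difference of positive semidefinite matrices need not itself be positive semidefinite, which is why the unbiased form can fail to be positive definite.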
Value
An object of class xdesign
References
Miglioretti, D., & Heagerty, P. J. (2007). Marginal modeling of nonnested multilevel data using standard software. American Journal of Epidemiology, 165(4), 453-463.
Cameron, A. C., Gelbach, J. B., & Miller, D. L. (2011). Robust Inference With Multiway Clustering. Journal of Business & Economic Statistics, 29(2), 238-249.
https://notstatschat.rbind.io/2021/09/18/crossed-clustering-and-parallel-invention/
See Also
Examples
## With one clustering dimension, the result is close to the with-replacement
## survey estimator, but not identical unless clusters are of equal size
data(api)
dclus1r<-svydesign(id=~dnum, weights=~pw, data=apiclus1)
xclus1<-xdesign(id=list(~dnum), weights=~pw, data=apiclus1)
xclus1
svymean(~enroll,dclus1r)
svymean(~enroll,xclus1)
data(salamander)
xsalamander<-xdesign(id=list(~Male, ~Female), data=salamander,
    overlap="unbiased")
xsalamander
degf(xsalamander)