makedata {rsae} | R Documentation |
Synthetic Data Generation for the Basic Unit-Level SAE Model
Description
This function generates synthetic data (possibly contaminated by outliers) for the basic unit-level SAE model.
Usage
makedata(seed = 1024, intercept = 1, beta = 1, n = 4, g = 20, areaID = NULL,
ve = 1, ve.contam = 41, ve.epsilon = 0, vu = 1, vu.contam = 41,
vu.epsilon = 0)
Arguments
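seed | seed for the random number generator |
intercept | intercept of the regression model |
beta | regression coefficients \beta |
n | number of units n_i in each area |
g | number of areas |
areaID | |
ve | variance of the model error, v_e |
ve.contam | variance of the contamination part of the model-error distribution, v_e^{\epsilon} |
ve.epsilon | degree of contamination of the model error, \epsilon^{ve} |
vu | variance of the area-level random effect, v_u |
vu.contam | variance of the contamination part of the random-effect distribution, v_u^{\epsilon} |
vu.epsilon | degree of contamination of the random effect, \epsilon^{vu} |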
Details
Let y_i denote an area-specific n_i-vector of the response variable for the areas i = 1, ..., g. Define an (n_i \times p)-matrix X_i of realizations from the standard normal distribution, N(0,1), and let \beta denote a p-vector of regression coefficients. The y_i are then drawn according to the law
y_i \sim N(X_i\beta, v_e I_i + v_u J_i),
where v_e and v_u are the variances of the model error and the area-level random effect, respectively, and I_i and J_i denote the (n_i \times n_i) identity matrix and matrix of ones, respectively.
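The sampling law above can be illustrated with a minimal base-R sketch. It assumes equal area sizes and builds the response from an area-level effect u_i and a unit-level error e_{i,j}, which gives the stated covariance v_e I_i + v_u J_i. All object names are illustrative, the intercept argument is left out because the law as stated has none, and this is not the code that makedata itself runs.

```r
# Sketch: draw y_i ~ N(X_i beta, v_e I_i + v_u J_i) for g areas of equal size n.
# All object names are illustrative; this is not the rsae implementation.
set.seed(1024)
g <- 20; n <- 4; p <- 1          # number of areas, units per area, regressors
beta <- rep(1, p)                # p-vector of regression coefficients
ve <- 1; vu <- 1                 # model-error and random-effect variances

area <- rep(seq_len(g), each = n)          # area label of each unit
X <- matrix(rnorm(g * n * p), ncol = p)    # stacked X_i with N(0, 1) entries
u <- rnorm(g, sd = sqrt(vu))               # one random effect per area
e <- rnorm(g * n, sd = sqrt(ve))           # one model error per unit
y <- drop(X %*% beta) + u[area] + e        # stacked response vectors y_i

# Within-area covariance implied by the construction: v_e I + v_u J
V <- ve * diag(n) + vu * matrix(1, n, n)
```

Units in the same area share u_i, which is what produces the off-diagonal entries v_u in V.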
In addition, the distributions of the model error (residual) e_{i,j} and the area-level random effect u_i may be contaminated (cf. Stahel and Welsh, 1997). Specifically, the laws of e_{i,j} and u_i are replaced by the Tukey-Huber contamination mixtures:
- e_{i,j} \sim (1-\epsilon^{ve})N(0,v_e) + \epsilon^{ve}N(0, v_e^{\epsilon})
- u_{i} \sim (1-\epsilon^{vu})N(0,v_u) + \epsilon^{vu}N(0, v_u^{\epsilon})
where \epsilon^{ve} and \epsilon^{vu} regulate the degree of contamination, and v_e^{\epsilon} and v_u^{\epsilon} are the variances of the contaminating components of the mixtures.
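As a rough illustration of the contamination mixture, the following base-R sketch first picks the mixture component of each draw and then samples from the corresponding normal distribution. The helper rcontam() is hypothetical and not part of rsae, and the contamination fraction 0.05 is an arbitrary choice for the example.

```r
# Sketch: n draws from (1 - eps) * N(0, v) + eps * N(0, v_contam).
# rcontam() is a hypothetical helper, not a function exported by rsae.
rcontam <- function(n, v, v_contam, eps) {
  stopifnot(eps >= 0, eps <= 1, v >= 0, v_contam >= 0)
  is_outlier <- rbinom(n, size = 1, prob = eps) == 1   # mixture component
  sd_draw <- ifelse(is_outlier, sqrt(v_contam), sqrt(v))
  rnorm(n, mean = 0, sd = sd_draw)
}

set.seed(1024)
eps <- 0.05                                   # illustrative contamination fraction
e <- rcontam(1e5, v = 1, v_contam = 41, eps = eps)

# Variance of the mixture: (1 - eps) * v + eps * v_contam
var_theory <- (1 - eps) * 1 + eps * 41
```

With a small eps, most draws come from the uncontaminated part, but the mixture variance is inflated by the contaminating component.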
Four different contamination setups are possible:
- no contamination (i.e., ve.epsilon = vu.epsilon = 0),
- contaminated model error (i.e., ve.epsilon != 0 and vu.epsilon = 0),
- contaminated random effect (i.e., ve.epsilon = 0 and vu.epsilon != 0),
- both are contaminated (i.e., ve.epsilon != 0 and vu.epsilon != 0).
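The four setups can be written down as argument lists for makedata, as in the sketch below. The argument names come from the Usage section. The value 0.05 is an assumed, purely illustrative contamination fraction, and the names of the list elements are made up for this example. The call is guarded so that the code also runs when rsae is not installed.

```r
# Sketch: argument lists for the four contamination setups.
# The value 0.05 is an arbitrary illustrative contamination fraction.
setups <- list(
  none          = list(ve.epsilon = 0,    vu.epsilon = 0),
  model_error   = list(ve.epsilon = 0.05, vu.epsilon = 0),
  random_effect = list(ve.epsilon = 0,    vu.epsilon = 0.05),
  both          = list(ve.epsilon = 0.05, vu.epsilon = 0.05)
)

# Generate one synthetic data set per setup, only if rsae is installed.
if (requireNamespace("rsae", quietly = TRUE)) {
  models <- lapply(setups, function(args) do.call(rsae::makedata, args))
}
```

All other arguments keep their defaults, so the setups differ only in which part of the model is contaminated.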
Value
An instance of the class saemodel.
References
Schoch, T. (2012). Robust Unit-Level Small Area Estimation: A Fast Algorithm for Large Datasets. Austrian Journal of Statistics 41, 243–265. doi:10.17713/ajs.v41i4.1548
Stahel, W. A. and A. Welsh (1997). Approaches to robust estimation in the simplest variance components model. Journal of Statistical Planning and Inference 57, 295–319. doi:10.1016/S0378-3758(96)00050-X
See Also
Examples
# generate a model with synthetic data
model <- makedata()
model
# summary of the model
summary(model)