sim_DGP {PAGFL} | R Documentation |
Simulate a Panel With a Group Structure in the Slope Coefficients
Description
Construct a static or dynamic, exogenous or endogenous panel data set subject to a group structure in the slope coefficients with optional AR(1) or GARCH(1,1) innovations.
Usage
sim_DGP(
  N = 50,
  n_periods = 40,
  p = 2,
  n_groups = 3,
  group_proportions = NULL,
  error_spec = "iid",
  dynamic = FALSE,
  dyn_panel = lifecycle::deprecated(),
  q = NULL,
  alpha_0 = NULL
)
Arguments
N |
the number of cross-sectional units. Default is 50. |
n_periods |
the number of simulated time periods. Default is 40. |
p |
the number of explanatory variables. Default is 2. |
n_groups |
the number of groups. Default is 3. |
group_proportions |
a numeric vector of length |
error_spec |
options include
Default is |
dynamic |
Logical. If |
dyn_panel |
|
q |
the number of exogenous instruments when a panel with endogenous regressors is to be simulated. If a panel data set with exogenous regressors is to be generated, pass NULL. |
alpha_0 |
a |
Details
The scalar dependent variable y_{it} is generated according to the following grouped panel data model
y_{it} = \gamma_i + \beta_i^\prime x_{it} + u_{it}, \quad i = \{1, \dots, N\}, \quad t = \{1, \dots, T\}.
\gamma_i represents individual fixed effects and x_{it} a p \times 1 vector of regressors.
The individual slope coefficient vectors \beta_i are subject to a group structure
\beta_i = \sum_{k = 1}^K \alpha_k \bold{1} \{i \in G_k\},
with \cup_{k = 1}^K G_k = \{1, \dots, N\}, G_k \cap G_j = \emptyset and \| \alpha_k - \alpha_j \| \neq 0 for any k \neq j, k = 1, \dots, K. The total number of groups K is determined by n_groups.
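The group structure can be illustrated with a minimal base-R sketch. It is not the internal code of sim_DGP(); the coefficient values and the object names (alpha, groups, beta) are chosen here purely for illustration, and group sizes are assumed to be roughly equal.

```r
# Illustrative sketch only; not the internals of sim_DGP().
set.seed(1)
N <- 50   # cross-sectional units
p <- 2    # regressors
K <- 3    # number of groups (n_groups)

# Hypothetical group-level coefficients alpha_k, one row per group
alpha <- matrix(c(0.4, 1, 1.6, 1.6, 1, 0.4), ncol = p)

# Assign each unit to exactly one group, so the G_k partition {1, ..., N}
groups <- sample(rep_len(seq_len(K), N))

# beta_i = alpha_k for i in G_k: row i of beta holds the slope vector of unit i
beta <- alpha[groups, , drop = FALSE]

dim(beta)       # N rows, p columns
table(groups)   # group sizes
```

Indexing the rows of alpha by the membership vector plays the role of the indicator sum in the formula: every unit picks up exactly one \alpha_k.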
If a panel data set with exogenous regressors is generated (set q = NULL), the explanatory variables are simulated according to
x_{it,j} = 0.2 \gamma_i + e_{it,j}, \quad \gamma_i,e_{it,j} \sim i.i.d. N(0, 1), \quad j = \{1, \dots, p\},
where e_{it,j} denotes a series of innovations. \gamma_i and e_i are independent of each other.
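A minimal base-R sketch of this regressor equation follows. It only mirrors the formula above and is not taken from the package source; storing the regressors in an N x T x p array is an assumption made for illustration.

```r
# Illustrative sketch only; not the internals of sim_DGP().
set.seed(2)
N <- 50
n_periods <- 40
p <- 2

gamma <- rnorm(N)   # individual fixed effects, gamma_i ~ N(0, 1)

# Innovations e_{it,j}, indexed as [i, t, j]
e <- array(rnorm(N * n_periods * p), dim = c(N, n_periods, p))

# x_{it,j} = 0.2 * gamma_i + e_{it,j}
# gamma has length N and is recycled along the first (unit) dimension
x <- 0.2 * gamma + e

dim(x)
```

Because the fixed effect enters every regressor of unit i, x_{it} is correlated with \gamma_i, which is why the fixed effects cannot simply be ignored in estimation.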
In case alpha_0 = NULL, the group-level slope parameters \alpha_{k} are drawn from U[-2, 2].
If a dynamic panel is specified (dynamic = TRUE), the AR coefficients \beta^{\text{AR}}_i are drawn from a uniform distribution with support (-1, 1) and x_{it,j} = e_{it,j}.
Moreover, the individual fixed effects enter the dependent variable via (1 - \beta^{\text{AR}}_i) \gamma_i to account for the autoregressive dependency.
We refer to Mehrabani (2023, sec. 6) for details.
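The dynamic case can be sketched in base R as follows. The recursion used here, y_{it} = (1 - \beta^{\text{AR}}_i) \gamma_i + \beta^{\text{AR}}_i y_{i,t-1} + \alpha x_{it} + u_{it}, is an assumed reading of the description above; the single regressor, the slope alpha_x, and the starting value y_{i0} = \gamma_i are hypothetical choices and not taken from the package.

```r
# Illustrative sketch only; the recursion and starting value are assumptions.
set.seed(3)
N <- 50
n_periods <- 40

gamma   <- rnorm(N)                      # individual fixed effects
beta_ar <- runif(N, min = -1, max = 1)   # AR coefficients on (-1, 1)
x <- matrix(rnorm(N * n_periods), N, n_periods)   # x_{it} = e_{it}, one regressor
u <- matrix(rnorm(N * n_periods), N, n_periods)   # i.i.d. innovations
alpha_x <- 1                             # hypothetical slope on x

y <- matrix(NA_real_, N, n_periods)
y_lag <- gamma                           # assumed starting value y_{i0}
for (t in seq_len(n_periods)) {
  # Fixed effect scaled by (1 - beta_ar), plus AR term, regressor, and noise
  y[, t] <- (1 - beta_ar) * gamma + beta_ar * y_lag + alpha_x * x[, t] + u[, t]
  y_lag <- y[, t]
}
```

Under this recursion, scaling the fixed effect by (1 - \beta^{\text{AR}}_i) keeps the stationary mean of y_{it} at \gamma_i when x_{it} and u_{it} have mean zero, since E(y) = (1 - \beta^{\text{AR}}_i) \gamma_i + \beta^{\text{AR}}_i E(y) implies E(y) = \gamma_i.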
When specifying an endogenous panel (set q to q \geq p), the e_{it,j} are correlated with the cross-sectional innovations u_{it} with a magnitude of 0.5 to produce endogenous regressors (\text{E}(u|X) \neq 0). However, the endogenous regressors can be accounted for by exploiting the q instruments in \bold{Z}, for which \text{E}(u|Z) = 0 holds.
The instruments and the first stage coefficients are generated in the same fashion as \bold{X} and \bold{\alpha} when q = NULL.
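The endogeneity mechanism can be sketched in base R under loudly stated assumptions: the "magnitude of 0.5" is read here as a correlation of 0.5 between e and u, the first stage is taken to be linear (X = Z \Pi + e), and the fixed-effect component of the instruments is omitted for brevity. None of this is taken from the package source, and the names Pi, rho and v are hypothetical.

```r
# Illustrative sketch only; the construction below is an assumption,
# not taken from the sim_DGP() source code.
set.seed(4)
n <- 1e5   # pooled observations, large so that sample moments are stable
p <- 2     # endogenous regressors
q <- 3     # instruments, q >= p

u  <- rnorm(n)                             # structural innovations u
Z  <- matrix(rnorm(n * q), n, q)           # instruments, independent of u
Pi <- matrix(runif(q * p, -2, 2), q, p)    # hypothetical first-stage coefficients

# First-stage errors with correlation rho with u (assumed reading of "magnitude")
rho <- 0.5
v <- matrix(rnorm(n * p), n, p)
e <- rho * u + sqrt(1 - rho^2) * v         # u is recycled down each column

X <- Z %*% Pi + e                          # endogenous regressors

round(cor(e, u), 2)   # close to rho in each column
round(cor(Z, u), 2)   # close to zero
```

In this sketch X inherits its correlation with u only through e, while Z remains uncorrelated with u, which is the property an instrumental variable estimator relies on.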
The function nests, among others, the DGPs employed in the simulation study of Mehrabani (2023, sec. 6).
Value
A list holding
alpha |
the |
groups |
a vector indicating the group memberships |
y |
a |
X |
a |
Z |
a |
data |
a |
Author(s)
Paul Haimerl
References
Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.
Examples
# Simulate DGP 1 from Mehrabani (2023, sec. 6)
# alpha_0 holds one row per group and one column per regressor
alpha_0_DGP1 <- matrix(c(0.4, 1, 1.6, 1.6, 1, 0.4), ncol = 2)
DGP1 <- sim_DGP(
  N = 50, n_periods = 20, p = 2, n_groups = 3,
  group_proportions = c(.4, .3, .3), alpha_0 = alpha_0_DGP1
)