sim_tv_DGP {PAGFL}

R Documentation

Simulate a Time-varying Panel With a Group Structure in the Slope Coefficients

Description

Construct a time-varying panel data set subject to a group structure in the slope coefficients with optional AR(1) innovations.

Usage

sim_tv_DGP(
  N = 50,
  n_periods = 40,
  intercept = TRUE,
  p = 1,
  n_groups = 3,
  d = 3,
  dynamic = FALSE,
  group_proportions = NULL,
  error_spec = "iid",
  locations = NULL,
  scales = NULL,
  polynomial_coef = NULL,
  sd_error = 1,
  DGP = lifecycle::deprecated()
)

Arguments

`N`	the number of cross-sectional units. Default is 50.
`n_periods`	the number of simulated time periods `T`. Default is 40.
`intercept`	logical. If `TRUE`, a time-varying intercept is generated. Default is `TRUE`.
`p`	the number of simulated explanatory variables. Default is 1.
`n_groups`	the number of groups `K`. Default is 3.
`d`	the polynomial degree used to construct the time-varying coefficients. Default is 3.
`dynamic`	logical. If `TRUE`, the panel includes one stationary autoregressive lag of `y_{it}` as a regressor. Default is `FALSE`.
`group_proportions`	a numeric vector of length `n_groups` indicating the size of each group as a fraction of `N`. If `NULL`, all groups are of size `N / K`. Default is `NULL`.
`error_spec`	the error specification. Options are `"iid"` for `iid` errors and `"AR"` for an `AR(1)` error process with an autoregressive coefficient of 0.5. Default is `"iid"`.
`locations`	a `p \times K` matrix of location parameters of a logistic distribution function used to construct the time-varying coefficients. If left empty, the location parameters are drawn randomly. Default is `NULL`.
`scales`	a `p \times K` matrix of scale parameters of a logistic distribution function used to construct the time-varying coefficients. If left empty, the scale parameters are drawn randomly. Default is `NULL`.
`polynomial_coef`	a `p \times d \times K` array of coefficients for the polynomials used to construct the time-varying coefficients. If left empty, the polynomial coefficients are drawn randomly. Default is `NULL`.
`sd_error`	standard deviation of the cross-sectional errors. Default is 1.
`DGP`	deprecated. The data generating process. Options are: 1 generates a trend only; 2 simulates a trend and an additional exogenous explanatory variable; 3 draws a dynamic panel data model with one `AR` lag.

Details

The scalar dependent variable y_{it} is generated according to the following time-varying grouped panel data model

y_{it} = \gamma_i + \beta^\prime_{it} x_{it} + u_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,

where \gamma_i is an individual fixed effect and x_{it} is a p \times 1 vector of explanatory variables. The coefficient vector \beta_i = \{\beta_{i1}^\prime, \dots, \beta_{iT}^\prime \}^\prime is subject to the group pattern

\beta_i \left( \frac{t}{T} \right) = \sum_{k = 1}^K \alpha_k \left( \frac{t}{T} \right) \bold{1} \{i \in G_k \},

with \cup_{k = 1}^K G_k = \{1, \dots, N\}, G_k \cap G_j = \emptyset and \sup_{v \in [0,1]} \left( \| \alpha_k(v) - \alpha_j(v) \| \right) \neq 0 for any k \neq j, k = 1, \dots, K. The total number of groups K is determined by n_groups.
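The group pattern can be illustrated with a few lines of base R. This is a minimal sketch and not the package's implementation: the two coefficient paths in `alpha` and the membership vector `groups` are hypothetical values chosen for illustration, and a single scalar coefficient (p = 1) is assumed.

```r
# Illustrative sketch only; not the PAGFL implementation.
N <- 6                           # number of cross-sectional units
n_periods <- 5                   # number of time periods T
v <- (1:n_periods) / n_periods   # rescaled time t/T

# Hypothetical group-specific coefficient paths alpha_k(t/T), one column per group
alpha <- cbind(1 + v, 2 - v^2)   # T x K matrix with K = 2

# Hypothetical group memberships g_i: units 1-3 in group 1, units 4-6 in group 2
groups <- c(1, 1, 1, 2, 2, 2)

# beta_i(t/T) = sum_k alpha_k(t/T) * 1{i in G_k}:
# every unit inherits the coefficient path of its group
beta <- alpha[, groups]          # T x N matrix

# The groups are distinct: the largest gap between the two paths is nonzero
max(abs(alpha[, 1] - alpha[, 2])) > 0
```

Indexing the columns of `alpha` by `groups` is equivalent to evaluating the indicator sum, because each unit belongs to exactly one group.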

The predictors are simulated as:

x_{it,j} = 0.2 \gamma_i + e_{it,j}, \quad \gamma_i,e_{it,j} \sim i.i.d. N(0, 1), \quad j = \{1, \dots, p\},

where e_{it,j} denotes a series of innovations. \gamma_i and e_i are independent of each other.
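The predictor equation can be reproduced in base R as follows. This is a sketch under stated assumptions: the stacking order (all periods of unit 1, then unit 2, and so on) follows the layout described for `X` in the Value section, and the variable names are hypothetical rather than taken from the package source.

```r
# Illustrative sketch only; not the PAGFL implementation.
set.seed(1)
N <- 4           # cross-sectional units
n_periods <- 3   # time periods T
p <- 2           # explanatory variables

# Individual fixed effects gamma_i ~ N(0, 1)
gamma <- rnorm(N)

# Repeat each gamma_i over its T periods so it lines up with the stacked NT rows
gamma_long <- rep(gamma, each = n_periods)

# Innovations e_{it,j} ~ N(0, 1), drawn independently of gamma
e <- matrix(rnorm(N * n_periods * p), nrow = N * n_periods, ncol = p)

# x_{it,j} = 0.2 * gamma_i + e_{it,j}; the length-NT vector is recycled down each column
X <- 0.2 * gamma_long + e
dim(X)
```

Because the fixed effect enters every regressor, the predictors are correlated with \gamma_i by construction.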

Under the default settings (error_spec = "iid" and sd_error = 1), the errors u_{it} are i.i.d. standard normal.
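The error_spec argument also allows an AR(1) error process with an autoregressive coefficient of 0.5. The sketch below shows one way such a series could be generated in base R. It assumes a zero starting value, no burn-in period, and that sd_error scales the innovations; none of these details are confirmed by this page, and the package may proceed differently.

```r
# Illustrative sketch only; initialization and scaling are assumptions.
set.seed(123)
n_periods <- 200
sd_error <- 1
rho <- 0.5       # autoregressive coefficient stated for error_spec = "AR"

# Gaussian innovations with standard deviation sd_error
innov <- rnorm(n_periods, sd = sd_error)

# Recursive filter: u_t = rho * u_{t-1} + innov_t, started at u_0 = 0
u <- as.numeric(stats::filter(innov, filter = rho, method = "recursive"))

# Sample first-order autocorrelation of the simulated series
acf(u, lag.max = 1, plot = FALSE)$acf[2]
```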

If locations = NULL, the location parameters are drawn from U[0.3, 0.9]. If scales = NULL, the scale parameters are drawn from U[0.01, 0.09]. If polynomial_coef = NULL, the polynomial coefficients are drawn from U[-20, 20] and normalized so that the coefficients of each polynomial sum to 1. The coefficient function is then \alpha_k (t/T) = 3 \cdot F(t/T, location, scale) + \sum_{j=1}^d a_j (t/T)^j, where F(\cdot, location, scale) denotes a cumulative logistic distribution function and a_j is a polynomial coefficient.
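The construction of one coefficient function can be sketched in base R, using `plogis()` for the cumulative logistic distribution function. This is not the package's code: the normalization is assumed to be a division by the sum of the drawn coefficients, which is one natural reading of the description, and all variable names are hypothetical.

```r
# Illustrative sketch only; the normalization step is an assumption.
set.seed(42)
n_periods <- 40
d <- 3
v <- (1:n_periods) / n_periods   # rescaled time t/T

# Random draws mirroring the documented defaults
location <- runif(1, 0.3, 0.9)
scale <- runif(1, 0.01, 0.09)
a <- runif(d, -20, 20)
a <- a / sum(a)                  # assumed normalization: coefficients sum to 1

# Polynomial part: sum_{j = 1}^d a_j * (t/T)^j
poly_part <- as.vector(outer(v, 1:d, "^") %*% a)

# alpha_k(t/T) = 3 * F(t/T, location, scale) + polynomial part
alpha_k <- 3 * plogis(v, location = location, scale = scale) + poly_part
```

With this normalization the polynomial part equals 1 at t = T, so the logistic term governs where the smooth shift in the coefficient path occurs while the polynomial adds curvature around it.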

Value

A list holding

`alpha`	a `T \times p \times K` array of group-specific time-varying parameters
`beta`	a `T \times p \times N` array of individual time-varying parameters
`groups`	a vector indicating the group memberships `(g_1, \dots, g_N)`, where `g_i = k` if `i \in` group `k`.
`y`	an `NT \times 1` vector of the dependent variable, with `\bold{y}=(y_1, \dots, y_N)^\prime`, `y_i = (y_{i1}, \dots, y_{iT})^\prime` and the scalar `y_{it}`.
`X`	an `NT \times p` matrix of explanatory variables, with `\bold{X}=(x_1, \dots, x_N)^\prime`, `x_i = (x_{i1}, \dots, x_{iT})^\prime` and the `p \times 1` vector `x_{it}`.
`data`	an `NT \times (p + 1)` data.frame of the outcome and the explanatory variables.

Author(s)

Paul Haimerl

Examples

# Simulate a time-varying panel subject to a time trend and a group structure
sim <- sim_tv_DGP(N = 20, n_periods = 50, intercept = TRUE, p = 1)
y <- sim$y
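
A further call using unequal group sizes and AR(1) errors might look as follows. It uses only arguments and list elements documented on this page; the proportions are arbitrary illustration values, and the `requireNamespace()` guard is added so the snippet is skipped when PAGFL is not installed.

```r
# Sketch of a call with unequal group sizes and AR(1) errors
if (requireNamespace("PAGFL", quietly = TRUE)) {
  set.seed(1)
  sim_ar <- PAGFL::sim_tv_DGP(
    N = 20, n_periods = 50, intercept = TRUE, p = 1,
    n_groups = 3,
    group_proportions = c(0.5, 0.3, 0.2),  # arbitrary fractions of N, one per group
    error_spec = "AR"
  )
  print(table(sim_ar$groups))  # realized group sizes
  str(sim_ar$data)             # stacked outcome and explanatory variables
}
```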

[Package PAGFL version 1.1.1 Index]