grouped_plm {PAGFL} | R Documentation |
Grouped Panel Data Model
Description
Estimate a grouped panel data model given an observed group structure. Slope parameters are homogeneous within groups but heterogeneous across groups. This function supports both static and dynamic panel data models, with or without endogenous regressors.
Usage
grouped_plm(
  formula,
  data,
  groups,
  index = NULL,
  n_periods = NULL,
  method = "PLS",
  Z = NULL,
  bias_correc = FALSE,
  rho = 0.07 * log(N * n_periods)/sqrt(N * n_periods),
  verbose = TRUE,
  parallel = TRUE,
  ...
)
## S3 method for class 'gplm'
print(x, ...)
## S3 method for class 'gplm'
formula(x, ...)
## S3 method for class 'gplm'
df.residual(object, ...)
## S3 method for class 'gplm'
summary(object, ...)
## S3 method for class 'gplm'
coef(object, ...)
## S3 method for class 'gplm'
residuals(object, ...)
## S3 method for class 'gplm'
fitted(object, ...)
Arguments
formula |
a formula object describing the model to be estimated. |
data |
a |
groups |
a numerical or character vector of length |
index |
a character vector holding two strings. The first string denotes the name of the index variable identifying the cross-sectional unit |
n_periods |
the number of observed time periods |
method |
the estimation method. Options are
Default is |
Z |
a |
bias_correc |
logical. If |
rho |
a tuning parameter balancing the fitness and penalty terms in the IC. If left unspecified, the heuristic 0.07 * log(N * n_periods)/sqrt(N * n_periods) given in Usage is applied. |
verbose |
logical. If |
parallel |
logical. If |
... |
ellipsis |
x |
of class |
object |
of class |
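As a rough illustration of the default rho given in Usage, the heuristic can be evaluated by hand. The sketch below assumes that N denotes the number of cross-sectional units and uses the panel dimensions of the Examples section (N = 20, n_periods = 80); it does not call the package.

```r
# Illustrative evaluation of the default rho heuristic from Usage.
# Assumption: N is the number of cross-sectional units.
N <- 20
n_periods <- 80
rho <- 0.07 * log(N * n_periods) / sqrt(N * n_periods)
round(rho, 4)  # 0.0129
```

The value shrinks as N * n_periods grows, since the square root in the denominator outpaces the logarithm in the numerator.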
Details
Consider the grouped panel data model
y_{it} = \gamma_i + \beta^\prime_{i} x_{it} + \epsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,
where y_{it} is the scalar dependent variable, \gamma_i is an individual fixed effect, x_{it} is a p \times 1 vector of explanatory variables, and \epsilon_{it} is a zero mean error.
The coefficient vector \beta_i is subject to the observed group pattern
\beta_i = \sum_{k = 1}^K \alpha_k \mathbf{1} \{i \in G_k \},
with \cup_{k = 1}^K G_k = \{1, \dots, N\}, G_k \cap G_j = \emptyset and \| \alpha_k - \alpha_j \| \neq 0 for any k \neq j, k = 1, \dots, K.
Using PLS, the group-specific coefficients for group k are obtained via OLS
\hat{\alpha}_k = \left( \sum_{i \in G_k} \sum_{t = 1}^T \tilde{x}_{it} \tilde{x}_{it}^\prime \right)^{-1} \sum_{i \in G_k} \sum_{t = 1}^T \tilde{x}_{it} \tilde{y}_{it},
where \tilde{a}_{it} = a_{it} - T^{-1} \sum_{t=1}^T a_{it}, a = \{y, x\}, to concentrate out the individual fixed effects \gamma_i (within-transformation).
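The PLS formula can be traced in a few lines of base R. The following is a minimal sketch on simulated data, not the package's internal implementation; all object names (alpha, i_index, demean, alpha_hat) and the simulation settings are illustrative assumptions.

```r
# Sketch: within-transformation followed by group-wise OLS (base R only).
# All names and simulation settings are illustrative assumptions.
set.seed(1)
N <- 6; n_periods <- 50; p <- 2
groups <- rep(1:2, each = 3)                  # observed group of each unit
alpha <- rbind(c(1, -1), c(-2, 0.5))          # K x p group-specific slopes
i_index <- rep(seq_len(N), each = n_periods)  # unit identifier of each row
gamma <- rnorm(N)                             # individual fixed effects
X <- matrix(rnorm(N * n_periods * p), ncol = p)
y <- gamma[i_index] + rowSums(X * alpha[groups[i_index], ]) +
  rnorm(N * n_periods, sd = 0.1)

# Within-transformation: subtract the unit-specific time average
demean <- function(a) a - ave(a, i_index)
X_tilde <- apply(X, 2, demean)
y_tilde <- demean(y)

# OLS on the pooled demeaned observations of each group
alpha_hat <- t(sapply(sort(unique(groups)), function(k) {
  rows <- groups[i_index] == k
  solve(crossprod(X_tilde[rows, ]), crossprod(X_tilde[rows, ], y_tilde[rows]))
}))
alpha_hat  # K x p matrix, one row of slope estimates per group
```

Each row of alpha_hat should lie close to the corresponding row of alpha, since the fixed effects drop out after demeaning.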
In case of PGMM, the slope coefficients are derived as
\hat{\alpha}_k = \left( \left[ \sum_{i \in G_k} T^{-1} \sum_{t = 1}^T z_{it} \Delta x_{it}^\prime \right]^\prime W_k \left[ \sum_{i \in G_k} T^{-1} \sum_{t = 1}^T z_{it} \Delta x_{it}^\prime \right] \right)^{-1} \left[ \sum_{i \in G_k} T^{-1} \sum_{t = 1}^T z_{it} \Delta x_{it}^\prime \right]^\prime W_k \left[ \sum_{i \in G_k} T^{-1} \sum_{t = 1}^T z_{it} \Delta y_{it} \right],
where W_k is a q \times q positive definite symmetric weight matrix and \Delta denotes the first difference operator \Delta x_{it} = x_{it} - x_{it-1} (first-difference transformation).
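A base-R sketch of the PGMM expression may help to see how the pieces fit together. It is not the package's implementation and rests on several assumptions made purely for illustration: the regressors are exogenous, the instruments stack the differenced and the lagged regressors (so q = 2p), and W_k is the identity matrix. All object names are hypothetical.

```r
# Sketch: first-difference transformation followed by group-wise GMM.
# Assumptions (illustrative only): exogenous regressors, instruments
# z_it = (Delta x_it, x_{it-1}), identity weight matrix W_k.
set.seed(2)
N <- 6; n_periods <- 60; p <- 2
groups <- rep(1:2, each = 3)
alpha <- rbind(c(1, -1), c(-2, 0.5))
i_index <- rep(seq_len(N), each = n_periods)
gamma <- rnorm(N)
X <- matrix(rnorm(N * n_periods * p), ncol = p)
y <- gamma[i_index] + rowSums(X * alpha[groups[i_index], ]) +
  rnorm(N * n_periods, sd = 0.1)

# First differences; the first period of every unit is dropped
first_row <- !duplicated(i_index)
X_lag <- rbind(NA, X[-nrow(X), ])
dX <- (X - X_lag)[!first_row, ]
dy <- (y - c(NA, y[-length(y)]))[!first_row]
Z <- cbind(dX, X_lag[!first_row, ])           # instruments, q = 2p columns
g_index <- groups[i_index][!first_row]        # group of each differenced row

alpha_hat <- t(sapply(sort(unique(groups)), function(k) {
  rows <- g_index == k
  A <- crossprod(Z[rows, ], dX[rows, ]) / n_periods  # q x p moment matrix
  b <- crossprod(Z[rows, ], dy[rows]) / n_periods    # q x 1 moment vector
  W <- diag(ncol(Z))                                 # identity weights
  solve(t(A) %*% W %*% A, t(A) %*% W %*% b)
}))
alpha_hat  # K x p matrix, one row of slope estimates per group
```

The scaling by the number of periods cancels in the final estimate; it is kept only to mirror the formula.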
Value
An object of class gplm holding
model |
a |
coefficients |
a |
groups |
a |
residuals |
a vector of residuals of the demeaned model, |
fitted |
a vector of fitted values of the demeaned model, |
args |
a |
IC |
a |
call |
the function call. |
A gplm object has print, summary, fitted, residuals, formula, df.residual, and coef S3 methods.
Author(s)
Paul Haimerl
References
Dhaene, G., & Jochmans, K. (2015). Split-panel jackknife estimation of fixed-effect models. The Review of Economic Studies, 82(3), 991-1030. doi:10.1093/restud/rdv007.
Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.
Examples
# Simulate a panel with a group structure
sim <- sim_DGP(N = 20, n_periods = 80, p = 2, n_groups = 3)
y <- sim$y
X <- sim$X
groups <- sim$groups
df <- cbind(y = c(y), X)
# Estimate the grouped panel data model
estim <- grouped_plm(y ~ ., data = df, groups = groups, n_periods = 80, method = "PLS")
summary(estim)
# Let's pass a panel data set with explicit cross-sectional and time indicators
i_index <- rep(1:20, each = 80)
t_index <- rep(1:80, 20)
df <- data.frame(y = c(y), X, i_index = i_index, t_index = t_index)
estim <- grouped_plm(
  y ~ .,
  data = df, index = c("i_index", "t_index"), groups = groups, method = "PLS"
)
summary(estim)