pagfl {PAGFL} | R Documentation |
Pairwise Adaptive Group Fused Lasso
Description
Estimate panel data models with a latent group structure using the pairwise adaptive group fused Lasso (PAGFL) by Mehrabani (2023). The PAGFL jointly identifies the group structure and group-specific slope parameters. The function supports both static and dynamic panels, with or without endogenous regressors.
Usage
pagfl(
formula,
data,
index = NULL,
n_periods = NULL,
lambda,
method = "PLS",
Z = NULL,
min_group_frac = 0.05,
bias_correc = FALSE,
kappa = 2,
max_iter = 5000,
tol_convergence = 1e-08,
tol_group = 0.001,
rho = 0.07 * log(N * n_periods)/sqrt(N * n_periods),
varrho = max(sqrt(5 * N * n_periods * p)/log(N * n_periods * p) - 7, 1),
verbose = TRUE,
parallel = TRUE,
...
)
## S3 method for class 'pagfl'
print(x, ...)
## S3 method for class 'pagfl'
formula(x, ...)
## S3 method for class 'pagfl'
df.residual(object, ...)
## S3 method for class 'pagfl'
summary(object, ...)
## S3 method for class 'pagfl'
coef(object, ...)
## S3 method for class 'pagfl'
residuals(object, ...)
## S3 method for class 'pagfl'
fitted(object, ...)
Arguments
formula |
a formula object describing the model to be estimated. |
data |
a data.frame or matrix holding the panel data set. |
index |
a character vector holding two strings. The first string denotes the name of the index variable identifying the cross-sectional unit, and the second string the name of the variable identifying the time period. Default is NULL. |
n_periods |
the number of observed time periods. Can be left as NULL when index is supplied. Default is NULL. |
lambda |
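the tuning parameter determining the strength of the penalty term. Either a single value or a vector of candidate values. If a vector is passed, a BIC-type information criterion selects the best-fitting value. |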
method |
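the estimation method. Options are "PLS" (penalized least squares) and "PGMM" (penalized GMM, which uses the instruments in Z to control for endogenous regressors).
Default is "PLS". |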
Z |
a matrix of instruments used by the "PGMM" method. Default is NULL. |
min_group_frac |
the minimum group cardinality as a fraction of the total number of individuals N. Default is 0.05. |
bias_correc |
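logical. If TRUE, a split-panel jackknife bias correction following Dhaene and Jochmans (2015) is applied to the slope parameters. Default is FALSE. |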
kappa |
a non-negative weight used to obtain the adaptive penalty weights. Default is 2. |
max_iter |
the maximum number of iterations for the ADMM estimation algorithm. Default is 5000. |
tol_convergence |
the tolerance limit for the stopping criterion of the iterative ADMM estimation algorithm. Default is 1e-08. |
tol_group |
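the tolerance limit for within-group differences. Two individuals are assigned to the same group if the norm of the difference between their estimated coefficient vectors does not exceed this threshold. Default is 0.001. |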
rho |
the tuning parameter balancing the fitness and penalty terms in the IC that determines the penalty parameter lambda. Default is 0.07 * log(N * n_periods)/sqrt(N * n_periods). |
varrho |
the non-negative Lagrangian ADMM penalty parameter. Default is max(sqrt(5 * N * n_periods * p)/log(N * n_periods * p) - 7, 1). |
verbose |
logical. If TRUE, informative messages are shown during estimation. Default is TRUE. |
parallel |
logical. If TRUE, certain operations are parallelized across multiple cores. Default is TRUE. |
... |
ellipsis |
x |
an object of class pagfl. |
object |
an object of class pagfl. |
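The default expressions for rho and varrho shown under Usage depend on the panel dimensions. The following base R sketch evaluates them for hypothetical values of N, n_periods, and p (these numbers are illustrative assumptions and do not come from the package).

```r
# Hypothetical panel dimensions, chosen only for illustration
N <- 20          # number of cross-sectional units
n_periods <- 80  # number of time periods
p <- 2           # number of regressors

# Default expressions copied from the Usage section
rho_default <- 0.07 * log(N * n_periods) / sqrt(N * n_periods)
varrho_default <- max(sqrt(5 * N * n_periods * p) / log(N * n_periods * p) - 7, 1)

print(c(rho = rho_default, varrho = varrho_default))
```

Note that the varrho expression is floored at 1, so small panels receive a penalty parameter of exactly 1.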
Details
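Consider the grouped panel data model
$$y_{it} = \gamma_i + \beta_i^\prime x_{it} + \epsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,$$
where $y_{it}$ is the scalar dependent variable, $\gamma_i$ is an individual fixed effect, $x_{it}$ is a $p \times 1$ vector of weakly exogenous explanatory variables, and $\epsilon_{it}$ is a zero mean error. The coefficient vector $\beta_i$ is subject to the latent group pattern
$$\beta_i = \sum_{k=1}^{K} \alpha_k 1\{i \in G_k\},$$
with $\cup_{k=1}^{K} G_k = \{1, \dots, N\}$, $G_k \cap G_j = \emptyset$, and $\| \alpha_k - \alpha_j \| \neq 0$ for any $k \neq j$, $k, j = 1, \dots, K$.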
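The PLS method jointly estimates the latent group structure and group-specific coefficients by minimizing the criterion
$$Q_{NT}(\beta, \lambda) = \frac{1}{T} \sum_{i=1}^{N} \sum_{t=1}^{T} (\tilde{y}_{it} - \beta_i^\prime \tilde{x}_{it})^2 + \frac{\lambda}{N} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \dot{\omega}_{ij} \| \beta_i - \beta_j \|$$
with respect to $\beta = (\beta_1^\prime, \dots, \beta_N^\prime)^\prime$. The variables are demeaned, $\tilde{a}_{it} = a_{it} - T^{-1} \sum_{t=1}^{T} a_{it}$, $a \in \{y, x\}$, to concentrate out the individual fixed effects $\gamma_i$. $\lambda$ is the penalty tuning parameter and $\dot{\omega}_{ij}$ reflects adaptive penalty weights (see Mehrabani, 2023, eq. 2.6). $\| \cdot \|$ denotes the Frobenius norm. The adaptive weights $\dot{\omega}_{ij}$ are obtained by a preliminary individual least squares estimation. The criterion function is minimized via an iterative alternating direction method of multipliers (ADMM) algorithm (see Mehrabani, 2023, sec. 5.1).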
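The two preparatory steps described above, demeaning and the preliminary individual least squares fit, can be sketched in base R as follows. This is an illustration on simulated toy data and not the package's internal code. The weight formula, the inverse of the norm of the coefficient difference raised to the power kappa, is an assumption about the form of the adaptive weights; consult Mehrabani (2023, eq. 2.6) for the exact definition.

```r
set.seed(1)
N <- 4; n_periods <- 30; p <- 2   # hypothetical toy dimensions
kappa <- 2                         # same default as in Usage

# Toy panel in long format, stacked by individual
id <- rep(seq_len(N), each = n_periods)
X <- matrix(rnorm(N * n_periods * p), ncol = p)
beta_true <- rbind(c(1, -1), c(1, -1), c(-2, 0.5), c(-2, 0.5))  # two groups
gamma <- rnorm(N)                                                # fixed effects
y <- gamma[id] + rowSums(X * beta_true[id, ]) + rnorm(N * n_periods, sd = 0.1)

# Within transformation: subtract the individual time averages
demean <- function(a, id) a - ave(a, id)
y_tilde <- demean(y, id)
X_tilde <- apply(X, 2, demean, id = id)

# Preliminary individual least squares estimates (one row per individual)
beta_dot <- t(sapply(seq_len(N), function(i) {
  rows <- id == i
  qr.solve(X_tilde[rows, , drop = FALSE], y_tilde[rows])
}))

# Assumed adaptive weights: omega_ij = ||beta_dot_i - beta_dot_j||^(-kappa)
omega <- matrix(NA_real_, N, N)
for (i in seq_len(N - 1)) {
  for (j in (i + 1):N) {
    omega[i, j] <- sqrt(sum((beta_dot[i, ] - beta_dot[j, ])^2))^(-kappa)
  }
}
print(round(omega, 2))
```

Pairs with similar preliminary estimates receive large weights, so the penalty pulls their coefficients together more strongly, while pairs from different groups are penalized only lightly.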
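PGMM employs a set of instruments $Z$ to control for endogenous regressors. Using PGMM, $\beta$ is estimated by minimizing
$$Q_{NT}(\beta, \lambda) = \sum_{i=1}^{N} \left[ \frac{1}{T} \sum_{t=1}^{T} z_{it} (\Delta y_{it} - \beta_i^\prime \Delta x_{it}) \right]^\prime W_i \left[ \frac{1}{T} \sum_{t=1}^{T} z_{it} (\Delta y_{it} - \beta_i^\prime \Delta x_{it}) \right] + \frac{\lambda}{N} \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \ddot{\omega}_{ij} \| \beta_i - \beta_j \|,$$
where $z_{it}$ is the $q \times 1$ vector of instruments. The adaptive weights $\ddot{\omega}_{ij}$ are obtained by an initial GMM estimation. $\Delta$ gives the first differences operator, $\Delta y_{it} = y_{it} - y_{i,t-1}$. $W_i$ represents a data-driven $q \times q$ weight matrix; see Mehrabani (2023, eq. 2.10) for more details. Again, the criterion function is minimized using an efficient ADMM algorithm (Mehrabani, 2023, sec. 5.2).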
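The first differences operator used by PGMM can be illustrated on a small stacked panel. The sketch below uses made-up numbers and base R only; it does not reproduce how the package transforms the data internally.

```r
# Hypothetical toy series for two individuals, stacked by individual
id <- rep(1:2, each = 4)
y <- c(1, 3, 6, 10, 2, 2, 5, 4)

# Delta y_it = y_it - y_{i,t-1}; the first period of each individual is lost
delta_y <- unlist(tapply(y, id, diff), use.names = FALSE)
delta_id <- id[duplicated(id)]  # drops the first observation per individual

print(data.frame(id = delta_id, delta_y = delta_y))
```

Differencing within each individual removes the fixed effect and shortens every individual series by one observation.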
Two individuals $i$ and $j$ are assigned to the same group if $\| \hat{\beta}_i - \hat{\beta}_j \| \leq \epsilon_{tol}$, where $\epsilon_{tol}$ is determined by tol_group. Subsequently, the number of groups follows as the number of distinct elements in $\hat{\beta}$. Given an estimated group structure, it is straightforward to obtain post-Lasso estimates using group-wise least squares or GMM (see grouped_plm).
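The group assignment rule can be sketched as follows. The coefficient matrix beta_hat is hypothetical. The simple sequential matching loop and the use of the Euclidean norm with a less-than-or-equal comparison are assumptions made for illustration, not the package's implementation.

```r
# Hypothetical estimated coefficients, one row per individual
beta_hat <- rbind(
  c(1.0000, -1.0000),
  c(1.0002, -0.9999),
  c(-2.0000, 0.5000),
  c(-2.0001, 0.5003)
)
tol_group <- 0.001  # same default as in Usage

# Pairwise Euclidean distances between the coefficient vectors
d <- as.matrix(dist(beta_hat))

# Assign each individual to the group of the first earlier individual
# whose coefficients lie within the tolerance; otherwise open a new group
N <- nrow(beta_hat)
groups <- integer(N)
n_groups <- 0L
for (i in seq_len(N)) {
  match_idx <- which(d[i, seq_len(i - 1)] <= tol_group)
  if (length(match_idx) > 0) {
    groups[i] <- groups[match_idx[1]]
  } else {
    n_groups <- n_groups + 1L
    groups[i] <- n_groups
  }
}
print(groups)
```

The number of groups then equals the number of distinct labels in groups.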
We recommend identifying a suitable $\lambda$ parameter by passing a logarithmically spaced grid of candidate values with a lower limit close to 0 and an upper limit that leads to a fully homogeneous panel. A BIC-type information criterion then selects the best-fitting $\lambda$ value.
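A logarithmically spaced grid of candidate values can be built in base R as sketched below. The limits lambda_min and lambda_max and the grid length are hypothetical; a suitable upper limit depends on the data and has to be large enough to yield a fully homogeneous panel.

```r
# Hypothetical limits; suitable values depend on the data at hand
lambda_min <- 1e-3
lambda_max <- 10
n_lambda <- 10

# Logarithmically spaced grid: equal steps on the log scale
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max), length.out = n_lambda))
print(signif(lambda_grid, 3))
```

The resulting vector can then be supplied through the lambda argument in place of a single value.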
Value
An object of class pagfl holding
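model |
a data.frame containing the dependent and explanatory variables as well as the cross-sectional and time indices, |
coefficients |
a matrix of the post-Lasso group-specific parameter estimates, |
groups |
a list containing (i) the total number of groups and (ii) a vector of estimated group memberships, |
residuals |
a vector of residuals of the demeaned model, |
fitted |
a vector of fitted values of the demeaned model, |
args |
a list of additional arguments, |
IC |
a list containing (i) the value of the IC, (ii) the employed tuning parameter lambda, and (iii) the mean squared error, |
convergence |
a list containing (i) a logical variable indicating if convergence was achieved and (ii) the number of executed ADMM algorithm iterations, |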
call |
the function call. |
A pagfl object has print, summary, fitted, residuals, formula, df.residual, and coef S3 methods.
Author(s)
Paul Haimerl
References
Dhaene, G., & Jochmans, K. (2015). Split-panel jackknife estimation of fixed-effect models. The Review of Economic Studies, 82(3), 991-1030. doi:10.1093/restud/rdv007.
Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.
Examples
# Simulate a panel with a group structure
sim <- sim_DGP(N = 20, n_periods = 80, p = 2, n_groups = 3)
y <- sim$y
X <- sim$X
df <- cbind(y = c(y), X)
# Run the PAGFL procedure
estim <- pagfl(y ~ ., data = df, n_periods = 80, lambda = 0.5, method = "PLS")
summary(estim)
# Pass a panel data set with explicit cross-sectional and time indicators
i_index <- rep(1:20, each = 80)
t_index <- rep(1:80, 20)
df <- data.frame(y = c(y), X, i_index = i_index, t_index = t_index)
estim <- pagfl(
  y ~ .,
  data = df, index = c("i_index", "t_index"), lambda = 0.5, method = "PLS"
)
summary(estim)