PAGFL {PAGFL}		R Documentation

Apply the Pairwise Adaptive Group Fused Lasso

Description

The pairwise adaptive group fused lasso (PAGFL) by Mehrabani (2023) jointly estimates the latent group structure and group-specific slope parameters in a panel data model. It can handle static and dynamic panels, either with or without endogenous regressors.

Usage

PAGFL(
  y,
  X,
  n_periods,
  lambda,
  method = "PLS",
  Z = NULL,
  min_group_frac = 0.05,
  bias_correc = FALSE,
  kappa = 2,
  max_iter = 2000,
  tol_convergence = 0.001,
  tol_group = sqrt(p/(sqrt(N * n_periods) * log(log(N * n_periods)))),
  rho = 0.07 * log(N * n_periods)/sqrt(N * n_periods),
  varrho = max(sqrt(5 * N * n_periods * p)/log(N * n_periods * p) - 7, 1),
  verbose = TRUE
)

Arguments

y

an NT \times 1 vector or data.frame of the dependent variable, with \bold{y}=(y_1, \dots, y_N)^\prime, y_i = (y_{i1}, \dots, y_{iT})^\prime and the scalar y_{it}.

X

an NT \times p matrix or data.frame of explanatory variables, with \bold{X}=(x_1, \dots, x_N)^\prime, x_i = (x_{i1}, \dots, x_{iT})^\prime and the p \times 1 vector x_{it}.

n_periods

the number of observed time periods T.

lambda

the tuning parameter governing the strength of the penalty term. Either a single \lambda or a vector of candidate values can be passed. If a vector is supplied, a BIC-type information criterion selects the best fitting parameter value.

method

the estimation method. Options are

'PLS'

for using the penalized least squares (PLS) algorithm. We recommend PLS in case of (weakly) exogenous regressors (Mehrabani, 2023, sec. 2.2).

'PGMM'

for using the penalized Generalized Method of Moments (PGMM). PGMM is required when instrumenting endogenous regressors (Mehrabani, 2023, sec. 2.3). A matrix Z contains the necessary exogenous instruments.

Default is 'PLS'.

Z

an NT \times q matrix of exogenous instruments, where q \geq p, \bold{Z}=(z_1, \dots, z_N)^\prime, z_i = (z_{i1}, \dots, z_{iT})^\prime and z_{it} is a q \times 1 vector. Z is only required when method = 'PGMM' is selected; when method = 'PLS' is used, any matrix passed as \bold{Z} is disregarded. Default is NULL.

min_group_frac

the minimum group size as a fraction of the total number of individuals N. In case a group falls short of this threshold, a hierarchical classifier allocates its members to the remaining groups. Default is 0.05.

bias_correc

logical. If TRUE, a Split-panel Jackknife bias correction following Dhaene and Jochmans (2015) is applied to the slope parameters. We recommend using this correction when facing a dynamic panel. Default is FALSE.

kappa

the weight placed on the adaptive penalty weights. Default is 2.

max_iter

the maximum number of iterations for the ADMM estimation algorithm. Default is 2000.

tol_convergence

the tolerance limit for the stopping criterion of the iterative ADMM estimation algorithm. Default is 0.001.

tol_group

the tolerance limit for within-group differences. Two individuals are placed in the same group if the Frobenius norm of their coefficient parameter difference is below this parameter. If left unspecified, the heuristic \sqrt{\frac{p}{\sqrt{NT} \log(\log(NT))}} is used. We recommend the default.

rho

the tuning parameter balancing the fit and penalty terms in the information criterion that determines the penalty parameter \lambda. If left unspecified, the heuristic \rho = 0.07 \frac{\sqrt{NT} \log(NT)}{NT} of Mehrabani (2023, sec. 6) is used. We recommend the default.

varrho

the non-negative Lagrangian ADMM penalty parameter. For PLS, the choice of \varrho is inconsequential. For PGMM, however, small values lead to slow convergence of the algorithm. If left unspecified, the default heuristic \varrho = \max(\frac{\sqrt{5NTp}}{\log(NTp)}-7, 1) is used.

verbose

logical. If TRUE, a progress bar is printed when iterating over candidate \lambda values and helpful warning messages are shown. Default is TRUE.

Details

The PLS method minimizes the following criterion:

\frac{1}{T} \sum^N_{i=1} \sum^{T}_{t=1}(\tilde{y}_{it} - \beta^\prime_i \tilde{x}_{it})^2 + \frac{\lambda}{N} \sum_{1 \leq i < j \leq N} \dot{w}_{ij} \| \beta_i - \beta_j \|,

where \tilde{y}_{it} is the de-meaned dependent variable, \tilde{x}_{it} represents a vector of de-meaned weakly exogenous explanatory variables, \lambda is the penalty tuning parameter and \dot{w}_{ij} reflects adaptive penalty weights (see Mehrabani, 2023, eq. 2.6). \| \cdot \| denotes the Frobenius norm. The adaptive weights \dot{w}_{ij} are obtained by a preliminary least squares estimation. The solution \hat{\beta} is computed via an iterative alternating direction method of multipliers (ADMM) algorithm (see Mehrabani, 2023, sec. 5.1).
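To make the notation concrete, the following is a minimal sketch of the within (de-meaning) transformation that produces \tilde{y}_{it} and \tilde{x}_{it}. The helper name demean_by_unit and the assumption that observations are stacked unit by unit are illustrative only and not part of the package.

# Illustrative sketch: remove unit-specific means from a stacked panel.
# Assumes y is an NT x 1 vector and X an NT x p matrix, stacked unit by unit.
demean_by_unit <- function(y, X, n_periods) {
  N <- length(y) / n_periods
  id <- rep(seq_len(N), each = n_periods)
  y_tilde <- y - ave(y, id)                                  # y_it minus its time average
  X_tilde <- apply(X, 2, function(col) col - ave(col, id))   # column-wise de-meaning
  list(y_tilde = y_tilde, X_tilde = X_tilde)
}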

PGMM employs a set of instruments Z to control for endogenous regressors. Using PGMM, \bold{\beta} = (\beta_1^\prime, \dots, \beta_N^\prime)^\prime is estimated by minimizing:

\sum^N_{i = 1} \left[ \frac{1}{T} \sum_{t=1}^T z_{it} (\Delta y_{it} - \beta^\prime_i \Delta x_{it}) \right]^\prime W_i \left[ \frac{1}{T} \sum_{t=1}^T z_{it}(\Delta y_{it} - \beta^\prime_i \Delta x_{it}) \right] + \frac{\lambda}{N} \sum_{1 \leq i < j \leq N} \ddot{w}_{ij} \| \beta_i - \beta_j \|.

\ddot{w}_{ij} are obtained by an initial GMM estimation. \Delta denotes the first-difference operator, \Delta y_{it} = y_{it} - y_{i t-1}. W_i represents a data-driven q \times q weight matrix. We refer to Mehrabani (2023, eq. 2.10) for more details. \bold{\beta} is again estimated employing an efficient ADMM algorithm (Mehrabani, 2023, sec. 5.2).
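For intuition, here is a minimal sketch of the first-difference transformation entering the PGMM moment conditions; first_diff_by_unit is a hypothetical helper, not a package function.

# Illustrative sketch: Delta y_it = y_it - y_i,t-1, computed unit by unit.
# The first observation of each unit is lost by differencing.
first_diff_by_unit <- function(y, n_periods) {
  N <- length(y) / n_periods
  id <- rep(seq_len(N), each = n_periods)
  unlist(lapply(split(y, id), diff), use.names = FALSE)
}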

Two individuals are assigned to the same group if \| \hat{\beta}_i - \hat{\beta}_j \| \leq \epsilon_{\text{tol}}, where \epsilon_{\text{tol}} is given by tol_group.
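The sketch below illustrates this classification rule for a hypothetical N \times p matrix beta_hat of individual coefficient estimates; the package additionally enforces the min_group_frac constraint, which is omitted here.

# Illustrative sketch: units i and j share a group if ||beta_i - beta_j|| <= tol_group.
group_by_threshold <- function(beta_hat, tol_group) {
  N <- nrow(beta_hat)
  groups <- rep(NA_integer_, N)
  k <- 0L
  for (i in seq_len(N)) {
    if (is.na(groups[i])) {
      k <- k + 1L
      diff_mat <- sweep(beta_hat, 2, beta_hat[i, ])      # beta_j - beta_i for all j
      dists <- sqrt(rowSums(diff_mat^2))                 # Frobenius (Euclidean) norms
      groups[is.na(groups) & dists <= tol_group] <- k    # assign unmatched nearby units
    }
  }
  groups
}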

We suggest identifying a suitable \lambda parameter by passing a logarithmically spaced grid of candidate values with a lower limit close to 0 and an upper limit that leads to a fully homogeneous panel. A BIC-type information criterion then selects the best fitting \lambda value.
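A simple way to verify that the upper limit is large enough is to fit the model at the largest candidate value alone and check that all units collapse into a single group. In the sketch below, y, X, n_periods, and lambda_max stand in for the user's own data and the largest candidate value; only documented arguments and return values are used.

# Illustrative sketch: at a sufficiently large lambda, the panel becomes fully
# homogeneous (K_hat == 1). If not, enlarge the upper limit of the grid.
fit_max <- PAGFL(y = y, X = X, n_periods = n_periods, lambda = lambda_max, method = 'PLS')
fit_max$K_hat == 1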

Value

A list holding

IC

the BIC-type information criterion.

lambda

the penalization parameter. If multiple \lambda values were passed, the parameter yielding the lowest IC.

alpha_hat

a K \times p matrix of the post-Lasso group-specific parameter estimates.

K_hat

the estimated total number of groups.

groups_hat

a vector of estimated group memberships.

iter

the number of executed algorithm iterations.

convergence

logical. If TRUE, convergence was achieved. If FALSE, max_iter was reached.

Author(s)

Paul Haimerl

References

Dhaene, G., & Jochmans, K. (2015). Split-panel jackknife estimation of fixed-effect models. The Review of Economic Studies, 82(3), 991-1030. doi:10.1093/restud/rdv007.

Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.

Examples

# Simulate a panel with a group structure
sim <- sim_DGP(N = 50, n_periods = 80, p = 2, n_groups = 3)
y <- sim$y
X <- sim$X

# Run the PAGFL procedure for a set of candidate tuning parameter values
lambda_set <- exp(log(10) * seq(log10(1e-4), log10(10), length.out = 10))
estim <- PAGFL(y = y, X = X, n_periods = 80, lambda = lambda_set, method = 'PLS')
print(estim)
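
# The returned list can be inspected directly (components documented under 'Value')
estim$K_hat       # estimated number of groups
estim$groups_hat  # estimated group memberships
estim$alpha_hat   # post-Lasso group-specific slope estimates
estim$lambda      # selected penalty parameter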
