tv_pagfl {PAGFL} R Documentation

Time-varying Pairwise Adaptive Group Fused Lasso

Description

Estimate a time-varying panel data model with a latent group structure using the pairwise adaptive group fused lasso (time-varying PAGFL). The time-varying PAGFL jointly identifies the latent group structure and group-specific time-varying functional coefficients. The time-varying coefficients are modeled as polynomial B-splines. The function supports both static and dynamic panel data models.

Usage

tv_pagfl(
  formula,
  data,
  index = NULL,
  n_periods = NULL,
  lambda,
  d = 3,
  M = floor(length(y)^(1/7) - log(p)),
  min_group_frac = 0.05,
  const_coef = NULL,
  kappa = 2,
  max_iter = 50000,
  tol_convergence = 1e-10,
  tol_group = 0.001,
  rho = 0.04 * log(N * n_periods)/sqrt(N * n_periods),
  varrho = 1,
  verbose = TRUE,
  parallel = TRUE,
  ...
)

## S3 method for class 'tvpagfl'
summary(object, ...)

## S3 method for class 'tvpagfl'
formula(x, ...)

## S3 method for class 'tvpagfl'
df.residual(object, ...)

## S3 method for class 'tvpagfl'
print(x, ...)

## S3 method for class 'tvpagfl'
coef(object, ...)

## S3 method for class 'tvpagfl'
residuals(object, ...)

## S3 method for class 'tvpagfl'
fitted(object, ...)

Arguments

formula

a formula object describing the model to be estimated.

data

a data.frame or matrix holding a panel data set. If no index variables are provided, the panel must be balanced and ordered in the long format $\bold{Y} = (Y_1^\prime, \dots, Y_N^\prime)^\prime$, $Y_i = (Y_{i1}, \dots, Y_{iT})^\prime$ with $Y_{it} = (y_{it}, x_{it}^\prime)^\prime$. Conversely, if data is not ordered or not balanced, data must include two index variables that declare the cross-sectional unit $i$ and the time period $t$ of each observation.

index

a character vector holding two strings. The first string denotes the name of the index variable identifying the cross-sectional unit $i$, and the second string represents the name of the variable declaring the time period $t$. In case of a balanced panel data set that is ordered in the long format, index can be left empty if the number of time periods n_periods is supplied.

n_periods

the number of observed time periods $T$. If an index character vector is passed, this argument can be left empty. Default is NULL.

lambda

the tuning parameter determining the strength of the penalty term. Either a single $\lambda$ or a vector of candidate values can be passed. If a vector is supplied, a BIC-type IC automatically selects the best fitting $\lambda$ value.

d

the polynomial degree of the B-splines. Default is 3.

M

the number of interior knots of the B-splines. If left unspecified, the default heuristic $M = \text{floor}((NT)^{\frac{1}{7}} - \log(p))$ is used. Note that $M$ does not include the boundary knots and the entire sequence of knots is of length $M + d + 1$.

min_group_frac

the minimum group cardinality as a fraction of the total number of individuals $N$. In case a group falls short of this threshold, each of its members is allocated to one of the remaining groups according to the MSE. Default is 0.05.

const_coef

a character vector containing the variable names of explanatory variables that enter with time-constant coefficients.

kappa

a non-negative weight used to obtain the adaptive penalty weights. Default is 2.

max_iter

the maximum number of iterations for the ADMM estimation algorithm. Default is $5 \times 10^{4}$.

tol_convergence

the tolerance limit for the stopping criterion of the iterative ADMM estimation algorithm. Default is $1 \times 10^{-10}$.

tol_group

the tolerance limit for within-group differences. Two individuals are assigned to the same group if the Frobenius norm of their coefficient vector difference is below this threshold. Default is $1 \times 10^{-3}$.

rho

the tuning parameter balancing the fit and penalty terms in the IC that determines the penalty parameter $\lambda$. If left unspecified, the default $\rho = 0.04 \frac{\log(NT)}{\sqrt{NT}}$ is used, which follows the form of the heuristic in Mehrabani (2023, sec. 6). We recommend the default.

varrho

the non-negative Lagrangian ADMM penalty parameter. For the employed penalized sieve estimation (PSE), the value of $\varrho$ is trivial. We recommend the default 1.

verbose

logical. If TRUE, helpful warning messages are shown. Default is TRUE.

parallel

logical. If TRUE, certain operations are parallelized across multiple cores. Default is TRUE.

...

ellipsis

object

of class tvpagfl.

x

of class tvpagfl.

Details

Consider the grouped time-varying panel data model

$$y_{it} = \gamma_i + \beta^\prime_{i} (t/T) x_{it} + \epsilon_{it}, \quad i = 1, \dots, N, \; t = 1, \dots, T,$$

where $y_{it}$ is the scalar dependent variable, $\gamma_i$ is an individual fixed effect, $x_{it}$ is a $p \times 1$ vector of explanatory variables, and $\epsilon_{it}$ is a zero mean error. The coefficient vector $\beta_{i} (t/T)$ is subject to the latent group pattern

$$\beta_i \left(\frac{t}{T} \right) = \sum_{k = 1}^K \alpha_k \left( \frac{t}{T} \right) \bold{1} \{i \in G_k \},$$

with $\cup_{k = 1}^K G_k = \{1, \dots, N\}$, $G_k \cap G_j = \emptyset$ and $\| \alpha_k - \alpha_j \| \neq 0$ for any $k \neq j$, $k, j = 1, \dots, K$.

The time-varying coefficient functions are estimated as polynomial B-splines using the penalized sieve technique. To this end, let $B(v)$ denote an $(M + d + 1) \times 1$ vector of basis functions, where $d$ denotes the polynomial degree and $M$ the number of interior knots. Then, $\beta_{i}(t/T)$ and $\alpha_{k}(t/T)$ are approximated by forming linear combinations of the basis functions, $\beta_{i}(t/T) \approx \pi_i^\prime B(t/T)$ and $\alpha_{k}(t/T) \approx \xi_k^\prime B(t/T)$, where $\pi_i$ and $\xi_k$ are $(M + d + 1) \times p$ coefficient matrices.
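As a rough illustration of the basis system described above, the following sketch builds a B-spline basis with the splines package that ships with R. The equally spaced interior knots, the boundary knots at 0 and 1, and the values of M and n_periods are assumptions made for this example; the package may construct its basis differently.

```r
library(splines)

d <- 3           # polynomial degree
M <- 2           # number of interior knots (illustrative value)
n_periods <- 50  # number of time periods T

# Rescaled time index v = t / T
v <- seq_len(n_periods) / n_periods

# Equally spaced interior knots on (0, 1); this placement is an assumption
interior_knots <- seq(0, 1, length.out = M + 2)[-c(1, M + 2)]

# T x (M + d + 1) matrix; row t holds the basis functions evaluated at t / T
B <- bs(v, knots = interior_knots, degree = d, intercept = TRUE,
        Boundary.knots = c(0, 1))

dim(B)
```

With intercept = TRUE, the basis has M + d + 1 columns, which matches the dimension of B(v) stated in the text.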

The explanatory variables are projected onto the spline basis system, which results in the $(M + d + 1)p \times 1$ vector $z_{it} = x_{it} \otimes B(v)$. Subsequently, the DGP can be reformulated as

$$y_{it} = \gamma_i + z_{it}^\prime \text{vec}(\pi_{i}) + u_{it},$$

where $u_{it} = \epsilon_{it} + \eta_{it}$ and $\eta_{it}$ reflects a sieve approximation error. We refer to Su et al. (2019, sec. 2) for more details on the sieve technique.
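The reformulation can be checked numerically. The sketch below uses made-up values for the regressors, the basis vector, and the coefficient matrix (all hypothetical, with a basis of length 3 for brevity) and confirms that the projected regressors times the vectorized coefficient matrix reproduce the original product of the approximated coefficient vector and the regressors.

```r
# Hypothetical inputs for one observation (i, t)
x_it <- c(1.5, -0.3)     # p x 1 vector of explanatory variables, p = 2
B_v  <- c(0.2, 0.5, 0.3) # basis functions evaluated at t / T, length 3
Pi_i <- matrix(c(1, 2, 3, -1, 0, 1), nrow = 3, ncol = 2) # basis length x p

# Projected regressors: Kronecker product of x_it and B(v)
z_it <- as.vector(kronecker(x_it, B_v))

# Sieve approximation of the coefficient vector: beta_i(t/T) ~ Pi_i' B(t/T)
beta_it <- as.vector(crossprod(Pi_i, B_v))

# Both formulations give the same conditional mean (net of the fixed effect)
lhs <- sum(z_it * as.vector(Pi_i)) # z_it' vec(Pi_i)
rhs <- sum(beta_it * x_it)         # beta_i(t/T)' x_it
all.equal(lhs, rhs)
```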

Inspired by Su et al. (2019) and Mehrabani (2023), the time-varying PAGFL jointly estimates the functional coefficients and the group structure by minimizing the criterion

$$Q_{NT} (\bold{\pi}, \lambda) = \frac{1}{NT} \sum^N_{i=1} \sum^{T}_{t=1}(\tilde{y}_{it} - \tilde{z}_{it}^\prime \text{vec}(\pi_{i}))^2 + \frac{\lambda}{N} \sum_{i = 1}^{N - 1} \sum_{j > i}^N \dot{\omega}_{ij} \| \pi_i - \pi_j \|$$

with respect to $\bold{\pi} = (\text{vec}(\pi_1)^\prime, \dots, \text{vec}(\pi_N)^\prime)^\prime$. Here, $\tilde{a}_{it} = a_{it} - T^{-1} \sum^{T}_{t=1} a_{it}$, $a = \{y, z\}$, denotes the demeaned variables, which concentrates out the individual fixed effects $\gamma_i$. $\lambda$ is the penalty tuning parameter and $\dot{\omega}_{ij}$ denotes adaptive penalty weights, which are obtained from a preliminary non-penalized estimation. $\| \cdot \|$ represents the Frobenius norm. The criterion function is minimized via the iterative alternating direction method of multipliers (ADMM) algorithm proposed by Mehrabani (2023, sec. 5.1).
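To make the structure of the criterion concrete, here is a minimal base R sketch that evaluates it for given inputs. The function names demean and criterion, the array layout of the inputs, and the toy data are assumptions for illustration only. It does not perform the ADMM minimization and is not the package's internal code; the adaptive weights are simply passed in.

```r
# Within-individual demeaning of an N x T matrix (rows are individuals)
demean <- function(a) a - rowMeans(a)

# y_tilde: N x T matrix of the demeaned dependent variable
# z_tilde: N x T x q array of demeaned projected regressors, q = (M + d + 1) p
# pi_mat:  N x q matrix, row i holds vec(pi_i)'
# omega:   N x N matrix of adaptive penalty weights
criterion <- function(y_tilde, z_tilde, pi_mat, lambda, omega) {
  N <- nrow(y_tilde)
  n_periods <- ncol(y_tilde)
  # Fit term: sum of squared residuals over all i and t
  ssr <- 0
  for (i in seq_len(N)) {
    z_i <- matrix(z_tilde[i, , ], nrow = n_periods)
    ssr <- ssr + sum((y_tilde[i, ] - z_i %*% pi_mat[i, ])^2)
  }
  # Penalty term: weighted norms of all pairwise coefficient differences
  penalty <- 0
  for (i in seq_len(N - 1)) {
    for (j in (i + 1):N) {
      penalty <- penalty + omega[i, j] * sqrt(sum((pi_mat[i, ] - pi_mat[j, ])^2))
    }
  }
  ssr / (N * n_periods) + lambda / N * penalty
}

# Toy usage with simulated inputs (z_tilde is treated as already demeaned)
set.seed(1)
N <- 3; n_periods <- 4; q <- 2
y_tilde <- demean(matrix(rnorm(N * n_periods), N, n_periods))
z_tilde <- array(rnorm(N * n_periods * q), dim = c(N, n_periods, q))
pi_mat <- matrix(0.5, N, q) # identical rows, so the penalty term is zero
omega <- matrix(1, N, N)
criterion(y_tilde, z_tilde, pi_mat, lambda = 1, omega = omega)
```

Since the penalty acts on pairwise differences, it vanishes whenever all individuals share the same coefficients, which is what drives the fusion of individuals into groups.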

Two individuals are assigned to the same group if $\| \text{vec} (\hat{\pi}_i - \hat{\pi}_j) \| \leq \epsilon_{\text{tol}}$, where $\epsilon_{\text{tol}}$ is determined by tol_group. Subsequently, the number of groups follows as the number of distinct elements in $\hat{\bold{\pi}}$. Given an estimated group structure, it is straightforward to obtain post-Lasso estimates $\hat{\bold{\xi}}$ using group-wise least squares (see grouped_tv_plm).
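The tolerance rule can be mimicked with base R tools. The sketch below links any two individuals whose coefficient difference is at most tol_group and reads off the groups as connected sets, using single-linkage clustering from the stats package. This is a swapped-in illustration of the rule, not necessarily how the package assigns groups; the helper name assign_groups and the toy coefficients are hypothetical.

```r
# pi_hat: N x q matrix, row i holds vec(pi_hat_i)'
assign_groups <- function(pi_hat, tol_group = 0.001) {
  # Pairwise Euclidean distances between the rows of pi_hat
  distances <- dist(pi_hat)
  # Single linkage merges two sets as soon as one pair is within the tolerance
  tree <- hclust(distances, method = "single")
  # Keep all merges at height <= tol_group
  unname(cutree(tree, h = tol_group))
}

# Toy coefficients: individuals 1, 2 and 4 are (nearly) identical
pi_hat <- rbind(c(1, 1), c(1, 1 + 1e-5), c(2, 2), c(1, 1))
groups <- assign_groups(pi_hat)
K_hat <- length(unique(groups)) # estimated number of groups
```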

We recommend identifying a suitable $\lambda$ parameter by passing a logarithmically spaced grid of candidate values with a lower limit close to 0 and an upper limit that leads to a fully homogeneous panel. A BIC-type information criterion then selects the best fitting $\lambda$ value.
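A logarithmically spaced grid can be built in base R as sketched below. The limits and the grid length are placeholder values chosen for illustration; a suitable upper limit depends on the data and has to be large enough to produce a single group.

```r
lambda_min <- 1e-3 # lower limit close to 0 (illustrative)
lambda_max <- 50   # upper limit (illustrative; should yield a homogeneous panel)
n_lambda <- 25     # number of candidate values (illustrative)

# Equally spaced on the log scale, so neighboring values share a constant ratio
lambda_grid <- exp(seq(log(lambda_min), log(lambda_max), length.out = n_lambda))

# The grid can then be passed as the lambda argument, for example
# estim <- tv_pagfl(y ~ ., data = df, n_periods = 50, lambda = lambda_grid)
```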

In case of an unbalanced panel data set, the earliest and latest available observations per group define the start and end-points of the interval on which the group-specific time-varying coefficients are defined.

Value

An object of class tvpagfl holding

model

a data.frame containing the dependent and explanatory variables as well as cross-sectional and time indices,

coefficients

a list holding (i) a $T \times p^{(1)} \times \hat{K}$ array of the post-Lasso group-specific functional coefficients and (ii) a $K \times p^{(2)}$ matrix of time-constant post-Lasso estimates, where $p^{(1)}$ denotes the number of time-varying coefficients and $p^{(2)}$ the number of time-constant parameters,

groups

a list containing (i) the total number of groups $\hat{K}$ and (ii) a vector of estimated group memberships $(\hat{g}_1, \dots, \hat{g}_N)$, where $\hat{g}_i = k$ if $i$ is assigned to group $k$,

residuals

a vector of residuals of the demeaned model,

fitted

a vector of fitted values of the demeaned model,

args

a list of additional arguments,

IC

a list containing (i) the value of the IC, (ii) the employed tuning parameter $\lambda$, and (iii) the MSE,

convergence

a list containing (i) a logical variable indicating whether convergence was achieved and (ii) the number of executed ADMM algorithm iterations,

call

the function call.

An object of class tvpagfl has print, summary, fitted, residuals, formula, df.residual and coef S3 methods.

Author(s)

Paul Haimerl

References

Mehrabani, A. (2023). Estimation and identification of latent group structures in panel data. Journal of Econometrics, 235(2), 1464-1482. doi:10.1016/j.jeconom.2022.12.002.

Su, L., Wang, X., & Jin, S. (2019). Sieve estimation of time-varying panel data models with latent structures. Journal of Business & Economic Statistics, 37(2), 334-349. doi:10.1080/07350015.2017.1340299.

Examples

# Simulate a time-varying panel with a trend and a group pattern
set.seed(1)
sim <- sim_tv_DGP(N = 10, n_periods = 50, intercept = TRUE, p = 1)
df <- data.frame(y = c(sim$y))

# Run the time-varying PAGFL
estim <- tv_pagfl(y ~ ., data = df, n_periods = 50, lambda = 10, parallel = FALSE)
summary(estim)


[Package PAGFL version 1.1.1 Index]