plsmm_lasso {plsmmLasso} | R Documentation |
Fit a high-dimensional PLSMM
Description
Fits a partial linear semiparametric mixed effects model (PLSMM) via penalized maximum likelihood.
Usage
plsmm_lasso(
x,
y,
series,
t,
name_group_var = NULL,
bases,
gamma,
lambda,
timexgroup,
criterion,
nonpara = FALSE,
cvg_tol = 0.001,
max_iter = 100,
verbose = FALSE
)
Arguments
x |
A matrix of predictor variables. |
y |
A continuous vector of response values. |
series |
A variable representing different series or groups in the data modeled as a random intercept. |
t |
A numeric vector indicating the timepoints. |
name_group_var |
A character string specifying the name of the grouping variable in the |
bases |
A matrix of basis functions. |
gamma |
The regularization parameter for the nonlinear effect of time. |
lambda |
The regularization parameter for the fixed effects. |
timexgroup |
Logical indicating whether to use a time-by-group interaction.
If |
criterion |
The information criterion to be computed. Options are "BIC", "BICC", or "EBIC". |
nonpara |
Logical. If TRUE, the |
cvg_tol |
Convergence tolerance for the algorithm. |
max_iter |
Maximum number of iterations allowed for convergence. |
verbose |
Logical indicating whether to print convergence details at each iteration. Default is FALSE. |
Details
This function fits a PLSMM with a lasso penalty on the fixed effects and on the coefficients associated with the basis functions. Estimation uses the Expectation-Maximization (EM) algorithm. The basis functions represent a nonlinear effect of time.
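The effect of a lasso penalty can be illustrated with the soft-thresholding operator, which shrinks coefficients toward zero and sets small ones exactly to zero. The base R sketch below is illustrative only: soft_threshold is a hypothetical helper, not part of plsmmLasso, the input values are made up, and the package's internal solver may work differently.

```r
# Soft-thresholding operator: the closed-form lasso solution for a
# single coefficient under an orthonormal design (illustrative only;
# hypothetical helper, not a plsmmLasso function).
soft_threshold <- function(z, penalty) {
  sign(z) * pmax(abs(z) - penalty, 0)
}

# Made-up unpenalized estimates
z <- c(-0.80, -0.003, 0.002, 0.45)

# Entries smaller in magnitude than the penalty become exactly zero;
# the remaining entries are shrunk toward zero by the penalty amount.
soft_threshold(z, penalty = 0.0046)
```

A larger penalty zeroes out more coefficients, which is why lambda and gamma control how sparse the fixed effects and the selected basis functions are.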
The model includes a random intercept for each level of the variable specified by series. Additionally, if timexgroup is set to TRUE, the model includes a time-by-group interaction, allowing each group of name_group_var to have its own estimate of the nonlinear function, which can capture group-specific nonlinearities over time. If name_group_var is set to NULL, a single nonlinear function is fitted to the whole data set.
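One common way to give each group its own nonlinear function in a time-by-group interaction is to copy the basis matrix into group-specific blocks. The base R sketch below assumes a binary grouping variable and a toy polynomial basis; group_bases is a hypothetical helper, and plsmm_lasso may construct the interaction differently internally.

```r
# Build a block design in which each group gets its own copy of the
# basis columns; rows belonging to other groups are set to zero.
# (Hypothetical helper for illustration, not a plsmmLasso function.)
group_bases <- function(bases, group) {
  levels_g <- sort(unique(group))
  blocks <- lapply(levels_g, function(g) bases * (group == g))
  design <- do.call(cbind, blocks)
  colnames(design) <- paste0(
    "g", rep(levels_g, each = ncol(bases)),
    "_b", rep(seq_len(ncol(bases)), times = length(levels_g))
  )
  design
}

# Toy data: 6 observations, 2 basis functions, 2 groups
time_pts <- c(0, 13, 26, 0, 13, 26)
toy_bases <- cbind(time_pts, time_pts^2)
toy_group <- c(0, 0, 0, 1, 1, 1)

group_bases(toy_bases, toy_group)
```

With this layout, a separate coefficient vector is estimated for each block, so each group can follow its own curve over time.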
The algorithm iteratively updates the estimates until convergence or until the maximum number of iterations is reached.
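The interplay of cvg_tol and max_iter can be illustrated with a generic iterate-until-convergence loop. This is a sketch under stated assumptions: iterate_until_converged and update_fun are hypothetical names, and the convergence measure used here (largest absolute change in the parameter vector) is an assumption, since this page does not state which quantity plsmm_lasso monitors.

```r
# Generic stopping rule: stop when the change falls below cvg_tol or
# when max_iter updates have been performed (hypothetical helper).
iterate_until_converged <- function(start, update_fun,
                                    cvg_tol = 0.001, max_iter = 100) {
  current <- start
  converged <- FALSE
  n_iter <- 0
  for (iter in seq_len(max_iter)) {
    updated <- update_fun(current)
    n_iter <- iter
    # Assumed criterion: largest absolute parameter change
    change <- max(abs(updated - current))
    current <- updated
    if (change < cvg_tol) {
      converged <- TRUE
      break
    }
  }
  list(estimate = current, converged = converged, n_iter = n_iter)
}

# Toy update: a fixed-point iteration whose fixed point is sqrt(2)
res <- iterate_until_converged(
  start = 1,
  update_fun = function(p) 0.5 * (p + 2 / p)
)
res$converged
res$estimate
```

If the loop exhausts max_iter without meeting the tolerance, the converged flag stays FALSE, which mirrors the role of the converged component in the returned list.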
Value
A list containing the following components:
lasso_output |
A list with the fitted values for the fixed effects and the nonlinear effect, the estimated coefficients for the fixed effects and the nonlinear effect, and the indices of the selected basis functions. |
se |
Estimated standard deviation of the residuals. |
su |
Estimated standard deviation of the random intercept. |
out_phi |
Data frame containing the estimated individual random intercept. |
ni |
Number of timepoints per series. |
hyperparameters |
Data frame with lambda and gamma values. |
converged |
Logical indicating if the algorithm converged. |
crit |
Value of the selected information criterion. |
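Because each fit records its hyperparameters and crit, several fits can be compared after the fact. The sketch below uses mock lists that mimic only those two documented components; the column names lambda and gamma and all numeric values are made up for illustration. It also assumes that a smaller criterion value is preferred, as is conventional for BIC-type criteria. pick_best_fit is a hypothetical helper, not a plsmmLasso function.

```r
# Return the fit with the smallest criterion value
# (assumption: lower is better, as for BIC-type criteria).
pick_best_fit <- function(fits) {
  crit_values <- vapply(fits, function(fit) fit$crit, numeric(1))
  fits[[which.min(crit_values)]]
}

# Mock fits standing in for plsmm_lasso() results; values are made up
mock_fits <- list(
  list(hyperparameters = data.frame(lambda = 0.001, gamma = 1e-8), crit = 512.3),
  list(hyperparameters = data.frame(lambda = 0.005, gamma = 1e-8), crit = 498.7),
  list(hyperparameters = data.frame(lambda = 0.010, gamma = 1e-8), crit = 505.1)
)

best <- pick_best_fit(mock_fits)
best$hyperparameters
```

In practice the list of fits could be built by calling plsmm_lasso once per candidate lambda and gamma, keeping criterion fixed so that the values are comparable.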
Examples
set.seed(123)
data_sim <- simulate_group_inter(
  N = 50, n_mvnorm = 3, grouped = TRUE,
  timepoints = 3:5, nonpara_inter = TRUE,
  sample_from = seq(0, 52, 13),
  cos = FALSE, A_vec = c(1, 1.5)
)
sim <- data_sim$sim
x <- as.matrix(sim[, -1:-3])
y <- sim$y
series <- sim$series
t <- sim$t
bases <- create_bases(t)
lambda <- 0.0046
gamma <- 0.00000001
plsmm_output <- plsmm_lasso(x, y, series, t,
  name_group_var = "group", bases = bases$bases,
  gamma = gamma, lambda = lambda, timexgroup = TRUE,
  criterion = "BIC"
)
# fixed effect coefficients
plsmm_output$lasso_output$theta
# fixed effect fitted values
plsmm_output$lasso_output$x_fit
# nonlinear functions coefficients
plsmm_output$lasso_output$alpha
# nonlinear functions fitted values
plsmm_output$lasso_output$out_f
# standard deviation of residuals
plsmm_output$se
# standard deviation of random intercept
plsmm_output$su
# series specific random intercept
plsmm_output$out_phi