regsem {regsem} | R Documentation |
Regularized Structural Equation Modeling. Tests a single penalty. For testing multiple penalties, see cv_regsem().
Description
Regularized Structural Equation Modeling. Tests a single penalty. For testing multiple penalties, see cv_regsem().
Usage
regsem(
model,
lambda = 0,
alpha = 0.5,
gamma = 3.7,
type = "lasso",
dual_pen = NULL,
random.alpha = 0.5,
data = NULL,
optMethod = "rsolnp",
estimator = "ML",
gradFun = "none",
hessFun = "none",
prerun = FALSE,
parallel = "no",
Start = "lavaan",
subOpt = "nlminb",
longMod = FALSE,
pars_pen = "regressions",
diff_par = NULL,
LB = -Inf,
UB = Inf,
par.lim = c(-Inf, Inf),
block = TRUE,
full = TRUE,
calc = "normal",
max.iter = 500,
tol = 1e-05,
round = 3,
solver = FALSE,
quasi = FALSE,
solver.maxit = 5,
alpha.inc = FALSE,
line.search = FALSE,
step = 0.1,
momentum = FALSE,
step.ratio = FALSE,
nlminb.control = list(),
missing = "listwise"
)
Arguments
model |
Lavaan output object. This is a model that was previously run with any of the lavaan main functions: cfa(), lavaan(), sem(), or growth(). It can also come from the efaUnrotate() function in the semTools package. Currently, regsem cannot handle multiple group models, missing data handling other than listwise, thresholds from categorical variable models, or estimators other than ML (most notably WLSMV for categorical variables). Note: the model does not actually have to run (use do.fit=FALSE) or converge. regsem() uses the lavaan object mainly as a parser and to get the sample covariance matrix. |
lambda |
Penalty value. Note: higher values will result in additional
convergence issues. If using values > 0.1, it is recommended to use
multi_optim() instead. See |
alpha |
Mixture for elastic net. 1 = ridge, 0 = lasso |
gamma |
Additional penalty for MCP and SCAD |
type |
Penalty type. Options include "none", "lasso", "enet" for the elastic net, "alasso" for the adaptive lasso, and "diff_lasso". If ridge penalties are desired, use type="enet" and alpha=1. diff_lasso penalizes the discrepancy between parameter estimates and some pre-specified values. The values to take the deviation from are specified in diff_par. Two methods that give sparser results than the lasso are the smoothly clipped absolute deviation, "scad", and the minimax concave penalty, "mcp". The last option is "rlasso", the randomised lasso, to be used for stability selection. |
dual_pen |
Two penalties to be used for type="dual", first is lasso, second ridge |
random.alpha |
Alpha parameter for randomised lasso. Has to be between 0 and 1, with a default of 0.5. Note this is only used for "rlasso", which pairs with stability selection. |
data |
Optional dataframe. Only required for missing="fiml" which is not currently working. |
optMethod |
Solver to use. The two main options are rsolnp and coord_desc. Although slightly slower, rsolnp works much better for complex models. coord_desc uses gradient descent with soft thresholding for the type of penalty. Rsolnp is a nonlinear solver that doesn't rely on gradient information. A similar type of solver is also available, slsqp from the nloptr package. coord_desc can also be used with Hessian information, either through quasi=TRUE or by specifying hessFun. However, this option is not recommended at this time. |
estimator |
Whether to use maximum likelihood (ML) or unweighted least squares (ULS) as a base estimator. |
gradFun |
Gradient function to use. Recommended to use "ram", which refers to the method specified in von Oertzen & Brick (2014). Only for use with optMethod="coord_desc". |
hessFun |
Hessian function to use. Recommended to use "ram", which refers to the method specified in von Oertzen & Brick (2014). This is currently not recommended. |
prerun |
Logical. Use rsolnp to first optimize before passing to gradient descent? Only for use with coord_desc. |
parallel |
Whether to parallelize the processes. The default is "no". |
Start |
Type of starting values to use. Only "default" is recommended. This sets factor loadings and variances to 0.5. Start = "lavaan" uses the parameter estimates from the lavaan model object. This is not recommended, as it can increase the chances of getting stuck at the previous parameter estimates. |
subOpt |
Type of optimization to use in the optimx package. |
longMod |
Logical. Whether the model uses longitudinal data. This changes the sample covariance used. |
pars_pen |
Parameter indicators to penalize. There are multiple ways to specify these. The default is to penalize all regression parameters ("regressions"). Additionally, one can specify all loadings ("loadings"), or both c("regressions","loadings"). Next, parameter labels can be assigned in the lavaan syntax and passed to pars_pen. See the example. Finally, one can take the parameter numbers from the A or S matrices and pass these directly. See extractMatrices(lav.object)$A. |
diff_par |
Parameter values to deviate from. Only used when type="diff_lasso". |
LB |
Lower bound vector. Note: This is very important to specify when using regularization. It greatly increases the chances of converging. |
UB |
Upper bound vector |
par.lim |
Vector of minimum and maximum parameter estimates. Used to stop optimization and move to new starting values if violated. |
block |
Whether to use block coordinate descent |
full |
Whether to do full gradient descent or block |
calc |
Type of calc function to use with means or not. Not recommended for use. |
max.iter |
Number of iterations for coordinate descent |
tol |
Tolerance for coordinate descent |
round |
Number of digits to round results to |
solver |
Whether to use solver for coord_desc |
quasi |
Whether to use quasi-Newton |
solver.maxit |
Max iterations for solver in coord_desc |
alpha.inc |
Whether alpha should increase for coord_desc |
line.search |
Use line search for optimization. Default is no, use fixed step size |
step |
Step size |
momentum |
Momentum for step sizes |
step.ratio |
Ratio of step size between A and S. Logical |
nlminb.control |
List of control values to pass to nlminb |
missing |
How to handle missing data. Current options are "listwise" and "fiml". "fiml" is not currently working well. |
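The penalty types named under type, and the soft thresholding mentioned under optMethod, can be illustrated with a few lines of base R. This is a minimal sketch using the textbook forms of each penalty, not regsem's internal code: the function names are hypothetical, the elastic net weighting simply follows the convention stated under alpha (1 = ridge, 0 = lasso), and any extra scaling that regsem applies internally is not reproduced here.

```r
# Illustrative penalty functions (hypothetical names, not part of regsem).
# theta: parameter value(s); lambda: penalty value; gamma: MCP/SCAD parameter.

lasso_pen <- function(theta, lambda) lambda * abs(theta)

# Elastic net, assuming the convention above: alpha = 1 is ridge, 0 is lasso.
enet_pen <- function(theta, lambda, alpha = 0.5) {
  lambda * ((1 - alpha) * abs(theta) + alpha * theta^2)
}

# Minimax concave penalty: flattens out once |theta| exceeds gamma * lambda.
mcp_pen <- function(theta, lambda, gamma = 3.7) {
  a <- abs(theta)
  ifelse(a <= gamma * lambda,
         lambda * a - a^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

# SCAD: lasso near zero, quadratic transition, then constant.
scad_pen <- function(theta, lambda, gamma = 3.7) {
  a <- abs(theta)
  ifelse(a <= lambda,
         lambda * a,
         ifelse(a <= gamma * lambda,
                (2 * gamma * lambda * a - a^2 - lambda^2) / (2 * (gamma - 1)),
                lambda^2 * (gamma + 1) / 2))
}

# Soft-thresholding operator used by lasso-type descent updates:
# shrinks z toward zero by lambda and sets small values exactly to zero.
soft_threshold <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

theta_grid <- seq(-5, 5, by = 0.5)
penalties <- data.frame(
  theta = theta_grid,
  lasso = lasso_pen(theta_grid, lambda = 1),
  ridge = enet_pen(theta_grid, lambda = 1, alpha = 1),
  mcp   = mcp_pen(theta_grid, lambda = 1),
  scad  = scad_pen(theta_grid, lambda = 1)
)
print(head(penalties))
```

Comparing the columns shows why "mcp" and "scad" penalize large parameters less than the lasso does: both level off for large values of theta, while the lasso penalty keeps growing linearly.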
Value
out List of return values from optimization program
convergence Convergence status. 0 = converged, 1 or 99 means the model did not converge.
par.ret Final parameter estimates
Imp_Cov Final implied covariance matrix
grad Final gradient.
KKT1 Were final gradient values close enough to 0.
KKT2 Was the final Hessian positive definite.
df Final degrees of freedom. Note that df changes with lasso penalties.
npar Final number of free parameters. Note that this can change with lasso penalties.
SampCov Sample covariance matrix.
fit Final F_ml fit. Note this is the final parameter estimates evaluated with the F_ml fit function.
coefficients Final parameter estimates
nvar Number of variables.
N Sample size.
nfac Number of factors
baseline.chisq Baseline chi-square.
baseline.df Baseline degrees of freedom.
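Since a non-converged fit should not be interpreted, it can help to check the convergence and KKT fields before reading the estimates. The helper below is a hypothetical sketch, not part of regsem. It assumes that KKT1 and KKT2 are returned as logical values, which this page does not state, and it is demonstrated on mock lists with made-up values rather than on real regsem output.

```r
# Hypothetical helper (not part of regsem): summarise the documented
# convergence diagnostics of a fitted object.
check_regsem_fit <- function(fit) {
  # convergence: 0 = converged; 1 or 99 = did not converge
  converged <- identical(as.numeric(fit$convergence), 0)
  list(
    converged = converged,
    # isTRUE() treats NA or non-logical values as "not satisfied"
    kkt_ok = isTRUE(fit$KKT1) && isTRUE(fit$KKT2),
    df = fit$df,
    npar = fit$npar
  )
}

# Mock objects carrying the documented field names; the values are invented
# purely for illustration.
mock_ok  <- list(convergence = 0,  KKT1 = TRUE, KKT2 = TRUE,  df = 30, npar = 24)
mock_bad <- list(convergence = 99, KKT1 = TRUE, KKT2 = FALSE, df = 30, npar = 24)

str(check_regsem_fit(mock_ok))
str(check_regsem_fit(mock_bad))
```

With a real fit, the same call would be check_regsem_fit(fit1), where fit1 is the object returned by regsem().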
Examples
# Note: fitting a single penalty with regsem() is not currently recommended. Use cv_regsem() instead
library(lavaan)
# put variables on same scale for regsem
HS <- data.frame(scale(HolzingerSwineford1939[,7:15]))
mod <- '
f =~ 1*x1 + l1*x2 + l2*x3 + l3*x4 + l4*x5 + l5*x6 + l6*x7 + l7*x8 + l8*x9
'
# Recommended to specify meanstructure in lavaan
outt <- cfa(mod, HS, meanstructure=TRUE)
fit1 <- regsem(outt, lambda=0.05, type="lasso",
pars_pen=c("l1", "l2", "l6", "l7", "l8"))
#equivalent to pars_pen=c(1:2, 6:8)
#summary(fit1)
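The example above passes parameter labels to pars_pen. The numeric alternative it mentions can be sketched as follows: the parameter numbers are read from the A matrix returned by extractMatrices(), and the fit is checked for convergence afterwards. This is a sketch under the assumption that the lavaan and regsem packages are installed; the whole block is skipped otherwise. The object name fit_num is made up for illustration.

```r
# Sketch only: runs only if both packages are available.
if (requireNamespace("lavaan", quietly = TRUE) &&
    requireNamespace("regsem", quietly = TRUE)) {
  library(lavaan)
  library(regsem)

  # Put variables on the same scale for regsem
  HS <- data.frame(scale(HolzingerSwineford1939[, 7:15]))
  mod <- '
  f =~ 1*x1 + l1*x2 + l2*x3 + l3*x4 + l4*x5 + l5*x6 + l6*x7 + l7*x8 + l8*x9
  '
  outt <- cfa(mod, HS, meanstructure = TRUE)

  # Inspect the A matrix to find the parameter numbers to penalize
  print(extractMatrices(outt)$A)

  # Same penalized set as the labels l1, l2, l6, l7, l8 in the example above
  fit_num <- regsem(outt, lambda = 0.05, type = "lasso",
                    pars_pen = c(1:2, 6:8))

  # 0 = converged; 1 or 99 = the model did not converge
  print(fit_num$convergence)
}
```

Passing numbers avoids having to label every loading in the lavaan syntax, at the cost of needing to re-check the numbering whenever the model specification changes.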