R: Local Structural Equation Models (LSEM)

lsem.estimate {sirt}

R Documentation

Local Structural Equation Models (LSEM)

Description

Local structural equation models (LSEM) are structural equation models (SEM) which are evaluated for each value of a pre-defined moderator variable (Hildebrandt et al., 2009, 2016). As in nonparametric regression models, observations near a focal point - at which the model is evaluated - obtain higher weights, far distant observations obtain lower weights. The LSEM can be specified by making use of lavaan syntax. It is also possible to specify a discretized version of LSEM in which values of the moderator are grouped and a multiple group SEM is specified. The LSEM can be tested by employing a permutation test, see lsem.permutationTest.

The function lsem.MGM.stepfunctions outputs stepwise functions for a multiple group model evaluated at a grid of focal points of the moderator, specified in moderator.grid.

The argument pseudo_weights provides an ad hoc solution to estimate an LSEM for any model which can be fitted in lavaan.

It is also possible to constrain some of the parameters along the values of the moderator in a joint estimation approach (est_joint=TRUE). Parameter names can be specified which are assumed to be invariant (in par_invariant). In addition, linear or quadratic constraints can be imposed on parameters (par_linear or par_quadratic).

Statistical inference in case of joint estimation (but also for separate estimation) can be conducted via bootstrap using the function lsem.bootstrap. Bootstrap at the level of a cluster identifier is allowed (argument cluster).

Usage

lsem.estimate(data, moderator, moderator.grid, lavmodel, type="LSEM", h=1.1, bw=NULL,
    residualize=TRUE, fit_measures=c("rmsea", "cfi", "tli", "gfi", "srmr"),
    standardized=FALSE, standardized_type="std.all", lavaan_fct="sem",
    sufficient_statistics=TRUE, pseudo_weights=0,
    sampling_weights=NULL, loc_linear_smooth=TRUE, est_joint=FALSE, par_invariant=NULL,
    par_linear=NULL, par_quadratic=NULL, partable_joint=NULL, pw_linear=1,
    pw_quadratic=1, pd=TRUE, est_DIF=FALSE, se=NULL, kernel="gaussian",
    eps=1e-08, verbose=TRUE, ...)

## S3 method for class 'lsem'
summary(object, file=NULL, digits=3, ...)

## S3 method for class 'lsem'
plot(x, parindex=NULL, ask=TRUE, ci=TRUE, lintrend=TRUE,
       parsummary=TRUE, ylim=NULL, xlab=NULL,  ylab=NULL, main=NULL,
       digits=3, ...)

lsem.MGM.stepfunctions( object, moderator.grid )

# compute local weights
lsem_local_weights(data.mod, moderator.grid, h, sampling_weights=NULL,  bw=NULL,
     kernel="gaussian")

lsem.bootstrap(object, R=100, verbose=TRUE, cluster=NULL,
     repl_design=NULL, repl_factor=NULL, use_starting_values=TRUE,
     n.core=1, cl.type="PSOCK")

Arguments

`data`	Data frame
`moderator`	Variable name of the moderator
`moderator.grid`	Focal points at which the LSEM should be evaluated. If `type="MGM"`, breaks are defined in this vector.
`lavmodel`	Specified SEM in lavaan.
`type`	Type of estimated model. The default is `type="LSEM"` which means that a local structural equation model is estimated. A multiple group model with a discretized moderator as the grouping variable can be estimated with `type="MGM"`. In this case, the breaks must be defined in `moderator.grid`.
`h`	Bandwidth factor
`bw`	Optional bandwidth parameter if `h` should not be used
`residualize`	Logical indicating whether a residualization should be applied.
`fit_measures`	Vector with names of fit measures following the labels in lavaan
`standardized`	Optional logical indicating whether standardized solution should be included as parameters in the output using the `lavaan::standardizedSolution` function. Standardized parameters are labeled as `std__`.
`standardized_type`	Type of standardization if `standardized=TRUE`. The types are described in `lavaan::standardizedSolution`.
`lavaan_fct`	String whether `lavaan::lavaan` (`lavaan_fct="lavaan"`), `lavaan::sem` (`lavaan_fct="sem"`), `lavaan::cfa` (`lavaan_fct="cfa"`) or `lavaan::growth` (`lavaan_fct="growth"`) should be used.
`sufficient_statistics`	Logical whether sufficient statistics of weighted means and covariances should be used for model fitting. This option can be set to `sufficient_statistics=FALSE` if the data contain missing values. Note that the option `sufficient_statistics=TRUE` is only valid for (approximate) missing completely at random (MCAR) data. The option can only be used for continuous data.
`pseudo_weights`	Integer defining a target sample size. Local weights are multiplied by a factor which is rounded to integers. This approach is referred as a pseudo weighting approach. For example, using `pseudo_weights=30000` implies that the sum of local weights at each focal point is `30000`.
`sampling_weights`	Optional vector of sampling weights
`loc_linear_smooth`	Logical indicating whether local linear smoothing should be used for computing sufficient statistics for means and covariances. The default is `FALSE`.
`est_joint`	Logical indicating whether LSEM should be estimated in a joint estimation approach. This options only works wih continuous data and sufficient statistics.
`par_invariant`	Vector of invariant parameters
`par_linear`	Vector of parameters with linear function
`par_quadratic`	Vector of parameters with quadratic function
`partable_joint`	User-defined parameter table if joint estimation is used (`est_joint=TRUE`).
`pw_linear`	Number of segments if piecewise linear estimation of parameters is used
`pw_quadratic`	Number of segments if piecewise quadratic estimation of parameters is used
`pd`	Logical indicating whether nearest positive definite covariance matrix should be computed if sufficient statistics are used
`est_DIF`	Logical indicating whether parameters under differential item functioning (DIF) should be additionally computed for invariant item parameters
`se`	Type of standard error used in `lavaan::lavaan`. If `NULL`, the lavaan default is used.
`kernel`	Type of kernel function. Can be `"gaussian"`, `"uniform"` or `"epanechnikov"`.
`eps`	Minimum number for weights
`verbose`	Optional logical printing information about computation progress.
`object`	Object of class `lsem`
`file`	A file name in which the summary output will be written.
`digits`	Number of digits.
`x`	Object of class `lsem`.
`parindex`	Vector of indices for parameters in plot function.
`ask`	A logical which asks for changing the graphic for each parameter.
`ci`	Logical indicating whether confidence intervals should be plotted.
`lintrend`	Logical indicating whether a linear trend should be plotted.
`parsummary`	Logical indicating whether a parameter summary should be displayed.
`ylim`	Plot parameter `ylim`. Can be a list, see Examples.
`xlab`	Plot parameter `xlab`. Can be a vector.
`ylab`	Plot parameter `ylab`. Can be a vector.
`main`	Plot parameter `main`. Can be a vector.
`...`	Further arguments to be passed to `lavaan::sem` or `lavaan::lavaan`.
`data.mod`	Observed values of the moderator
`R`	Number of bootstrap samples
`cluster`	Optional variable name for bootstrap at the level of a cluster identifier
`repl_design`	Optional matrix containing replication weights for computation of standard errors. Note that sampling weights have to be already included in `repl_design`.
`repl_factor`	Replication factor in variance formula for statistical inference, e.g., 0.05 in PISA.
`use_starting_values`	Logical indicating whether starting values should be used from the original sample
`n.core`	A scalar indicating the number of cores that should be used.
`cl.type`	The cluster type. Default value is `"PSOCK"`. Posix machines (Linux, Mac) generally benefit from much faster cluster computation if type is set to `type="FORK"`.

Value

List with following entries

`parameters`	Data frame with all parameters estimated at focal points of moderator. Bias-corrected estimates under boostrap can be found in the column `est_bc`.
`weights`	Data frame with weights at each focal point
`parameters_summary`	Summary table for estimated parameters
`parametersM`	Estimated parameters in matrix form. Parameters are in columns and values of the grid of the moderator are in rows.
`bw`	Used bandwidth
`h`	Used bandwidth factor
`N`	Sample size
`moderator.density`	Estimated frequencies and effective sample size for moderator at focal points
`moderator.stat`	Descriptive statistics for moderator
`moderator`	Variable name of moderator
`moderator.grid`	Used grid of focal points for moderator
`moderator.grouped`	Data frame with informations about grouping of moderator if `type="MGM"`.
`residualized.intercepts`	Estimated intercept functions used for residualization.
`lavmodel`	Used lavaan model
`data`	Used data frame, possibly residualized if `residualize=TRUE`
`model_parameters`	Model parameters in LSEM
`parameters_boot`	Parameter values in each bootstrap sample (for `lsem.bootstrap`)
`fitstats_joint_boot`	Fit statistics in each bootstrap sample (for `lsem.bootstrap`)
`dif_effects`	Estimated item parameters under DIF

Author(s)

Alexander Robitzsch, Oliver Luedtke, Andrea Hildebrandt

References

Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi:10.1080/00273171.2016.1142856

Hildebrandt, A., Wilhelm, O., & Robitzsch, A. (2009). Complementary and competing factor analytic approaches for the investigation of measurement invariance. Review of Psychology, 16, 87-102.

Examples

## Not run: 
#############################################################################
# EXAMPLE 1: data.lsem01 | Age differentiation
#############################################################################

data(data.lsem01, package="sirt")
dat <- data.lsem01

# specify lavaan model
lavmodel <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F"

# define grid of moderator variable age
moderator.grid <- seq(4,23,1)

#********************************
#*** Model 1: estimate LSEM with bandwidth 2
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod1)
plot(mod1, parindex=1:5)

# perform permutation test for Model 1
pmod1 <- sirt::lsem.permutationTest( mod1, B=10 )
          # only for illustrative purposes the number of permutations B is set
          # to a low number of 10
summary(pmod1)
plot(pmod1, type="global")

#* perform permutation test with parallel computation
pmod1a <- sirt::lsem.permutationTest( mod1, B=10, n.core=3 )
summary(pmod1a)

#** estimate Model 1 based on pseudo weights
mod1b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE, pseudo_weights=50 )
summary(mod1b)

#** estimation with sampling weights

# generate random sampling weights
set.seed(987)
weights <- stats::runif(nrow(dat), min=.4, max=3 )
mod1c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, sampling_weights=weights)
summary(mod1c)

#********************************
#*** Model 2: estimate multiple group model with 4 age groups

# define breaks for age groups
moderator.grid <- seq( 3.5, 23.5, len=5) # 4 groups
# estimate model
mod2 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
           lavmodel=lavmodel, type="MGM", std.lv=TRUE)
summary(mod2)

# output step functions
smod2 <- sirt::lsem.MGM.stepfunctions( object=mod2, moderator.grid=seq(4,23,1) )
str(smod2)

#********************************
#*** Model 3: define standardized loadings as derived variables

# specify lavaan model
lavmodel <- "
        F=~ a1*v1+a2*v2+a3*v3+a4*v4
        v1 ~~ s1*v1
        v2 ~~ s2*v2
        v3 ~~ s3*v3
        v4 ~~ s4*v4
        F ~~ 1*F
        # standardized loadings
        l1 :=a1 / sqrt(a1^2 + s1 )
        l2 :=a2 / sqrt(a2^2 + s2 )
        l3 :=a3 / sqrt(a3^2 + s3 )
        l4 :=a4 / sqrt(a4^2 + s4 )
        "
# estimate model
mod3 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod3)
plot(mod3)

#********************************
#*** Model 4: estimate LSEM and automatically include standardized solutions

lavmodel <- "
        F=~ 1*v1+v2+v3+v4
        F ~~ F"
mod4 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod4)
# permutation test (use only few permutations for testing purposes)
pmod1 <- sirt::lsem.permutationTest( mod4, B=3 )

#**** compute LSEM local weights
wgt <- sirt::lsem_local_weights(data.mod=dat$age, moderator.grid=moderator.grid,
             h=2)$weights
print(str(weights))

#********************************
#*** Model 5: invariance parameter constraints and other constraints

lavmodel <- "
        F=~ 1*v1+v2+v3+v4
        F ~~ F"
moderator.grid <- seq(4,23,4)

#- estimate model without constraints
mod5a <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod5a)
# extract parameter names
mod5a$model_parameters

#- invariance constraints on residual variances
par_invariant <- c("F=~v2","v2~~v2")
mod5b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE, par_invariant=par_invariant)
summary(mod5b)

#- bootstrap for statistical inference
bmod5b <- sirt::lsem.bootstrap(mod5b, R=100)
# inspect parameter values and standard errors
bmod5b$parameters

#- bootstrap using parallel computing (i.e., multiple cores)
bmod5ba <- sirt::lsem.bootstrap(mod5b, R=100, n.core=3)

#- user-defined replication design
R <- 100    # bootstrap samples
N <- nrow(dat)
repl_design <- matrix(0, nrow=N, ncol=R)
for (rr in 1:R){
    indices <- sort( sample(1:N, replace=TRUE) )
    repl_design[,rr] <- sapply(1:N, FUN=function(ii){ sum(indices==ii) } )
}
head(repl_design)
bmod5b1 <- sirt::lsem.bootstrap(mod5a, repl_design=repl_design, repl_factor=1/R)

#- compare model mod5b with joint estimation without constraints
mod5c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
               lavmodel=lavmodel, h=2, standardized=TRUE, est_joint=TRUE)
summary(mod5c)

#- linear and quadratic functions
par_invariant <- c("F=~v1","v2~~v2")
par_linear <- c("v1~~v1")
par_quadratic <- c("v4~~v4")

mod5d <- sirt::lsem.estimate( dat1, moderator="age", moderator.grid=moderator.grid,
            lavmodel=lavmodel, h=2, par_invariant=par_invariant, par_linear=par_linear,
            par_quadratic=par_quadratic)
summary(mod5d)

#- user-defined constraints: step functions for parameters

# inspect parameter table (from lavaan) of fitted model
pj <- mod5d$partable_joint
#* modify parameter table for user-defined constraints
# define step function for F=~v1 which is constant on intervals 1:4 and 5:7
pj2 <- pj[ pj$con==1, ]
pj2[ c(5,6), "lhs" ] <- "p1g5"
pj2 <- pj2[ -4, ]
partable_joint <- rbind(pj1, pj2)
# estimate model with constraints
mod5e <- lsem::lsem.estimate( dat1, moderator="age", moderator.grid=moderator.grid,
             lavmodel=lavmodel, h=2, std.lv=TRUE, estimator="ML",
             partable_joint=partable_joint)
summary(mod5e)

#############################################################################
# EXAMPLE 2: data.lsem01 | FIML with missing data
#############################################################################

data(data.lsem01)
dat <- data.lsem01
# induce artifical missing values
set.seed(98)
dat[ stats::runif(nrow(dat)) < .5, c("v1")] <- NA
dat[ stats::runif(nrow(dat)) < .25, c("v2")] <- NA

# specify lavaan model
lavmodel1 <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F"

# define grid of moderator variable age
moderator.grid <- seq(4,23,2)

#*** estimate LSEM with FIML
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
                lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="ML", missing="fiml")
summary(mod1)

#############################################################################
# EXAMPLE 3: data.lsem01 | WLSMV estimation
#############################################################################

data(data.lsem01)
dat <- data.lsem01

# create artificial dichotomous data
for (vv in 2:6){
dat[,vv] <- 1*(dat[,vv] > mean(dat[,vv]))
}

# specify lavaan model
lavmodel1 <- "
        F=~ v1+v2+v3+v4+v5
        F ~~ 1*F
        v1 | t1
        v2 | t1
        v3 | t1
        v4 | t1
        v5 | t1
        "

# define grid of moderator variable age
moderator.grid <- seq(4,23,2)

#*** local WLSMV estimation
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
          lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="DWLS", ordered=paste0("v",1:5),
          residualize=FALSE, pseudo_weights=10000, parameterization="THETA" )
summary(mod1)

## End(Not run)

[Package sirt version 4.1-15 Index]