RSAmodel.auxiliary {RSAtools}    R Documentation

Estimate a list of polynomial models for RSA, using auxiliary variables in FIML

Description

Estimate any number of predefined or user-specified polynomial models, using auxiliary variables to handle missing data by full information maximum likelihood (FIML; Graham, 2003). Based on the semTools function sem.auxiliary. Works in the same way as RSAmodel, differing only by the addition of the auxiliary variable option (the "aux" argument).

Usage

RSAmodel.auxiliary(
  formula,
  data = NULL,
  aux = NULL,
  center = "none",
  scale = "none",
  na.rm = FALSE,
  out.rm = TRUE,
  breakline = FALSE,
  models = c("CUBIC", "STEP1"),
  user_model = NULL,
  verbose = TRUE,
  add = "",
  estimator = "MLR",
  se = "robust",
  missing = NA,
  control.variables = NULL,
  center.control.variables = FALSE,
  sampling.weights = NULL,
  group_name = NULL,
  cluster = NULL,
  ...
)

Arguments

formula

A formula in the form z ~ x*y, specifying the variable names used from the data frame, where z is the name of the response variable, and x and y are the names of the predictor variables.

data

A data frame containing the variables referenced in formula.

aux

Character vector. Names of the auxiliary variables to add to the model. Passed on internally to sem.auxiliary (semTools package).
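
As an illustration of the aux argument, the following is a minimal sketch on simulated data, not an official example from the package. The variable names (x, y, z, a1) and the missingness mechanism are hypothetical; the call itself only uses arguments listed under Usage and is skipped when RSAtools is not installed.

```r
# Simulated data; all variable names are hypothetical.
set.seed(123)
n <- 300
dat <- data.frame(x = rnorm(n), y = rnorm(n))
dat$z  <- 0.3 * dat$x + 0.3 * dat$y - 0.2 * (dat$x - dat$y)^2 + rnorm(n)
dat$a1 <- 0.6 * dat$z + rnorm(n)   # auxiliary variable correlated with z

# Make z missing more often when a1 is high, so that a1 carries
# information about the missingness.
miss <- runif(n) < plogis(-1.5 + dat$a1)
dat$z[miss] <- NA

# Guarded call: runs only if the RSAtools package is available.
if (requireNamespace("RSAtools", quietly = TRUE)) {
  fit <- RSAtools::RSAmodel.auxiliary(
    z ~ x * y,
    data   = dat,
    aux    = "a1",      # name(s) of the auxiliary variable(s) in dat
    models = "CUBIC"    # saturated cubic only, to keep the run short
  )
}
```

Several auxiliary variables would be supplied as a character vector, for example aux = c("a1", "a2"), assuming both columns exist in data.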

center

Method for centering the predictor variables before the analysis. Default option ("none") applies no centering. "pooled" centers the predictor variables on their pooled sample mean, which preserves the commensurability of the predictor scales. "variablewise" centers the predictor variables on their respective sample mean. You should think carefully before applying the "variablewise" option, as centering the predictor variables on different values (e.g., their respective means) can affect the commensurability of the predictor scales.

scale

Method for scaling the predictor variables before the analysis. Default option ("none") applies no scaling. "pooled" scales the predictor variables on their pooled sample SD, which preserves the commensurability of the predictor scales. "variablewise" scales the predictor variables on their respective sample SD. You should think carefully before applying the "variablewise" option, as scaling the predictor variables at different values (e.g., their respective SDs) can affect the commensurability of the predictor scales.
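
The difference between the "pooled" and "variablewise" options for center and scale can be sketched in base R. This is only an illustration, assuming that "pooled" means a single mean and SD computed over both predictors stacked together; the function's internal computation may differ in detail. The variables x and y are simulated.

```r
set.seed(1)
x <- rnorm(200, mean = 3, sd = 1)
y <- rnorm(200, mean = 4, sd = 2)

# Pooled: one common mean and one common SD for both predictors
# (assumption: statistics computed over the stacked values of x and y).
pooled_mean <- mean(c(x, y))
pooled_sd   <- sd(c(x, y))
x_pooled <- (x - pooled_mean) / pooled_sd
y_pooled <- (y - pooled_mean) / pooled_sd

# Variablewise: each predictor uses its own mean and SD.
x_var <- (x - mean(x)) / sd(x)
y_var <- (y - mean(y)) / sd(y)

# Under the pooled transformation, the difference x - y is only divided
# by a constant, so (in)congruence between x and y keeps its meaning.
d_raw    <- x - y
d_pooled <- x_pooled - y_pooled
all.equal(d_pooled, d_raw / pooled_sd)

# Under the variablewise transformation, the mean difference between
# x and y is removed, so a case with x == y no longer maps to x_var == y_var.
mean(x_var - y_var)
```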

na.rm

Should cases with missing values be removed before proceeding?

out.rm

Should outliers according to Bollen & Jackman (1980) criteria be excluded from the analyses? In large data sets this analysis is the speed bottleneck. If you are sure that no outliers exist, set this option to FALSE for speed improvements.

breakline

Should the breakline in the unconstrained absolute difference model be allowed? (The breakline is possible from the model formulation, but empirically rather unrealistic.) Defaults to FALSE.

models

A vector with the names of all models that should be computed. Can be any of c("CUBIC","FM1_ONLYX","FM2_ONLYY","FM3_ADDITIVE","FM4_INTER","FM5_QUADX","FM6_QUADY","FM7_CONG","FM8_INCONG","FM9_CURVCONGX","FM10_CURVCONGY","FM11_CURVINCONGX","FM12_CURVINCONGY","FM13_QUADXQUADY","FM14_ROTCONG","FM15_ROTINCONG","FM16_CUBICX","FM17_CUBICY","FM18_LEVDEPQUADX","FM19_LEVDEPQUADY","FM20_ASYMCONG","FM21_ASYMINCONG","FM22_LEVDEPCONG","FM23_LEVDEPINCONG","FM24_PARALLELASYM","FM25_NONPARALLELASYM","FM26_PARALLELASYMWEAK","FM27_PARALLELASYMSTRONG","FM28_NONPARALLELASYMWEAK","FM29_NONPARALLELASYMSTRONG","FM30_ASYMCONGROTY","FM31_ASYMCONGROTX","FM32_ASYMINCONGROTY","FM33_ASYMINCONGROTX","FM34_LEVDEPCONGROTY","FM35_LEVDEPCONGROTX","FM36_LEVDEPINCONGROTY","FM37_LEVDEPINCONGROTX"). With models="STEP1" (the default), all polynomial families and the saturated cubic are computed. With models="USER", all user-specified models defined in the list "user_model" are computed.

user_model

A list of user-specified polynomial models, defined by setting constraints on the polynomial parameters b1 to b9 using lavaan syntax, for example: "b1 == 2*b2". Only parametric constraint specifications are allowed.
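
A user_model list might be built as in the sketch below. The first constraint is the example given above; the second constraint, the element names (variant_a, variant_b), and the use of ";" to combine two constraints in one string are assumptions made for illustration and are not confirmed by this page.

```r
# Each element is one candidate model, written as constraints on b1..b9
# in lavaan syntax. Element names are hypothetical labels.
user_model <- list(
  variant_a = "b1 == 2*b2",            # example constraint from the text
  variant_b = "b1 == b2; b3 == b5"     # hypothetical: two constraints at once
)

# Quick structural check before passing the list on.
stopifnot(all(vapply(user_model, is.character, logical(1))))
names(user_model)
```

The list would then be passed as user_model = user_model together with models = "USER".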

verbose

Should additional information during the computation process be printed?

add

Additional syntax that is added to the lavaan model. Can contain, for example, additional constraints, like "p01 == 0; p11 == 0".

estimator

Type of estimator that should be used by lavaan. Defaults to "MLR", which provides robust standard errors, a robust scaled test statistic, and can handle missing values. If you want to reproduce standard OLS estimates, use estimator="ML" and se="standard".
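
For reference, the OLS fit that estimator="ML" with se="standard" is meant to reproduce can be sketched with lm on complete data. This is an illustrative sketch on simulated data: the full cubic polynomial in x and y has nine slope terms, but the correspondence between the terms below and the labels b1 to b9 is an assumption, and the variable names are hypothetical.

```r
set.seed(42)
n <- 250
dat <- data.frame(x = rnorm(n), y = rnorm(n))
dat$z <- 0.4 * dat$x + 0.2 * dat$y - 0.3 * dat$x * dat$y + rnorm(n)

# Full cubic polynomial: linear, quadratic, and cubic terms in x and y.
ols <- lm(
  z ~ x + y +
    I(x^2) + I(x * y) + I(y^2) +
    I(x^3) + I(x^2 * y) + I(x * y^2) + I(y^3),
  data = dat
)

# Intercept plus nine polynomial coefficients.
round(coef(ols), 3)
```

With complete data, ML point estimates should match these coefficients; standard errors can differ slightly because ML divides the residual sum of squares by n rather than by the residual degrees of freedom.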

se

Type of standard errors. This parameter gets passed through to the sem function of the lavaan package. See options there. By default, robust SEs are computed. If you use se="boot", lavaan provides CIs and p-values based on the bootstrapped standard error. If you use confint(..., method="boot"), in contrast, you get CIs and p-values based on percentile bootstrap.

missing

Handling of missing values (this parameter is passed to the lavaan sem function). By default (missing=NA), Full Information Maximum Likelihood (FIML) is employed in case of missing values. If cases with missing values should be excluded, use missing = "listwise".

control.variables

A character vector with variable names from data. These variables are added as linear predictors to the model (in order to "control for them"). No interactions with the other variables are modeled.

center.control.variables

Should the control variables be centered before analyses? This can improve interpretability of the intercept, which will then reflect the predicted outcome value at the point (X,Y)=(0,0) when all control variables take their respective average values.

sampling.weights

Name of variable containing sampling weights. Needs to be added here (not in ...) to be included in the analysis dataset.

group_name

Name of variable defining groups, for multigroup modeling.

cluster

Name of variable for clusters, for cluster-level variance correction.

...

Additional parameters passed to the sem.auxiliary function.

Details

This function implements a comparative framework for identifying best-fitting RSA solutions (Núñez-Regueiro & Juhel, 2022, 2024). Step 1 (the default, "STEP1") compares 37 polynomial families predefined by parametric constraints (versus 10 families in the RSA package) to identify likely candidates for the best-fitting solution. Step 2 probes variants within the retained best-fitting family by testing user-specified constraints ("USER") on lower-order polynomials that do not define the family. Step 3 can be conducted on the final variant by parametric bootstrapping, using bootstrapLavaan(RSA_object$models$name_final, FUN="coef"), or with cross-validation data.

Value

A list of objects containing the polynomial models and the names of the variables.

References

Graham, J. W. (2003). Adding missing-data-relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10(1), 80-100. DOI:10.1207/S15328007SEM1001_4

Núñez-Regueiro, F., & Juhel, J. (2022). Model-Building Strategies in Response Surface Analysis. Manuscript submitted for publication.

Núñez-Regueiro, F., & Juhel, J. (2024). Response Surface Analysis for the Social Sciences I: Identifying Best-Fitting Polynomial Solutions. Manuscript submitted for publication.

See Also

RSAmodel, sem.auxiliary


[Package RSAtools version 0.1.1 Index]