RSAmodel.auxiliary {RSAtools}

R Documentation

Estimate a list of polynomial models for RSA, using auxiliary variables in FIML

Description

Estimate any number of predefined or user-specified polynomial models, using auxiliary variables for missing-data treatment by full information maximum likelihood (FIML; Graham, 2003). Based on the semTools function sem.auxiliary. Works in the same way as RSAmodel, differing only in the addition of the "aux" option for auxiliary variables.

Usage

RSAmodel.auxiliary(
  formula,
  data = NULL,
  aux = NULL,
  center = "none",
  scale = "none",
  na.rm = FALSE,
  out.rm = TRUE,
  breakline = FALSE,
  models = c("CUBIC", "STEP1"),
  user_model = NULL,
  verbose = TRUE,
  add = "",
  estimator = "MLR",
  se = "robust",
  missing = NA,
  control.variables = NULL,
  center.control.variables = FALSE,
  sampling.weights = NULL,
  group_name = NULL,
  cluster = NULL,
  ...
)

Arguments

`formula`	A formula in the form `z ~ x*y`, specifying the variable names used from the data frame, where z is the name of the response variable, and x and y are the names of the predictor variables.
`data`	A data frame containing the variables named in `formula`.
`aux`	`character`. Names of auxiliary variables to add to the model. Passed on internally to `sem.auxiliary` (semTools package).
`center`	Method for centering the predictor variables before the analysis. Default option ("none") applies no centering. "pooled" centers the predictor variables on their pooled sample mean, which preserves the commensurability of the predictor scales. "variablewise" centers the predictor variables on their respective sample mean. You should think carefully before applying the "variablewise" option, as centering the predictor variables at different values (e.g., their respective means) can affect the commensurability of the predictor scales.
`scale`	Method for scaling the predictor variables before the analysis. Default option ("none") applies no scaling. "pooled" scales the predictor variables on their pooled sample SD, which preserves the commensurability of the predictor scales. "variablewise" scales the predictor variables on their respective sample SD. You should think carefully before applying the "variablewise" option, as scaling the predictor variables at different values (e.g., their respective SDs) can affect the commensurability of the predictor scales.
`na.rm`	Should cases with missing values be removed before proceeding?
`out.rm`	Should outliers according to Bollen & Jackman (1980) criteria be excluded from the analyses? In large data sets this analysis is the speed bottleneck. If you are sure that no outliers exist, set this option to FALSE for speed improvements.
`breakline`	Should the breakline in the unconstrained absolute difference model be allowed? (The breakline is possible from the model formulation, but empirically rather unrealistic.) Defaults to `FALSE`.
`models`	A vector with the names of all models that should be computed. Can be any of c("CUBIC","FM1_ONLYX","FM2_ONLYY","FM3_ADDITIVE","FM4_INTER","FM5_QUADX","FM6_QUADY","FM7_CONG","FM8_INCONG","FM9_CURVCONGX","FM10_CURVCONGY","FM11_CURVINCONGX","FM12_CURVINCONGY","FM13_QUADXQUADY","FM14_ROTCONG","FM15_ROTINCONG","FM16_CUBICX","FM17_CUBICY","FM18_LEVDEPQUADX","FM19_LEVDEPQUADY","FM20_ASYMCONG","FM21_ASYMINCONG","FM22_LEVDEPCONG","FM23_LEVDEPINCONG","FM24_PARALLELASYM","FM25_NONPARALLELASYM","FM26_PARALLELASYMWEAK","FM27_PARALLELASYMSTRONG","FM28_NONPARALLELASYMWEAK","FM29_NONPARALLELASYMSTRONG","FM30_ASYMCONGROTY","FM31_ASYMCONGROTX","FM32_ASYMINCONGROTY","FM33_ASYMINCONGROTX","FM34_LEVDEPCONGROTY","FM35_LEVDEPCONGROTX","FM36_LEVDEPINCONGROTY","FM37_LEVDEPINCONGROTX"). With `models="STEP1"`, all polynomial families and the saturated cubic are computed (default); with `models="USER"`, all user-specified models defined in the list `user_model` are computed.
`user_model`	A list of user-specified polynomial models, defined by setting constraints on the polynomial parameters b1 to b9 using lavaan syntax, for example: "b1 == 2*b2". Only parametric constraint specifications are allowed.
`verbose`	Should additional information during the computation process be printed?
`add`	Additional syntax that is added to the lavaan model. Can contain, for example, additional constraints, like "p01 == 0; p11 == 0"
`estimator`	Type of estimator that should be used by lavaan. Defaults to "MLR", which provides robust standard errors and a robust scaled test statistic, and can handle missing values. If you want to reproduce standard OLS estimates, use `estimator="ML"` and `se="standard"`.
`se`	Type of standard errors. This parameter gets passed through to the `sem` function of the `lavaan` package. See options there. By default, robust SEs are computed. If you use `se="boot"`, `lavaan` provides CIs and p-values based on the bootstrapped standard error. If you use `confint(..., method="boot")`, in contrast, you get CIs and p-values based on percentile bootstrap.
`missing`	Handling of missing values (this parameter is passed to the `lavaan` `sem` function). By default (`missing=NA`), Full Information Maximum Likelihood (FIML) is employed in case of missing values. If cases with missing values should be excluded, use `missing = "listwise"`.
`control.variables`	A string vector with variable names from `data`. These variables are added as linear predictors to the model (in order "to control for them"). No interactions with the other variables are modeled.
`center.control.variables`	Should the control variables be centered before analyses? This can improve interpretability of the intercept, which will then reflect the predicted outcome value at the point (X,Y)=(0,0) when all control variables take their respective average values.
`sampling.weights`	Name of variable containing sampling weights. Needs to be added here (not in ...) to be included in the analysis dataset.
`group_name`	Name of variable defining groups, for multigroup modeling.
`cluster`	Name of variable for clusters, for cluster-level variance correction.
`...`	Additional parameters passed to the `sem.auxiliary` function.
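
The difference between the "pooled" and "variablewise" options of `center` and `scale` can be sketched in base R. This is an illustration only, not the package's internal code: it assumes that "pooled" means a single mean and SD computed across both predictors, and the simulated variables `x` and `y` are hypothetical.

```r
# Illustration only: base-R versions of "pooled" vs. "variablewise"
# centering and scaling (assumed definitions, simulated data).
set.seed(1)
x <- rnorm(200, mean = 3, sd = 1)
y <- rnorm(200, mean = 5, sd = 2)

# "pooled": one common mean and one common SD across both predictors
pooled_mean <- mean(c(x, y), na.rm = TRUE)
pooled_sd   <- sd(c(x, y), na.rm = TRUE)
x_pooled <- (x - pooled_mean) / pooled_sd
y_pooled <- (y - pooled_mean) / pooled_sd

# "variablewise": each predictor uses its own mean and SD
x_var <- (x - mean(x)) / sd(x)
y_var <- (y - mean(y)) / sd(y)

# Discrepancy scores before and after each transformation
diff_raw    <- x - y
diff_pooled <- x_pooled - y_pooled
diff_var    <- x_var - y_var

# Pooled transformation only rescales the raw discrepancy ...
print(all.equal(diff_pooled, diff_raw / pooled_sd))
# ... whereas the variablewise transformation changes its meaning
print(cor(diff_raw, diff_var))
```

Under the pooled option, x - y is preserved up to a constant factor, which is why the predictor scales stay commensurable; under the variablewise option, the discrepancy between the transformed predictors is no longer a simple rescaling of the raw discrepancy.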

Details

This function implements a comparative framework for identifying best-fitting RSA solutions (Núñez-Regueiro & Juhel, 2022, 2024). Step 1, the default ("STEP1"), compares 37 polynomial families predefined by parametric constraints (compared with 10 families in the RSA package) to identify likely candidates for the best-fitting solution. Step 2 probes variants within the retained best-fitting family by testing user-specified constraints ("USER") on lower-order polynomial terms that do not define the family. Step 3 can be conducted on the final variant by parametric bootstrapping, using bootstrapLavaan(RSA_object$models$name_final, FUN="coef"), or with cross-validation data.
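
The idea of comparing families defined by parametric constraints can be illustrated with ordinary least squares in base R. The sketch below is not how the package estimates models (it uses lavaan); it only shows how constraints on b1 to b9 produce nested models that can be compared. The mapping of b1 to b9 onto the terms x, y, x^2, xy, y^2, x^3, x^2y, xy^2, y^3 is an assumption, and the data are simulated.

```r
# Illustration only: nested polynomial models fitted by OLS on simulated data.
set.seed(42)
n <- 300
x <- rnorm(n)
y <- rnorm(n)
z <- 0.5 * x + 0.5 * y + rnorm(n, sd = 0.5)  # additive, equal-slope truth
dat <- data.frame(x, y, z)

# Saturated cubic: all nine polynomial parameters are free
# (assumed order: x, y, x^2, xy, y^2, x^3, x^2*y, x*y^2, y^3)
cubic <- lm(z ~ x + y + I(x^2) + I(x * y) + I(y^2) +
              I(x^3) + I(x^2 * y) + I(x * y^2) + I(y^3), data = dat)

# Constraint set 1: all second- and third-order parameters fixed to 0
additive <- lm(z ~ x + y, data = dat)

# Constraint set 2: additionally b1 == b2, imposed through a sum predictor
equal_slopes <- lm(z ~ I(x + y), data = dat)

# Compare the nested models from most to least constrained
print(anova(equal_slopes, additive, cubic))
print(AIC(equal_slopes, additive, cubic))
```

Each added constraint removes free parameters, so the residual sum of squares can only stay equal or increase; the comparison asks whether that loss of fit is small enough to prefer the more parsimonious model.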

Value

A list of objects containing the polynomial models and the names of the variables.
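
A minimal usage sketch, built only from the arguments listed under Usage. The data are simulated, the auxiliary variable `a1` is hypothetical, and the call is skipped when RSAtools is not installed. Access to the fitted models through `$models` is inferred from the bootstrapLavaan example in Details and should be treated as an assumption.

```r
# Sketch only: simulated data with missing values on the outcome and
# one hypothetical auxiliary variable (a1) correlated with x.
set.seed(123)
n  <- 250
x  <- rnorm(n)
y  <- rnorm(n)
a1 <- 0.6 * x + rnorm(n, sd = 0.8)
z  <- 0.4 * x + 0.4 * y - 0.2 * x * y + rnorm(n)
dat <- data.frame(x, y, z, a1)
dat$z[sample(n, 30)] <- NA  # 30 cases with a missing outcome

fit <- NULL
if (requireNamespace("RSAtools", quietly = TRUE)) {
  # Keep the default missing = NA so that FIML is used; a1 enters as auxiliary.
  fit <- tryCatch(
    RSAtools::RSAmodel.auxiliary(
      z ~ x * y,
      data    = dat,
      aux     = "a1",
      models  = c("CUBIC", "FM3_ADDITIVE"),
      out.rm  = FALSE,
      verbose = FALSE
    ),
    error = function(e) {
      message("Model estimation failed: ", conditionMessage(e))
      NULL
    }
  )
  # Assumed structure: fitted models are stored in the `models` element
  if (!is.null(fit)) print(names(fit$models))
}
```

Setting `out.rm = FALSE` here only keeps the example fast; whether outlier removal is appropriate depends on the data at hand.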

References

Graham, J. W. (2003). Adding missing-data-relevant variables to FIML-based structural equation models. Structural Equation Modeling, 10(1), 80-100. DOI:10.1207/S15328007SEM1001_4

Núñez-Regueiro, F., & Juhel, J. (2022). Model-Building Strategies in Response Surface Analysis. Manuscript submitted for publication.

Núñez-Regueiro, F., & Juhel, J. (2024). Response Surface Analysis for the Social Sciences I: Identifying Best-Fitting Polynomial Solutions. Manuscript submitted for publication.