simFA {fungible} | R Documentation |
Generate Factor Analysis Models and Data Sets for Simulation Studies
Description
A function to simulate factor loadings matrices and Monte Carlo
data sets for common factor models, bifactor models, and IRT
models.
Usage
simFA(
Model = list(),
Loadings = list(),
CrossLoadings = list(),
Phi = list(),
ModelError = list(),
Bifactor = list(),
MonteCarlo = list(),
FactorScores = list(),
Missing = list(),
Control = list(),
Seed = NULL
)
Arguments
Model |
(list)
|
Loadings |
(list)
-
FacPattern (NULL or matrix).
-
FacPattern = M where M is
a user-defined factor pattern matrix.
-
FacPattern = NULL ; simFA
will generate a factor pattern based on
the arguments specified under other keywords
(e.g., Model , CrossLoadings , etc.);
defaults to FacPattern = NULL .
-
FacLoadDist (character) Specifies the
sampling distribution for the common factor loadings.
Possible values are "runif" , "rnorm" ,
"sequential" , and "fixed" ; defaults
to FacLoadDist = "runif" .
-
FacLoadRange (vector of length NFac ,
2, or 1); defaults to FacLoadRange = c(.3, .7) .
If FacLoadDist = "runif" the vector
defines the bounds of the uniform distribution.
If FacLoadDist = "rnorm" the vector
defines the mean and standard deviation of
the normal distribution from which loadings
are sampled.
If FacLoadDist = "sequential" the
vector specifies the lower and upper bound
of the loadings sequence.
If FacLoadDist = "fixed" and
FacLoadRange is a vector of length 1
then all common loadings will equal the constant
specified in FacLoadRange . If
FacLoadDist = "fixed" and
FacLoadRange is a vector of length
NFac then each factor will have fixed
loadings as specified by the associated
element in FacLoadRange .
-
h2 (vector) An optional vector of communalities
used to constrain the population communalities to
user-defined values; defaults to h2 = NULL .
|
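The four FacLoadDist options can be pictured with a short base R sketch. This is not simFA's internal code: draw_loadings is a hypothetical helper, and the assumption that "sequential" means an evenly spaced sequence running from the lower to the upper bound is mine.

```r
# Hypothetical helper (not part of fungible): draw n primary loadings for
# one factor under each documented FacLoadDist option.
draw_loadings <- function(n, dist = "runif", range = c(.3, .7)) {
  switch(dist,
    runif      = runif(n, min = range[1], max = range[2]),       # bounds
    rnorm      = rnorm(n, mean = range[1], sd = range[2]),       # mean, SD
    sequential = seq(range[1], range[2], length.out = n),        # assumed even spacing
    fixed      = rep(range[1], n),                               # one constant
    stop("unknown distribution: ", dist)
  )
}

set.seed(1)
lam_unif <- draw_loadings(5, "runif")
lam_seq  <- draw_loadings(5, "sequential")
lam_fix  <- draw_loadings(5, "fixed", range = .5)
```

Note that the same two-element vector is read differently by each option: as bounds for "runif" and "sequential", but as a mean and standard deviation for "rnorm".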
CrossLoadings |
(list)
-
ProbCrossLoad (scalar) A value in the [0,1]
interval that determines the probability that a cross
loading will be present in elements of the loadings
matrix that do not have salient (primary) factor loadings.
If set to ProbCrossLoad = 1 , a single cross
loading will be added to each factor; defaults to
ProbCrossLoad = 0 .
-
CrossLoadRange (vector of length 2) Controls
size of the cross loadings; defaults to
CrossLoadRange = c(.20, .25) .
-
CrossLoadPositions (matrix) Specifies the
row and column positions of (optional) cross loadings;
defaults to CrossLoadPositions = NULL .
-
CrossLoadValues (vector) If
CrossLoadPositions is specified then
CrossLoadValues is a vector of user-supplied
cross-loadings; defaults to CrossLoadValues = NULL .
-
CrudFactor (scalar) Controls the size of
tertiary factor loadings. If CrudFactor != 0
then elements of the loadings matrix with neither
primary nor secondary (i.e., cross) loadings will
be sampled from a [-CrudFactor, CrudFactor]
uniform distribution; defaults to CrudFactor = 0 .
|
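The three tiers of loadings described above (primary, cross, and tertiary "crud" loadings) can be sketched in base R as follows. The loading values are illustrative, and the one-row-per-(row, column)-pair layout of the position matrix is an assumption, not a documented requirement of CrossLoadPositions.

```r
set.seed(1)

# Simple-structure pattern: 6 items, 2 factors, primary loadings of .6
Lambda <- matrix(0, nrow = 6, ncol = 2)
Lambda[1:3, 1] <- .6
Lambda[4:6, 2] <- .6

# One user-placed cross loading at row 1, column 2
# (assumed layout: each row of `pos` is a (row, column) pair)
pos <- matrix(c(1, 2), ncol = 2)
Lambda[pos] <- .22

# Fill the cells that have neither a primary nor a cross loading with
# small values drawn uniformly from [-crud, crud]
crud  <- .10
empty <- Lambda == 0
Lambda[empty] <- runif(sum(empty), min = -crud, max = crud)

round(Lambda, 2)
```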
Phi |
(list)
-
MaxAbsPhi (scalar) Upper (absolute) bound
on factor correlations; defaults to
MaxAbsPhi = .5 .
-
EigenValPower (scalar) Controls the skewness
of the eigenvalues of Phi. Larger values of
EigenValPower result in a Phi spectrum that
is more right-skewed (and thus closer to a
unidimensional model); defaults to
EigenValPower = 2 .
-
PhiType (character); defaults to
PhiType = "free" .
If PhiType = "free" factor correlations
will be randomly generated under the constraints
of MaxAbsPhi and EigenValPower .
If PhiType = "fixed" all factor
correlations will equal the value specified
in MaxAbsPhi . A fatal error will be
produced if Phi is not positive
semidefinite.
If PhiType = "user" the factor
correlations are defined by the matrix
specified in UserPhi (see below).
-
UserPhi (matrix) A positive semidefinite
(PSD) matrix of user-defined factor correlations;
defaults to UserPhi = NULL .
|
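To see why PhiType = "fixed" can fail, consider a base R sketch that builds an equicorrelated Phi and checks its eigenvalues. The helpers fixed_phi and is_psd are hypothetical and are not fungible functions.

```r
# Hypothetical helper: factor correlation matrix with every off-diagonal
# element equal to r
fixed_phi <- function(nfac, r) {
  Phi <- matrix(r, nfac, nfac)
  diag(Phi) <- 1
  Phi
}

# Hypothetical helper: positive semidefinite if no eigenvalue is negative
# (up to a small numerical tolerance)
is_psd <- function(M, tol = 1e-8) {
  all(eigen(M, symmetric = TRUE, only.values = TRUE)$values >= -tol)
}

ok  <- is_psd(fixed_phi(3,  .5))  # eigenvalues 2, .5, .5
bad <- is_psd(fixed_phi(3, -.6))  # smallest eigenvalue is 1 + 2 * (-.6) = -.2
```

With three factors, a common correlation below -.5 cannot yield a valid correlation matrix, which is the kind of input the fatal error guards against.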
ModelError |
(list)
-
ModelError (logical) If ModelError = TRUE
model error will be introduced into the factor
pattern via the method described by Tucker, Koopman,
and Linn (TKL, 1969); defaults to
ModelError = FALSE .
-
W (matrix) An optional user-supplied factor
loading matrix for the NMinorFac minor common
factors; defaults to W = NULL .
-
NMinorFac (scalar) Number of minor factors
in the TKL model; defaults to NMinorFac = 150 .
-
ModelErrorType (character) If
ModelErrorType = "U" then ModelErrorVar
is the proportion of uniqueness variance that is due
to model error. If ModelErrorType = "V" then
ModelErrorVar is the proportion of total
variance that is due to model error; defaults to
ModelErrorType = "U" .
-
ModelErrorVar (scalar in [0,1]) The proportion
of uniqueness (U) or total (V) variance that is due
to model error; defaults to
ModelErrorVar = .10 .
-
epsTKL (scalar in [0,1]) Controls the size
of the factor loadings in successive minor factors;
defaults to epsTKL = .20 .
-
Wattempts (scalar > 0) Maximum number of
tries when attempting to generate a suitable W
matrix. Default = 10000.
-
WmaxLoading (scalar > 0) Threshold value for
NWmaxLoading . Default WmaxLoading = .30 .
-
NWmaxLoading (scalar >= 0) Maximum number
of absolute loadings >= WmaxLoading in any
column of W (matrix of model approximation error
factor loadings). Default NWmaxLoading = 2 .
Under the defaults, no column of W will have 3 or
more loadings with absolute value >= .30.
-
PrintW (logical) If PrintW = TRUE
then simFA will print the attempt history when
searching for a suitable W matrix given the
constraints defined in WmaxLoading and
NWmaxLoading . Default PrintW = FALSE .
-
RSpecific (matrix) Optional correlation
matrix for specific factors;
defaults to RSpecific = NULL .
|
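The difference between ModelErrorType = "U" and ModelErrorType = "V" is easiest to see with a little arithmetic. The sketch below assumes standardized variables (total variance of 1) and uses illustrative communalities; it shows only the variance bookkeeping, not the TKL algorithm itself.

```r
h2            <- c(.25, .50, .75)  # illustrative communalities
ModelErrorVar <- .10
uniqueness    <- 1 - h2            # assumes each variable has total variance 1

# "U": model error is a share of each variable's uniqueness variance
me_var_U <- ModelErrorVar * uniqueness

# "V": model error is a share of each variable's total variance
me_var_V <- rep(ModelErrorVar, length(h2))

rbind(U = me_var_U, V = me_var_V)
```

Under "U" the amount of model error variance shrinks as communality grows, whereas under "V" it is the same for every variable.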
Bifactor |
(list)
-
Bifactor (logical) If Bifactor = TRUE
parameters for the bifactor model will be generated;
defaults to Bifactor = FALSE .
-
Hierarchical (logical) If Hierarchical = TRUE
then a hierarchical Schmid-Leiman (1957) bifactor
model will be generated;
defaults to Hierarchical = FALSE .
-
F1FactorDist (character) Specifies the
sampling distribution for the general factor loadings.
Possible values are "runif" , "rnorm" ,
"sequential" , and "fixed" ; defaults
to F1FactorDist = "sequential" .
-
F1FactorRange (vector of length 1 or 2)
Controls the sizes of the general factor loadings in
non-hierarchical bifactor models; defaults to
F1FactorRange = c(.4, .7) .
If F1FactorDist = "runif" , the vector
of length 2 defines the bounds of the uniform
distribution, c(lower, upper).
If F1FactorDist = "rnorm" , the
vector defines the mean and standard
deviation of the normal distribution from
which loadings are sampled, c(MN, SD).
If F1FactorDist = "sequential" ,
the vector specifies the lower and upper
bound of the loadings sequence, c(lower, upper).
|
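For readers unfamiliar with the hierarchical case, here is a minimal base R sketch of one standard way to write the Schmid-Leiman transformation for a model with a single second-order factor. The loading values are illustrative, and whether simFA parameterizes the hierarchical model in exactly this way is not stated on this page.

```r
# First-order pattern: 6 items, 2 correlated group factors
Lambda <- matrix(0, nrow = 6, ncol = 2)
Lambda[1:3, 1] <- .7
Lambda[4:6, 2] <- .6

# Illustrative loadings of the two first-order factors on one
# second-order (general) factor
gamma <- c(.8, .6)

general <- Lambda %*% gamma                    # general-factor loadings
group   <- Lambda %*% diag(sqrt(1 - gamma^2))  # residualized group loadings
SL      <- cbind(general, group)

h2_first <- rowSums(Lambda^2)  # communalities, first-order form
h2_sl    <- rowSums(SL^2)      # communalities, orthogonal bifactor form
```

In this example the transformation leaves each item's communality unchanged; it only redistributes that common variance between the general factor and the group factors.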
MonteCarlo |
(list)
-
NSamples (integer) Defines number of Monte
Carlo Samples; defaults to NSamples = 0 .
-
SampleSize (integer) Sample size for each
Monte Carlo sample; defaults to SampleSize = 250 .
-
Raw (logical) If Raw = TRUE , simulated
data sets will contain raw data. If Raw = FALSE ,
simulated data sets will contain correlation matrices;
defaults to Raw = FALSE .
-
Thresholds (list) List elements contain
thresholds for each item. Thresholds are required
when generating Likert variables.
|
FactorScores |
(list)
-
FS (logical) If FS = TRUE (true)
factor scores will be simulated; defaults to
FS = FALSE .
-
CFSeed (integer) Optional starting seed for
the common factor scores; defaults to
CFSeed = NULL in which case a random seed is
used.
-
MCFSeed (integer) Optional starting seed
for the minor common factor scores; defaults to
MCFSeed = NULL .
-
SFSeed (integer) Optional starting seed
for the specific factor scores; defaults to
SFSeed = NULL in which case a random seed is
used.
-
EFSeed (integer) Optional starting seed
for the error factor scores; defaults to
EFSeed = NULL in which case a random seed
is used. Note that CFSeed , MCFSeed ,
SFSeed , and EFSeed must be different
numbers (a fatal error is produced when two or more
seeds are specified as equal).
-
VarRel (vector) A vector of manifest variable
reliabilities. The specific factor variance for
variable i will equal VarRel[i] - h^2[i]
(the manifest variable reliability minus its
communality). By default, VarRel = h^2
(resulting in uniformly zero specific factor
variances).
-
Population (logical) If Population =
TRUE , factor scores will fit the correlational
constraints of the factor model exactly (e.g., the
common factors will be orthogonal to the unique
factors); defaults to Population = FALSE .
-
NFacScores (scalar) Sample size for the
factor scores; defaults to NFacScores = 250 .
-
Thresholds (list) A list of quantiles used
to polychotomize the observed data that will be
generated from the factor scores.
|
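Two of the FactorScores options lend themselves to a quick base R illustration: the specific factor variance implied by VarRel, and the use of thresholds to turn continuous scores into ordered categories. The numbers are illustrative, and coding the categories as 1, 2, ... is an assumption; this page does not say how simFA labels them.

```r
# Specific factor variance = reliability minus communality
h2           <- c(.40, .50, .60)
VarRel       <- c(.70, .80, .60)
specific_var <- VarRel - h2      # third variable has VarRel = h2, so zero

# Polychotomize a continuous score with three thresholds (four categories)
set.seed(1)
x      <- rnorm(1000)
thr    <- c(-1, 0, 1)
likert <- findInterval(x, thr) + 1  # assumed coding: categories 1 to 4

table(likert)
```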
Missing |
(list)
-
Missing (logical) If Missing = TRUE all
data sets will contain missing values; defaults to
Missing = FALSE .
-
Mechanism (character) Specifies the missing
data mechanism. Currently, the program only supports
missing completely at random (MCAR):
Mechanism = "MCAR" .
-
MSProb (scalar or vector of length
NVar ) Specifies the probability of
missingness for each variable; defaults to
MSProb = 0 .
|
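Missing completely at random means that each value is deleted with a probability that does not depend on the data. The base R sketch below imposes MCAR missingness with a separate probability per variable; it is an illustration of the mechanism, not simFA's own routine.

```r
set.seed(1)
X      <- matrix(rnorm(500 * 4), nrow = 500, ncol = 4)
MSProb <- c(0, .10, .20, .30)  # one missingness probability per variable

# Independent uniform draw for every cell, compared with that column's
# probability (rep(..., each = nrow) lines up with column-major storage)
miss <- matrix(runif(length(X)), nrow(X), ncol(X)) <
  rep(MSProb, each = nrow(X))
X[miss] <- NA

round(colMeans(is.na(X)), 2)  # observed proportions of missing values
```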
Control |
(list)
-
IRT (logical) If IRT = TRUE then
user-supplied thresholds will be interpreted as
item intercepts; defaults to IRT = FALSE .
-
Dparam (scalar). If Dparam = 1 then item
intercepts should be scaled in the logistic metric.
If Dparam = 1.702 then intercepts should be
scaled in the probit metric.
-
Maxh2 (scalar) Rows of the loadings matrix
will be rescaled to have a maximum communality of
Maxh2 ; defaults to Maxh2 = .98 .
-
Reflect (logical) If Reflect =
TRUE loadings on the common factors will be
randomly reflected; defaults to
Reflect = FALSE .
|
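The value 1.702 in Dparam is the usual scaling constant that brings the logistic curve close to the normal ogive. A short base R check makes the point; it uses only plogis and pnorm and does not call simFA.

```r
z <- seq(-4, 4, by = .01)

# Largest gap between the logistic and normal response curves,
# with and without the 1.702 scaling constant
max_gap_scaled   <- max(abs(plogis(1.702 * z) - pnorm(z)))
max_gap_unscaled <- max(abs(plogis(z) - pnorm(z)))

c(scaled = max_gap_scaled, unscaled = max_gap_unscaled)
```

With the constant applied, the two curves differ by less than .01 everywhere on this grid; without it, the gap exceeds .10.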
Seed |
(integer) Starting seed for the random number
generator; defaults to Seed = NULL . When no seed
is specified by the user, the program will generate a random
seed.
|
Details
For a complete description of simFA's
capabilities, users are encouraged to consult the simFABook
at http://users.cla.umn.edu/~nwaller/simFA/simFABook.pdf.
simFA is a program for exploring factor analysis
models via simulation studies.
After calling simFA, all relevant output can be saved
for further processing by calling one or more of the following
object names.
Value
-
loadings
A common factor or bifactor
loadings matrix.
-
Phi
A factor correlation matrix.
-
urloadings
The unrotated loadings matrix.
-
h2
A vector of item communalities.
-
h2PopME
A vector of item communalities that
may include model approximation error.
-
Rpop
The model-implied population correlation
matrix.
-
RpopME
The model-implied population
correlation matrix with model error.
-
W
The factor loadings for the minor factors
(when ModelError = TRUE). Default = NULL.
-
Xm
That part of the observed scores that
is due to the minor common factors.
-
SFSvars
Variances of the Specific Factors
in the metric of the observed scores.
-
ModelErrorFitStats
A list of model fit
indices (for the underlying equations, see: Bentler,
1990; Hu & Bentler, 1999; Marsh, Hau, & Grayson,
2005; Steiger, 2016):
-
SRMR_theta
Standardized Root Mean
Square Residual based on the model that is
implied by the error free major factors
only (underlying Rpop),
-
SRMR_thetahat
Standardized Root
Mean Square Residual based on an exploratory
factor analysis of the population
correlation matrix, RpopME,
-
CRMR_theta
Correlation Root Mean
Square Residual based on the model that is
implied by the error free major factors
only (underlying Rpop),
-
CRMR_thetahat
Correlation Root Mean
Square Residual based on an exploratory factor
analysis of the population correlation matrix,
RpopME,
-
RMSEA_theta
Root Mean Square Error
of Approximation (Steiger, 2016) based on the
model that is implied by the error free major
factors only (underlying Rpop),
-
RMSEA_thetahat
Root Mean Square
Error of Approximation (Steiger, 2016) based
on an exploratory factor analysis of the
population correlation matrix, RpopME,
-
CFI_theta
Comparative Fit Index
(Bentler, 1990) based on the model that is
implied by the error free major factors
only (underlying Rpop),
-
CFI_thetahat
Comparative Fit Index
(Bentler, 1990) based on an exploratory
factor analysis of the population
correlation matrix, RpopME.
-
Fm
MLE fit function for population
target model.
-
Fb
MLE fit function for population
baseline model.
-
DFm
Degrees of freedom for
population target model.
-
CovMatrices
A list containing:
-
CovMajor
The model implied
covariances from the major factors.
-
CovMinor
The model implied
covariances from the minor factors.
-
CovUnique
The model implied
variances from the uniqueness factors.
-
Bifactor
A list containing:
-
Scores
A list containing:
-
FactorScores
Factor scores for the
common and uniqueness factors.
-
FacInd
Factor indeterminacy indices
for the error free population model.
-
FacIndME
Factor score indeterminacy
indices for the population model with model
error.
-
ObservedScores
A matrix of model-implied ObservedScores. If
Thresholds were supplied under keyword FactorScores,
ObservedScores will be transformed
into Likert scores.
-
Monte
A list containing output from the
Monte Carlo simulations if generated.
-
IRT
Factor loadings expressed in the normal
ogive IRT metric. If Thresholds were given
then IRT difficulty values will also be returned.
-
Seed
The initial seed for the random
number generator.
-
call
A copy of the function call.
-
cn
A list of all active and nonactive
function arguments.
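As a rough guide to how loadings, Phi, h2, and Rpop relate to one another in the error-free common factor model, the following base R sketch rebuilds the communalities and the model-implied correlation matrix from an illustrative pattern matrix and factor correlation matrix. It assumes standardized variables and does not call simFA.

```r
# Illustrative pattern matrix (6 items, 2 factors) and factor correlations
Lambda <- matrix(0, nrow = 6, ncol = 2)
Lambda[1:3, 1] <- .7
Lambda[4:6, 2] <- .6
Phi <- matrix(c(1, .3,
                .3, 1), nrow = 2, byrow = TRUE)

common <- Lambda %*% Phi %*% t(Lambda)  # common-factor part of the correlations
h2     <- diag(common)                  # communalities

Rpop <- common
diag(Rpop) <- 1                         # uniquenesses 1 - h2 complete the diagonal

round(Rpop, 3)
```

Here the correlation between item 1 and item 4 is .7 * .3 * .6 = .126, since the two items load on different factors that correlate .3.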
Author(s)
Niels G. Waller with contributions by Hoang V. Nguyen
References
Bentler, P. M. (1990). Comparative fit indexes in structural
models. Psychological Bulletin, 107(2), 238–246.
Hu, L.-T. & Bentler, P. M. (1999). Cutoff criteria for fit
indexes in covariance structure analysis: Conventional criteria
versus new alternatives. Structural Equation Modeling:
A Multidisciplinary Journal, 6(1), 1–55.
Marsh, H. W., Hau, K.-T., & Grayson, D. (2005). Goodness of fit
in structural equation models. In A. Maydeu-Olivares & J. J.
McArdle (Eds.), Multivariate applications book series.
Contemporary psychometrics: A festschrift for Roderick P.
McDonald (pp. 275–340). Lawrence Erlbaum Associates Publishers.
Schmid, J. and Leiman, J. M. (1957). The development of hierarchical
factor solutions. Psychometrika, 22(1), 53–61.
Steiger, J. H. (2016). Notes on the Steiger–Lind (1980) handout.
Structural Equation Modeling: A Multidisciplinary Journal, 23(6),
777–781.
Tucker, L. R., Koopman, R. F., and Linn, R. L. (1969). Evaluation
of factor analytic research procedures by means of simulated
correlation matrices. Psychometrika, 34(4), 421–459.
Examples
## Not run:
# Ex 1. Three Factor Simple Structure Model with Cross loadings and
# Ideal Non salient Loadings
out <- simFA(Seed = 1)
print( round( out$loadings, 2 ) )
# Ex 2. Non Hierarchical bifactor model 3 group factors
# with constant loadings on the general factor
out <- simFA(Bifactor = list(Bifactor = TRUE,
                             Hierarchical = FALSE,
                             F1FactorRange = c(.4, .4),
                             F1FactorDist = "runif"),
             Seed = 1)
print( round( out$loadings, 2 ) )
# Ex 3. Model Fit Statistics for Population Data with
# Model Approximation Error. Three Factor model.
out <- simFA(Loadings = list(FacLoadDist = "fixed",
                             FacLoadRange = .5),
             ModelError = list(ModelError = TRUE,
                               NMinorFac = 150,
                               ModelErrorType = "V",
                               ModelErrorVar = .1,
                               Wattempts = 10000,
                               epsTKL = .2),
             Seed = 1)
print( out$loadings )
print( out$ModelErrorFitStats[seq(2,8,2)] )
## End(Not run)
[Package fungible version 2.4.4 Index]