EFA_AVERAGE {EFAtools} | R Documentation |
Model averaging across different EFA methods and types
Description
Not all EFA procedures always arrive at the same solution. This function allows
you perform a number of EFAs from different methods (e.g., Maximum Likelihood
and Principal Axis Factoring), with different implementations (e.g., the SPSS
and psych implementations of Principal Axis Factoring), and across different
rotations of the same type (e.g., multiple oblique rotations, like promax and
oblimin). EFA_AVERAGE will then run all these EFAs (using the EFA
function) and provide a summary across the different solutions.
Usage
EFA_AVERAGE(
x,
n_factors,
N = NA,
method = "PAF",
rotation = "promax",
type = "none",
averaging = c("mean", "median"),
trim = 0,
salience_threshold = 0.3,
max_iter = 10000,
init_comm = c("smc", "mac", "unity"),
criterion = c(0.001),
criterion_type = c("sum", "max_individual"),
abs_eigen = c(TRUE),
varimax_type = c("svd", "kaiser"),
normalize = TRUE,
k_promax = 2:4,
k_simplimax = ncol(x),
P_type = c("norm", "unnorm"),
precision = 1e-05,
start_method = c("psych", "factanal"),
use = c("pairwise.complete.obs", "all.obs", "complete.obs", "everything",
"na.or.complete"),
cor_method = c("pearson", "spearman", "kendall"),
show_progress = TRUE
)
Arguments
x |
data.frame or matrix. Dataframe or matrix of raw data or matrix with correlations. If raw data is entered, the correlation matrix is found from the data. |
n_factors |
numeric. Number of factors to extract. |
N |
numeric. The number of observations. Needs only be specified if a
correlation matrix is used. If input is a correlation matrix and |
method |
character vector. Any combination of "PAF", "ML", and "ULS", to use principal axis factoring, maximum likelihood, or unweighted least squares (also called minres), respectively, to fit the EFAs. Default is "PAF". |
rotation |
character vector. Either perform no rotation ("none"), any combination of orthogonal rotations ("varimax", "equamax", "quartimax", "geominT", "bentlerT", and "bifactorT"; using "orthogonal" runs all of these), or of oblique rotations ("promax", "oblimin", "quartimin", "simplimax", "bentlerQ", "geominQ", and "bifactorQ"; using "oblique" runs all of these). Rotation types (no rotation, orthogonal rotations, and oblique rotations) cannot be mixed. Default is "promax". |
type |
character vector. Any combination of "none" (default), "EFAtools",
"psych", and "SPSS" can be entered. "none" allows the specification of various
combinations of the arguments controlling both factor extraction methods and
the rotations. The others ("EFAtools", "psych", and "SPSS"), control the execution
of the respective factor extraction method and rotation to be in line with how
it is executed in this package (i.e., the respective default procedure), in the
psych package, and in SPSS. A specific psych implementation exists for PAF, ML, varimax,
and promax. The SPSS implementation exists for PAF, varimax, and promax. For
details, see |
averaging |
character. One of "mean" (default), and "median". Controls whether the different results should be averaged using the (trimmed) mean, or the median. |
trim |
numeric. If averaging is set to "mean", this argument controls
the trimming of extremes (for details see |
salience_threshold |
numeric. The threshold to use to classify a pattern coefficient or loading as salient (i.e., substantial enough to assign it to a factor). Default is 0.3. Indicator-to-factor correspondences will be inferred based on this threshold. Note that this may not be meaningful if rotation = "none" and n_factors > 1 are used, as no simple structure is present there. |
max_iter |
numeric. The maximum number of iterations to perform after which the iterative PAF procedure is halted with a warning. Default is 10,000. Note that non-converged procedures are excluded from the averaging procedure. |
init_comm |
character vector. Any combination of "smc", "mac", and "unity".
Controls the methods to estimate the initial communalities in |
criterion |
numeric vector. The convergence criterion used for PAF if
"none" is among the specified types.
If the change in communalities from one iteration to the next is smaller than
this criterion the solution is accepted and the procedure ends.
Default is |
criterion_type |
character vector. Any combination of "max_individual" and
"sum". Type of convergence criterion used for PAF if "none" is among the
specified types. "max_individual" selects the maximum change in any of the
communalities from one iteration to the next and tests it against the
specified criterion. "sum" takes the difference of
the sum of all communalities in one iteration and the sum of all communalities
in the next iteration and tests this against the criterion
(for details see |
abs_eigen |
logical vector. Any combination of TRUE and FALSE.
Which algorithm to use in the PAF iterations if "none" is among the specified
types. If FALSE, the loadings are computed from the eigenvalues. This is also
used by the |
varimax_type |
character vector. Any combination of "svd" and "kaiser".
The type of the varimax rotation performed if "none" is among the specified
types and "varimax", "promax", "orthogonal", or "oblique" is among the specified
rotations. "svd" uses singular value decomposition, as
stats::varimax does, and "kaiser" uses the varimax
procedure performed in SPSS. This is the original procedure from Kaiser (1958),
but with slight alterations in the varimax criterion (for details, see
|
normalize |
logical vector. Any combination of TRUE and FALSE.
|
k_promax |
numeric vector. The power used for computing the target matrix
P in the promax rotation if "none" is among the specified types and "promax"
or "oblique" is among the specified rotations. Default is |
k_simplimax |
numeric. The number of 'close to zero loadings' for the
simplimax rotation (see |
P_type |
character vector. Any combination of "norm" and "unnorm".
This specifies how the target matrix P is computed in promax rotation if
"none" is among the specified types and "promax" or "oblique" is among the
specified rotations. "unnorm" will use the unnormalized target matrix as
originally done in Hendrickson and White (1964). "norm" will use a
normalized target matrix (for details see |
precision |
numeric vector. The tolerance for stopping in the rotation procedure(s). Default is 10^-5. |
start_method |
character vector. Any combination of "psych" and "factanal".
How to specify the starting values for the optimization procedure for ML.
"psych" takes the starting values specified in psych::fa.
"factanal" takes the starting values specified in the
stats::factanal function. Default is
|
use |
character. Passed to |
cor_method |
character. Passed to |
show_progress |
logical. Whether a progress bar should be shown in the console. Default is TRUE. |
Details
As a first step in this function, a grid is produced containing the setting
combinations for the to-be-performed EFAs. These settings are then entered as
arguments to the EFA
function and the EFAs are run in a second
step. After all EFAs are run, the factor solutions are averaged and their
variability determined in a third step.
The grid containing the setting combinations is produced based on the entries
to the respective arguments. To this end, all possible combinations resulting
in unique EFA models are considered. That is, if, for example, the type
argument was set to c("none", "SPSS")
and one combination of the specific
settings entered was identical to the SPSS combination, this combination
would be included in the grid and run only once. We include here a list
of arguments that are only evaluated under specific conditions:
The arguments init_comm
, criterion
, criterion_type
,
abs_eigen
are only evaluated if "PAF" is included in method
and "none" is included in type
.
The argument varimax_type
is only evaluated if "varimax", "promax",
"oblique", or "orthogonal" is included in rotation
and "none" is
included in type
.
The argument normalize
is only evaluated if rotation
is not
set to "none" and "none" is included in type
.
The argument k_simplimax
is only evaluated if "simplimax" or "oblique"
is included in rotation
.
The arguments k_promax
and P_type
are only evaluated if
"promax" or "oblique" is included in rotation
and "none" is included
in type
.
The argument start_method
is only evaluated if "ML" is included in
method
.
To avoid a bias in the averaged factor solutions from problematic solutions,
these are excluded prior to averaging. A solution is deemed problematic if
at least one of the following is true: an error occurred, the model did not
converge, or there is at least one Heywood case (defined as a loading or communality of >= .998).
Information on errors, convergence, and Heywood cases are returned in the
implementations_grid and a summary of these is given when printing the output.
In addition to these, information on the admissibility of the factor solutions
is also included. A solution was deemed admissible if (1) no error occurred,
(2) the model converged, (3) no Heywood cases are present, and (4) there are
at least two salient loadings (i.e., loadings exceeding the specified
salience_threshold
) for each factor. So, solutions failing one of the
first three of these criteria of admissibility are also deemed problematic and
therefore excluded from averaging. However, solutions failing only
the fourth criterion of admissibility are still included for averaging.
Finally, if all solutions are problematic (e.g., all solutions contain
Heywood cases), no averaging is performed and the respective outputs are NA.
In this case, the implementations_grid should be inspected to see if there
are any error messages, and the separate EFA solutions that are also included
in the output can be inspected as well, for example, to see where Heywood
cases occurred.
A core output of this function includes the average, minimum, and maximum loadings derived from all non-problematic (see above) factor solutions. Please note that these are not entire solutions, but the matrices include the average, minimum, or maximum value for each cell (i.e., each loading separately). This means that, for example, the matrix with the minimum loadings will contain the minimum value in any of the factor solutions for each specific loading, and therefore most likely contains loadings from different factor solutions. The matrices containing the minimum and maximum factor solutions can therefore not be interpreted as whole factor solutions.
The output also includes information on the average, minimum, maximum, and
variability of the fit indices across the non-problematic factor solutions.
It is important to note that not all fit indices are computed for all fit
methods: For ML and ULS, all fit indices can be computed, while for PAF, only
the common part accounted for (CAF) index (Lorenzo-Seva, Timmerman, & Kiers, 2011)
can be computed. As a consequence, if only "PAF" is included in the
method
argument, averaging can only be performed for the CAF, and the
other fit indices are NA. If a combination of "PAF" and "ML" and/or "ULS" are
included in the method
argument, the CAF is averaged across all non-
problematic factor solutions, while all other fit indices are only averaged
across the ML and ULS solutions. The user should therefore keep in mind that
the number of EFAs across which the fit indices are averaged can diverge for
the CAF compared to all other fit indices.
Value
A list of class EFA_AVERAGE containing
orig_R |
Original correlation matrix. |
h2 |
A list with the average, standard deviation, minimum, maximum, and range of the final communality estimates across the factor solutions. |
loadings |
A list with the average, standard deviation, minimum, maximum, and range of the final loadings across the factor solutions. If rotation was "none", the unrotated loadings, otherwise the rotated loadings (pattern coefficients). |
Phi |
A list with the average, standard deviation, minimum, maximum, and range of the factor intercorrelations across factor solutions obtained with oblique rotations. |
ind_fac_corres |
A matrix with each cell containing the proportion of the factor solutions in which the respective indicator-to-factor correspondence occurred, i.e., in which the loading exceeded the specified salience threshold. Note: Rowsums can exceed 1 due to cross-loadings. |
vars_accounted |
A list with the average, standard deviation, minimum, maximum, and range of explained variances and sums of squared loadings across the factor solutions. Based on the unrotated loadings. |
fit_indices |
A matrix containing the average, standard deviation, minimum, maximum, and range for all applicable fit indices across the respective factor solutions, and the degrees of freedom (df). If the method argument contains ML or ULS: Fit indices derived from the unrotated factor loadings: Chi Square (chisq), including significance level, Comparative Fit Index (CFI), Root Mean Square Error of Approximation (RMSEA), Akaike Information Criterion (AIC), Bayesian Information Criterion (BIC)and the common part accounted for (CAF) index as proposed by Lorenzo-Seva, Timmerman, & Kiers (2011). For PAF, only the CAF can be calculated (see details). |
implementations_grid |
A matrix containing, for each performed EFA,
the setting combination, if an error occurred (logical), the error message
(character), an integer code for convergence as returned by
|
efa_list |
A list containing the outputs of all performed EFAs. The names correspond to the rownames from the implementations_grid. |
settings |
A list of the settings used. |
Source
Grieder, S., & Steiner, M.D. (2020). Algorithmic Jingle Jungle: A Comparison of Implementations of Principal Axis Factoring and Promax Rotation in R and SPSS. Manuscript in Preparation.
Hendrickson, A. E., & White, P. O. (1964). Promax: A quick method for rotation to oblique simple structure. British Journal of Statistical Psychology, 17 , 65–70. doi: 10.1111/j.2044-8317.1964.tb00244.x
Lorenzo-Seva, U., Timmerman, M. E., & Kiers, H. A. L. (2011). The Hull Method for Selecting the Number of Common Factors, Multivariate Behavioral Research, 46, 340-364, doi: 10.1080/00273171.2011.564527
Kaiser, H. F. (1958). The varimax criterion for analytic rotation in factor analysis. Psychometrika, 23, 187–200. doi: 10.1007/BF02289233
Examples
## Not run:
# Averaging across different implementations of PAF and promax rotation (72 EFAs)
Aver_PAF <- EFA_AVERAGE(test_models$baseline$cormat, n_factors = 3, N = 500)
# Use median instead of mean for averaging (72 EFAs)
Aver_PAF_md <- EFA_AVERAGE(test_models$baseline$cormat, n_factors = 3, N = 500,
averaging = "median")
# Averaging across different implementations of PAF and promax rotation,
# and across ULS and different versions of ML (108 EFAs)
Aver_meth_ext <- EFA_AVERAGE(test_models$baseline$cormat, n_factors = 3, N = 500,
method = c("PAF", "ULS", "ML"))
# Averaging across one implementation each of PAF (EFAtools type), ULS, and
# ML with one implementation of promax (EFAtools type) (3 EFAs)
Aver_meth <- EFA_AVERAGE(test_models$baseline$cormat, n_factors = 3, N = 500,
method = c("PAF", "ULS", "ML"), type = "EFAtools",
start_method = "psych")
# Averaging across different oblique rotation methods, using one implementation
# of ML and one implementation of promax (EFAtools type) (7 EFAs)
Aver_rot <- EFA_AVERAGE(test_models$baseline$cormat, n_factors = 3, N = 500,
method = "ML", rotation = "oblique", type = "EFAtools",
start_method = "psych")
## End(Not run)