factor_forest {latentFactoR}    R Documentation

Estimate Number of Dimensions using Factor Forest

Description

Estimates the number of dimensions in data using the pre-trained Random Forest model from Goretzko and Bühner (2020, 2022). See the Examples section to get started.

Usage

factor_forest(
  data,
  sample_size,
  maximum_factors = 8,
  PA_correlation = c("cor", "poly", "tet")
)

Arguments

data

Matrix or data frame. Either a dataset with all numeric values (rows = cases, columns = variables) or a symmetric correlation matrix

sample_size

Numeric (length = 1). Number of cases. Required when data is a correlation matrix

maximum_factors

Numeric (length = 1). Maximum number of factors to search over. Defaults to 8

PA_correlation

Character (length = 1). Type of correlation used in fa.parallel. Must be one of:

  • "cor" — Pearson's correlation

  • "poly" — Polychoric correlation

  • "tet" — Tetrachoric correlation

Value

Returns a list containing:

dimensions

Number of dimensions identified

probabilities

Probability assigned by the model to each candidate number of dimensions

Author(s)

# Authors of Factor Forest
David Goretzko and Markus Bühner

# Authors of {latentFactoR}
Alexander P. Christensen <alexpaulchristensen@gmail.com>, Hudson Golino <hfg9s@virginia.edu>, Luis Eduardo Garrido <luisgarrido@pucmm.edu>

References

Goretzko, D., & Bühner, M. (2022). Factor retention using machine learning with ordinal data. Applied Psychological Measurement, 01466216221089345.

Goretzko, D., & Bühner, M. (2020). One model to rule them all? Using machine learning algorithms to determine the number of factors in exploratory factor analysis. Psychological Methods, 25(6), 776-786.

Examples

# Generate factor data
two_factor <- simulate_factors(
  factors = 2, # factors = 2
  variables = 6, # variables per factor = 6
  loadings = 0.55, # loadings between 0.45 and 0.65
  cross_loadings = 0.05, # cross-loadings N(0, 0.05)
  correlations = 0.30, # correlation between factors = 0.30
  sample_size = 1000 # number of cases = 1000
)

## Not run: 
# Perform Factor Forest
factor_forest(two_factor$data)
## End(Not run)
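
# A hedged sketch, not part of the original examples: supplying a
# correlation matrix instead of raw data. It assumes two_factor$data
# from above is an all-numeric matrix. sample_size must be given here
# because it cannot be recovered from the correlations alone
## Not run: 
correlation_matrix <- cor(two_factor$data)

forest_result <- factor_forest(
  data = correlation_matrix,
  sample_size = nrow(two_factor$data), # cases behind the correlations
  maximum_factors = 8, # documented default, written out for clarity
  PA_correlation = "cor" # Pearson's correlation
)

# Elements documented under Value
forest_result$dimensions
forest_result$probabilities
## End(Not run)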


[Package latentFactoR version 0.0.6 Index]