UVA {EGAnet}  R Documentation 
Unique Variable Analysis
Description
Identifies redundant variables in a multivariate dataset
using several association methods and types of significance values
(see Christensen, Garrido, & Golino, 2020, for more details).
Usage
UVA(
data,
n = NULL,
model = c("glasso", "TMFG"),
corr = c("cor_auto", "pearson", "spearman"),
method = c("cor", "pcor", "wTO"),
type = c("adapt", "alpha", "threshold"),
sig,
key = NULL,
reduce = TRUE,
auto = TRUE,
label_latent = TRUE,
reduce.method = c("latent", "remove", "sum"),
lavaan.args = list(),
adhoc = TRUE,
plot.redundancy = FALSE,
plot.args = list()
)
Arguments
data 
Matrix or data frame.
Input can either be data or a correlation matrix

n 
Numeric.
If input in data is a correlation matrix,
then sample size is required.
Defaults to NULL

model 
Character.
A string indicating the method to use.
Current options are:
glasso
Estimates the Gaussian graphical model using graphical LASSO with
extended Bayesian information criterion to select optimal regularization parameter.
This is the default method
TMFG
Estimates a Triangulated Maximally Filtered Graph

corr 
Type of correlation matrix to compute. The default uses cor_auto.
Current options are:
cor_auto
Computes the correlation matrix using the cor_auto function from
qgraph.
pearson
Computes Pearson's correlation coefficient using the pairwise complete observations via
the cor function.
spearman
Computes Spearman's correlation coefficient using the pairwise complete observations via
the cor function.

method 
Character.
Measure of overlap to compute: weighted topological overlap ("wTO", using EBICglasso),
partial correlations ("pcor"), or correlations ("cor").
Defaults to "wTO"
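As a rough illustration of what the "wTO" option measures, the sketch below computes a signed weighted topological overlap from a small weight matrix in base R, following the general form described by Nowick et al. (2009). The helper name wto_sketch and the toy matrix are assumptions made for illustration only; EGAnet's internal computation may differ in detail.

```r
# Hypothetical helper (not part of EGAnet): signed weighted topological overlap
wto_sketch <- function(A) {
  diag(A) <- 0                 # ignore self-connections
  k <- rowSums(abs(A))         # node strength: sum of absolute weights
  shared <- A %*% A            # entry [i, j] is the sum over u of a_iu * a_uj
  W <- (shared + A) / (outer(k, k, pmin) + 1 - abs(A))
  diag(W) <- 0                 # overlap of a variable with itself is not of interest
  W
}

# Toy network (made-up values): three variables, all pairwise weights 0.5
vars <- paste0("V", 1:3)
A <- matrix(0.5, nrow = 3, ncol = 3, dimnames = list(vars, vars))
W <- wto_sketch(A)
```

Larger values in W indicate pairs of variables that share more of their connections with the rest of the network, which is the sense in which they are candidates for redundancy.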

type 
Character. Type of significance.
Computes significance using the standard p-value ("alpha"),
adaptive alpha p-value ("adapt"; see adapt.a),
or a fixed threshold ("threshold").
Defaults to "threshold"

sig 
Numeric.
p-value for significance of overlap (defaults to .05).
Defaults for "threshold" for each method:
"wTO"
.25
"pcor"
.35
"cor"
.50
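The threshold defaults can be read as cutoffs on the overlap measure. The following is a minimal base R sketch, assuming a pair is flagged when its absolute weight is at or above sig (the exact comparison used inside EGAnet is an assumption here). The helper flag_pairs and the toy correlation matrix are hypothetical.

```r
# Default thresholds listed above, keyed by method
default_sig <- c(wTO = 0.25, pcor = 0.35, cor = 0.50)

# Hypothetical helper (not part of EGAnet): list variable pairs whose
# absolute weight meets or exceeds the threshold
flag_pairs <- function(weights, sig) {
  idx <- which(abs(weights) >= sig & upper.tri(weights), arr.ind = TRUE)
  data.frame(
    var1 = rownames(weights)[idx[, "row"]],
    var2 = colnames(weights)[idx[, "col"]],
    weight = weights[idx],
    stringsAsFactors = FALSE
  )
}

# Toy correlation matrix (made-up values)
vars <- c("A", "B", "C")
R <- matrix(c(1.00, 0.62, 0.10,
              0.62, 1.00, 0.15,
              0.10, 0.15, 1.00),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))

flagged <- flag_pairs(R, sig = default_sig[["cor"]])
```

Only the upper triangle is inspected so that each pair is reported once.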

key 
Character vector.
A vector with variable descriptions that correspond
to the order of variables input into data.
Defaults to NULL or the column names of data

reduce 
Boolean.
Should redundancy reduction be performed?
Defaults to TRUE.
Set to FALSE for redundancy analysis only

auto 
Boolean.
Should redundancy reduction be automated?
Defaults to TRUE.
Set to FALSE for manual selection

label_latent 
Boolean.
Should latent variables be labelled?
Defaults to TRUE.
Set to FALSE for arbitrary labelling (i.e., "LV_")

reduce.method 
Character.
How should data be reduced?
Defaults to "latent"
"latent"
Redundant variables will be combined into a latent variable
"remove"
All but one redundant variable will be removed
"sum"
Redundant variables are combined by summing across cases (rows)
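To make the "remove" and "sum" options concrete, here is a small base R sketch of what each reduction does to a data frame. The toy data, the choice of which variable to keep, and the name given to the summed column are all assumptions for illustration; UVA makes these choices itself (or prompts for them when auto = FALSE). The "latent" option is not shown because it relies on lavaan.

```r
# Toy data: x1 and x2 are treated as a redundant pair; x3 is unique
dat <- data.frame(x1 = c(1, 2, 3, 4),
                  x2 = c(2, 2, 3, 5),
                  x3 = c(5, 1, 4, 2))
redundant <- c("x1", "x2")

# "remove": keep one variable from the redundant set and drop the rest
removed <- dat[, setdiff(names(dat), redundant[-1]), drop = FALSE]

# "sum": replace the redundant set with its sum across variables for each case (row)
summed <- dat[, setdiff(names(dat), redundant), drop = FALSE]
summed$x1_x2 <- rowSums(dat[, redundant])
```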

lavaan.args 
List.
If reduce.method = "latent", then lavaan's cfa
function will be used to create latent variables to reduce variables.
Arguments should be input as a list. Some example arguments
(see lavOptions for full details):
estimator
Estimator to use for latent variables (see Estimators
for more details). Defaults to "MLR" for continuous data and "WLSMV" for mixed and categorical data.
Data are considered continuous if they have 6 or more categories (see Rhemtulla, Brosseau-Liard, & Savalei, 2012)
missing
How missing data should be handled. Defaults to "fiml"
std.lv
If TRUE, the metric of each latent variable is determined by fixing their (residual) variances to 1.0.
If FALSE, the metric of each latent variable is determined by fixing the factor loading of the first
indicator to 1.0. If there are multiple groups, std.lv = TRUE, and "loadings" is included in the
group.equal argument, then only the latent variances of the first group will be fixed to 1.0, while
the latent variances of other groups are set free.
Defaults to TRUE

adhoc 
Boolean.
Should an ad hoc check of redundancies be performed?
Defaults to TRUE.
If TRUE, the ad hoc check will run the redundancy analysis
on the reduced variable set to determine if there are any remaining
redundancies. This check is performed with the arguments:
method = "wTO", type = "threshold", and sig = .20.
This check is based on Christensen, Garrido, and Golino's (2020)
simulation where these parameters were found to be the most conservative,
demonstrating few false positives and false negatives

plot.redundancy 
Boolean.
Should redundancies be plotted in a network plot?
Defaults to FALSE

plot.args 
List.
Arguments to be passed onto ggnet2.
Defaults:
vsize = 6 Changes node size
alpha = 0.4 Changes transparency
label.size = 5 Changes label size
edge.alpha = 0.7 Changes edge transparency

Value
Returns a list:
redundancy 
A list containing several objects:
redundant
Vectors nested within the list corresponding
to redundant nodes with the name of object in the list
data
Original data
correlation
Correlation matrix of original data
weights
Weights determined by weighted topological overlap,
partial correlation, or zero-order correlation
network
If method = "wTO", then
the network computed following EGA with
EBICglasso network estimation
plot
If plot.redundancy = TRUE, then
a plot of all redundancies found
descriptives
basic
A vector containing the mean, standard deviation,
median, median absolute deviation (MAD), 3 times the MAD, 6 times the MAD,
minimum, maximum, and critical value for the overlap measure
(i.e., weighted topological overlap, partial correlation, or threshold)
centralTendency
A matrix for all (absolute) nonzero values and their
respective standard deviation from the mean and median absolute deviation
from the median
method
Returns method argument
type
Returns type argument
distribution
If type != "threshold", then
distribution that was used to determine significance

reduced 
If reduce = TRUE, then a list containing:
data
New data with redundant variables merged or removed
merged
A matrix containing the variables that were
decided to be redundant with one another
method
Method used to perform redundancy reduction

adhoc 
If adhoc = TRUE, then
the ad hoc check containing the same objects as in
the redundancy list object in the output

Author(s)
Alexander Christensen <alexpaulchristensen@gmail.com>
References
# Simulation using UVA
Christensen, A. P., Garrido, L. E., & Golino, H. (under review).
Unique Variable Analysis: A novel approach for detecting redundant variables in multivariate data.
PsyArXiv.
# Implementation of UVA (formerly node.redundant)
Christensen, A. P., Golino, H., & Silvia, P. J. (2020).
A psychometric network perspective on the validity and validation of personality trait questionnaires.
European Journal of Personality, 34, 1095-1108.
# wTO measure
Nowick, K., Gernat, T., Almaas, E., & Stubbs, L. (2009).
Differences in human and chimpanzee gene expression patterns define an evolving network of transcription factors in brain.
Proceedings of the National Academy of Sciences, 106, 22358-22363.
# Selection of CFA Estimator
Rhemtulla, M., Brosseau-Liard, P. E., & Savalei, V. (2012).
When can categorical variables be treated as continuous? A comparison of robust continuous and categorical SEM estimation methods under suboptimal conditions.
Psychological Methods, 17, 354-373.
Examples
# Select Five Factor Model personality items only
# (reverse-keyed items are prefixed with "-" in spi.keys, so strip it before matching)
idx <- na.omit(match(gsub("-", "", unlist(psychTools::spi.keys[1:5])), colnames(psychTools::spi)))
items <- psychTools::spi[,idx]
# Change names in redundancy output to each item's description
key.ind <- match(colnames(items), as.character(psychTools::spi.dictionary$item_id))
key <- as.character(psychTools::spi.dictionary$item[key.ind])
# Automated selection of redundant variables (default)
uva.results <- UVA(data = items, key = key)
# Manual selection of redundant variables
if(interactive()){
uva.results <- UVA(data = items, key = key, type = "adapt", auto = FALSE)
}
[Package EGAnet version 1.1.0 Index]