reduce_parameters {parameters}    R Documentation

Dimensionality reduction (DR) / Features Reduction

Description

This function reduces the parameter space (the number of variables). It starts by creating a new set of variables, based on the given method (the default method is "PCA", but others are available via the method argument, such as "cMDS", "DRR" or "ICA"). Then, it names these new dimensions using the original variables that correlate the most with them. For instance, a variable named 'V1_0.97/V4_-0.88' means that the V1 and the V4 variables correlate maximally (with respective coefficients of .97 and -.88) with this dimension. Although this function can be useful in exploratory data analysis, it is best to perform the dimension reduction step in a separate and dedicated stage, as it is a very important process in the data analysis workflow. reduce_data() is an alias for reduce_parameters.data.frame().
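The naming scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: it uses prcomp() on the numeric columns of iris, standardises the variables (an assumption), and keeps only the components that are the best match for at least one variable. The object names (r, best, labels, reduced) are hypothetical.

```r
# Base R sketch (not the package's code): run a PCA, then label each
# component with the original variables that correlate with it most strongly.
x <- iris[, 1:4]
pca <- prcomp(x, center = TRUE, scale. = TRUE)  # standardising is an assumption
scores <- pca$x                                 # one column of scores per component

# Correlation of every original variable (rows) with every component (columns)
r <- cor(x, scores)

# For each variable, index of the component with the largest absolute correlation
best <- apply(abs(r), 1, which.max)

# Build one label per component from the variables assigned to it,
# in the form "<variable>_<correlation>", joined by "/"
labels <- vapply(seq_len(ncol(scores)), function(j) {
  vars <- rownames(r)[best == j]
  if (length(vars) == 0) return(NA_character_)
  paste0(vars, "_", round(r[vars, j], 2), collapse = "/")
}, character(1))

# Keep only components that are the best match for at least one variable
keep <- !is.na(labels)
reduced <- as.data.frame(scores[, keep, drop = FALSE])
names(reduced) <- labels[keep]
head(reduced)
```

Each retained column is named after the variables that correlate most with it, which is what makes a reduced data frame readable without consulting a separate loadings table.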

Usage

reduce_parameters(x, method = "PCA", n = "max", distance = "euclidean", ...)

reduce_data(x, method = "PCA", n = "max", distance = "euclidean", ...)

Arguments

x

A data frame or a statistical model.

method

The feature reduction method. Can be one of "PCA", "cMDS", "DRR", "ICA" (see the 'Details' section).

n

Number of components to extract. If n = "all", n is set to the number of variables minus 1 (ncol(x) - 1). If n = "auto" or n = NULL, the number of components is selected through n_factors() or n_components(). Otherwise, if n is a number, n components are extracted. If n exceeds the number of variables in the data, it is automatically set to the maximum number (i.e. ncol(x)). In reduce_parameters(), n can also be "max" (the default, see 'Usage'), in which case all components that are maximally pseudo-loaded (i.e., correlated) by at least one variable are selected.

distance

The distance measure to be used. Only applies when method = "cMDS". This must be one of "euclidean", "maximum", "manhattan", "canberra", "binary" or "minkowski". Any unambiguous substring can be given.

...

Arguments passed to or from other methods.
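As an illustration of the distance argument, the following base R sketch runs a classical MDS with a non-default distance measure. It assumes that the listed names correspond to the methods accepted by stats::dist(), which takes the same six values. It is not the package's internal code: the standardisation step, the choice of two dimensions and the column names Dim1 and Dim2 are assumptions made for the example.

```r
# Base R sketch of classical (metric) MDS with a chosen distance measure
x <- scale(iris[, 1:4])               # standardise the numeric columns (assumption)
d <- dist(x, method = "manhattan")    # any of the six distance names can be used here
mds <- cmdscale(d, k = 2)             # keep two dimensions (arbitrary choice)
colnames(mds) <- c("Dim1", "Dim2")    # illustrative column names
head(mds)

# An unambiguous substring selects the same distance measure
d_short <- dist(x, method = "man")
all.equal(as.vector(d), as.vector(d_short))
```

Note that a substring such as "ma" would be ambiguous, since it matches both "maximum" and "manhattan".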

Details

The different methods available are described below:

Supervised Methods

See also package vignette.

Examples

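data(iris)

# Reduce the parameters of a fitted model
model <- lm(Sepal.Width ~ Species * Sepal.Length + Petal.Width, data = iris)
model
reduce_parameters(model)

# Reduce a data frame directly with PCA, keeping the "max" components
out <- reduce_data(iris, method = "PCA", n = "max")
head(out)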

[Package parameters version 0.22.1 Index]