
ldaPlus {multiUS}

R Documentation

Linear discriminant analysis

Description

The function performs a linear discriminant analysis (using the MASS::lda function). Compared to MASS::lda, the ldaPlus function allows prior probabilities to be taken into account when predicting the values of a categorical variable. It additionally returns predicted values, a (jackknife) classification table, and a statistical test of the canonical correlations between the variable that represents groups and the numeric variables.

Usage

ldaPlus(x, grouping, pred = TRUE, CV = TRUE, usePriorBetweenGroups = TRUE, ...)

Arguments

`x`	A data frame with values of numeric variables.
`grouping`	Categorical variable that defines groups.
`pred`	Whether to return the predicted values based on the model. Default is `TRUE`.
`CV`	Whether to perform cross-validation in addition to the "ordinary" analysis. Default is `TRUE`.
`usePriorBetweenGroups`	Whether to use the prior probabilities also when estimating the model (as opposed to only in prediction). Default is `TRUE`.
`...`	Arguments passed to function `MASS::lda`.
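
The difference between using a prior only in prediction and using it also in estimation can be sketched with MASS::lda directly. This is only an illustration under stated assumptions, not necessarily how ldaPlus implements `usePriorBetweenGroups`: the equal prior below is a hypothetical choice, and the data are the mtcars variables used in the Examples section.

```r
library(MASS)  # provides lda() and its predict() method

x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt")]
grouping <- factor(mtcars$gear)

# Hypothetical prior for illustration: equal probability for every group
k <- nlevels(grouping)
equal_prior <- setNames(rep(1 / k, k), levels(grouping))

# (a) Prior used only in prediction:
#     the model is estimated with the sample proportions as prior
fit_sample <- lda(x, grouping)
pred_a <- predict(fit_sample, newdata = x, prior = equal_prior)

# (b) Prior used both in estimation and in prediction
fit_equal <- lda(x, grouping, prior = equal_prior)
pred_b <- predict(fit_equal, newdata = x)

# The estimated discriminant coefficients may differ between (a) and (b)
print(fit_sample$scaling)
print(fit_equal$scaling)

# Cross-tabulate the predicted classes of the two variants
print(table(prediction_only = pred_a$class, also_estimation = pred_b$class))
```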

Details

The specified prior is not taken into account when computing eigenvalues and all statistics based on them (everything in components eigModel and sigTest of the returned value).

Value

The following objects are also a part of what is returned by the MASS::lda function.

prior - Prior probabilities of class membership used to estimate the model (they can be estimated from the sample data or provided by the researcher).
counts - Number of units in each category of the categorical variable used to estimate the model.
means - Group means.
scaling - Matrix that transforms observations to discriminant functions, normalized so that within groups covariance matrix is spherical.
lev - Levels (groups) of the categorical variable.
svd - Singular values, which give the ratio of the between-group and within-group standard deviations on the linear discriminant variables. Their squares are the canonical F-statistics.
N - Number of observations used.
call - The (matched) function call.
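
As a rough illustration of these components, the sketch below fits MASS::lda on the mtcars variables from the Examples section, derives the share of between-group variability captured by each discriminant function from `svd`, and checks that `scaling` makes the pooled within-group covariance of the discriminant scores spherical. The variable names (`prop_trace`, `scores`, `W`) are chosen here for illustration only.

```r
library(MASS)

x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt")]
grouping <- factor(mtcars$gear)
fit <- lda(x, grouping)

print(fit$prior)   # sample proportions, since no prior was supplied
print(fit$counts)  # number of units per group
print(fit$means)   # group means of the numeric variables

# Share of the between-group variability captured by each discriminant function
prop_trace <- fit$svd^2 / sum(fit$svd^2)
print(prop_trace)

# Discriminant scores (not centred) obtained with the scaling matrix
scores <- as.matrix(x) %*% fit$scaling

# Pooled within-group covariance of the scores: subtract the group means
# of the scores, then divide the cross-product by N minus number of groups
centered <- scores - apply(scores, 2, function(s) ave(s, grouping))
W <- crossprod(centered) / (fit$N - length(fit$lev))
print(round(W, 6))  # should be close to an identity matrix
```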

The following additional objects are generated by the multiUS::ldaPlus function.

standCoefWithin - Standardized coefficients (within groups) of discriminant function.
standCoefTotal - Standardized coefficients of discriminant function.
betweenGroupsWeights - Proportions/priors used when estimating the model.
sigTest - Test of canonical correlations between the variable that represents groups (binary variable) and the numeric variables (see function testCC for more details). H0: the current and all subsequent canonical correlations are equal to zero.
eigModel - Table with eigenvalues and canonical correlations (see function testCC for more details).
centroids - Means of discriminant variables by levels of categorical variable (not predicted, but actual).
corr - Pooled correlations within groups (correlations between values of numerical variables and values of linear discriminant function(s)).
pred
- class - Predicted values of the categorical variable.
- posterior - Posterior probabilities (the values of Fisher's classification linear discriminant function).
- x - Estimated values of discriminant function(s) for each unit.
class - Classification table:
- orgTab - Frequency table.
- perTab - Percentages.
- corPer - Percentage of correctly predicted values (alternatively, percentage of correctly classified units).
classCV - Similar to class, but based on cross-validation (jackknife).
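
A classification table of this kind, with its leave-one-out counterpart, can be sketched with MASS::lda as follows. This is a minimal sketch and not the ldaPlus implementation: the object names (`org_tab`, `per_tab`, `cor_per`, `cv_tab`, `cv_cor_per`) are illustrative, and computing the percentages by row (within each actual group) is an assumption.

```r
library(MASS)

x <- mtcars[, c("mpg", "disp", "hp", "drat", "wt")]
grouping <- factor(mtcars$gear)

# "Ordinary" analysis: classify the same units that were used for estimation
fit <- lda(x, grouping)
pred <- predict(fit, newdata = x)

org_tab <- table(actual = grouping, predicted = pred$class)  # frequencies
per_tab <- 100 * prop.table(org_tab, margin = 1)             # row percentages (assumed)
cor_per <- 100 * sum(diag(org_tab)) / sum(org_tab)           # % correctly classified

# Leave-one-out (jackknife) cross-validation: with CV = TRUE, lda() returns
# the classes and posterior probabilities instead of a fitted model
cv <- lda(x, grouping, CV = TRUE)
cv_tab <- table(actual = grouping, predicted = cv$class)
cv_cor_per <- 100 * sum(diag(cv_tab)) / sum(cv_tab)

print(org_tab)
print(round(per_tab, 1))
print(cor_per)
print(cv_tab)
print(cv_cor_per)
```

The cross-validated percentage is usually the more realistic estimate of classification performance, because each unit is classified by a model estimated without it.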

Author(s)

Aleš Žiberna

References

R Data Analysis Examples: Canonical Correlation Analysis. UCLA: Statistical Consulting Group. From http://www.ats.ucla.edu/stat/r/dae/canonical.htm (accessed December 27, 2013).

Examples

# Groups are defined by the number of gears (column 10); predictors are mpg, disp, hp, drat and wt
ldaPlus(x = mtcars[,c(1, 3, 4, 5, 6)], grouping = mtcars[,10])
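
The components listed in the Value section can then be inspected individually. The sketch below assumes that the result is a list whose elements carry the names given in the Value section, and it runs only if the multiUS package is installed.

```r
# Run only when multiUS is available; component names follow the Value section
if (requireNamespace("multiUS", quietly = TRUE)) {
  res <- multiUS::ldaPlus(x = mtcars[, c(1, 3, 4, 5, 6)],
                          grouping = mtcars[, 10])

  print(res$eigModel)        # eigenvalues and canonical correlations
  print(res$sigTest)         # test of canonical correlations
  print(res$centroids)       # group means of the discriminant variables
  print(res$class$corPer)    # % correctly classified ("ordinary" analysis)
  print(res$classCV$corPer)  # % correctly classified (cross-validation)
}
```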

[Package multiUS version 1.2.3 Index]