Correlation {lessR}R Documentation

Correlation Analysis

Description

Abbreviation: cr, cr_brief

For two variables, yields the correlation coefficient with hypothesis test and confidence interval. For a data frame or list of variables from a data frame, yields the correlation matrix. The default computed coefficient(s) are the standard Pearson's product-moment correlation, with Spearman and Kendall coefficients available. For the default missing data technique of pairwise deletion, an analysis of missing data for each computed correlation coefficient is provided. For a correlation matrix a statistical summary of the missing data across all cells is provided.

Versions of this function from lessR 3.3 or earlier returned just a correlation matrix. Now other values are returned as well so that the correlation matrix is now stored as part of a returned list in R, directly available, for example, as mycor$R from mycor <- cr(d). This revision is automatically adjusted for in the lessR routines that read the subsequent correlation matrix, so all pre-existing code continues to work. That is, the input into any of these routines could be, for example, mycor, mycor$R or a stand-alone correlation matrix such as in pre-lessR 3.3.

Usage

Correlation(x, y, data=d,
         miss=c("pairwise", "listwise", "everything"),
         fill_low=NULL, fill_hi=NULL,
         show_n=NULL, brief=FALSE,
         digits_d=NULL, heat_map=TRUE,
         main=NULL, bottom=3, right=3,
         pdf=FALSE, width=5, height=5, ...)

cr_brief(..., brief=TRUE) 

cr(...) 

Arguments

x

First variable, or list of variables for a correlation matrix.

y

Second variable or not specified if the first argument is a list.

data

Optional data frame that contains the variables of interest, default is d.

miss

Basis for deleting missing data values_

fill_low

Starting color for a custom sequential palette.

fill_hi

Ending color for a custom sequential palette.

show_n

For pairwise deletion, show the matrix of sample sizes for each correlation coefficient, regardless of sample size.

brief

Pertains to a single correlation coefficient analysis. If FALSE, then the sample covariance and number of non-missing and missing observations are displayed.

digits_d

Specifies the number of decimal digits to display in the output.

heat_map

If TRUE, generate a heat map.

main

Graph title of heat map. Set to main="" to turn off.

bottom

Number of lines of bottom margin of heat map.

right

Number of lines of right margin of heat map.

pdf

If TRUE, generate the heat map and write to pdf files.

width

Width of the pdf file in inches.

height

Height of the pdf file in inches.

...

Other parameter values for internally called functions, which include method="spearman" and method="kendall" and also alternative="less" and alternative="more".

Details

When two variables are specified, both x and y, the output is the correlation coefficient with hypothesis test, for a null hypothesis of 0, and confidence interval. Also displays the sample covariance. Based on R functions cor, cor.test, cov.

In place of two variables x and y, x can be a complete data frame, either specified with the name of a data frame, or blank to rely upon the default data frame d. Or, x can be a list of variables from the input data frame. In these situations y is missing. Any non-numeric variables in the data frame or specified variable list are automatically deleted from the analysis.

When heat_map=TRUE, generate a heat map to standard graphics windows. Set pdf=TRUE to generate these graphics but have them directed to their respective pdf files.

For treating missing data, the default is pairwise, which means that an observation is deleted only for the computation of a specific correlation coefficient if one or both variables are missing the value for the relevant variable(s). For listwise deletion, the entire observation is deleted from the analysis if any of its data values are missing. For the more extreme everything option, any missing data values for a variable result in all correlations for that variable reported as missing.

Value

From versions of lessR of 3.3 and earlier, if a correlation matrix is computed, the matrix is returned. Now more values are returned, so the matrix is embedded in a list of returned elements.

READABLE OUTPUT

single coefficient
out_background: Variables in the model, any variable labels
out_describe: Estimated coefficients
out_inference: Hypothesis test and confidence interval estimated coefficient

matrix
out_background: Variables in the model, any variable labels
out_missing: Missing values analysis
out_cor: Correlations

STATISTICS

single coefficient
r: Model formula that specifies the model
tvalue: t-statistic of estimated value of null hypothesis of no relationship
df: Degrees of freedom of hypothesis test pvalue: Number of rows of data submitted for analysis
lb: Lower bound of confidence interval
ub: Upper bound of confidence interval

matrix
R: Correlations

Usually assign the name of mycor to the output matrix, as in following examples. This matrix is ready for input into any of the lessR functions that analyze correlational data, including confirmatory factor analysis by corCFA and also exploratory factor analysis, either the standard R function factanal or the lessR function corEFA

Author(s)

David W. Gerbing (Portland State University; gerbing@pdx.edu)

References

Gerbing, D. W. (2023). R Data Analysis without Programming: Explanation and Interpretation, 2nd edition, Chapter 10, NY: Routledge.

See Also

cor.test, cov.

Examples

# data
n <- 12
f <- sample(c("Group1","Group2"), size=n, replace=TRUE)
x1 <- round(rnorm(n=n, mean=50, sd=10), 2)
x2 <- round(rnorm(n=n, mean=50, sd=10), 2)
x3 <- round(rnorm(n=n, mean=50, sd=10), 2)
x4 <- round(rnorm(n=n, mean=50, sd=10), 2)
d <- data.frame(f,x1, x2, x3, x4)
rm(f); rm(x1); rm(x2); rm(x3); rm(x4)

# correlation and covariance
Correlation(x1, x2)
# short name
cr(x1, x2)
# brief form of output
cr_brief(x1, x2)

# Spearman rank correlation, one-sided test
Correlation(x1, x2, method="spearman", alternative="less")

# correlation matrix of the numerical variables in mycor
mycor <- Correlation()

# correlation matrix of Kendall's tau coefficients
mycor <- cr(method="kendall")

# correlation matrix of specified variables in mycor with heat_map
mycor <- Correlation(x1:x3, heat_map=TRUE)

# analysis with data not from data frame mycor
data(attitude)
mycor <- Correlation(rating, learning, data=attitude)

# analysis of entire data frame that is not mycor
data(attitude)
mycor <- Correlation(attitude)

[Package lessR version 4.3.6 Index]