correlate_df {popstudy}    R Documentation

correlate_df

Description

Compute correlations between the variables of a data frame.

Usage

correlate_df(data, keep_class = NULL)

Arguments

data

data.frame. A dataset with the variables to correlate.

keep_class

list. A list that contains the desired classes for specific variables.

Details

correlate_df takes data.frame objects and works only with variables of class numeric, factor, or ordered, so the data should be cleaned beforehand for best results. A variable is considered nominal when it is a factor with more than two levels and is not ordered. A numeric variable with only two distinct values is considered binary, as is a factor with only two levels. The correlation computed depends on the classes of the paired variables: Pearson correlation when both variables are numeric; Kendall correlation for a numeric and an ordinal variable; point-biserial correlation for a numeric and a binary variable; polychoric correlation for two ordinal variables; tetrachoric correlation when both are binary; rank-biserial correlation when one is ordinal and the other is binary; and Kruskal's lambda for one binary and one nominal variable, or for two nominal variables. When one variable is nominal and the other is numeric or ordered, a Gaussian linear model is fitted to estimate the multiple correlation coefficient, so results for these pairs should be interpreted with caution.
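The classification and pairing rules above can be sketched in base R. This is only an illustration of the rules as described, not the package's internal code: the helpers classify_variable() and choose_method() are hypothetical names, and checking for two levels before checking for ordering is an assumption about how a two-level ordered factor would be treated.

```r
# Hypothetical helper: label one column following the rules in Details.
classify_variable <- function(x) {
  if (is.factor(x)) {
    # Assumption: any factor with exactly two levels counts as binary,
    # even if it is ordered.
    if (nlevels(x) == 2) return("binary")
    if (is.ordered(x)) return("ordinal")
    return("nominal")
  }
  if (is.numeric(x)) {
    # A numeric variable with only two distinct values is binary.
    if (length(unique(stats::na.omit(x))) == 2) return("binary")
    return("numeric")
  }
  stop("unsupported class: ", class(x)[1])
}

# Hypothetical helper: map a pair of labels to the measure named in Details.
choose_method <- function(a, b) {
  # Sort the labels so the lookup does not depend on argument order.
  pair <- paste(sort(c(a, b), method = "radix"), collapse = "-")
  switch(pair,
    "numeric-numeric" = "Pearson",
    "numeric-ordinal" = "Kendall",
    "binary-numeric"  = "point-biserial",
    "ordinal-ordinal" = "polychoric",
    "binary-binary"   = "tetrachoric",
    "binary-ordinal"  = "rank-biserial",
    "binary-nominal"  = "Kruskal's lambda",
    "nominal-nominal" = "Kruskal's lambda",
    "nominal-numeric" = "multiple correlation (Gaussian linear model)",
    "nominal-ordinal" = "multiple correlation (Gaussian linear model)",
    stop("no rule for pair: ", pair)
  )
}

# Small toy data frame with one variable of each kind.
toy <- data.frame(
  cont = c(1.2, 3.4, 2.2, 5.1),
  ordi = factor(c(1, 2, 3, 2), ordered = TRUE),
  bin  = c(0, 1, 1, 0),
  nomi = factor(c("a", "b", "c", "a"))
)

types <- vapply(toy, classify_variable, character(1))
method_cont_bin <- choose_method(types[["cont"]], types[["bin"]])
method_nomi_ordi <- choose_method(types[["nomi"]], types[["ordi"]])
```

Here types holds one label per column, and the two choose_method() calls look up the measure for a numeric-binary pair and a nominal-ordinal pair. The actual function may resolve edge cases, such as single-level factors or missing values, differently.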

Value

correlate_df returns a list with three objects: a data frame containing the correlation matrix, and two correlation plots.

Author(s)

Cesar Gamboa-Sanabria

References

Khamis H (2008). “Measures of Association: How to Choose?” Journal of Diagnostic Medical Sonography, 24(3), 155-162. doi:10.1177/8756479308317006.

Examples


# Simulate two variables of each supported type:
# numeric, ordinal, binary, and nominal
set.seed(123)  # make the simulated data reproducible
df <- data.frame(cont1 = rnorm(100),
                 cont2 = rnorm(100),
                 ordi1 = factor(sample(1:5, 100, replace = TRUE), ordered = TRUE),
                 ordi2 = factor(sample(1:7, 100, replace = TRUE), ordered = TRUE),
                 bin1 = rbinom(100, 1, .4),
                 bin2 = rbinom(100, 1, .6),
                 nomi1 = factor(sample(letters[1:8], 100, replace = TRUE)),
                 nomi2 = factor(sample(LETTERS[1:8], 100, replace = TRUE)))

correlate_df(df)


[Package popstudy version 1.0.1 Index]