association {analyzer} | R Documentation |
Find association between variables
Description
association finds the association among all the variables in the data.
Usage
association(
tb,
categorical = NULL,
method1 = c("auto", "pearson", "kendall", "spearman"),
method3 = c("auto", "parametric", "non-parametric"),
methodMats = NULL,
use = "everything",
normality_test_method = c("ks", "anderson", "shapiro"),
normality_test_pval = 0.05,
...
)
Arguments
tb |
tabular data |
categorical |
a vector specifying the names of categorical (character, factor) columns |
method1 |
method for association between continuous-continuous variables. Values can be "auto", "pearson", "kendall" or "spearman" |
method3 |
method for association between continuous-categorical variables. Values can be "auto", "parametric" or "non-parametric" |
methodMats |
This parameter can be used to define the methods for calculating correlation and association at the variable-pair level. The input is a square data.frame whose dimension equals the number of columns in tb. Default is NULL, in which case the methods used for calculating correlation and association are taken from the parameters method1 and method3. This parameter can also take some other values; see the examples for more details. However, it is advisable to use the values described above. |
use |
an optional character string giving a method for computing association in the presence of missing values. This must be (complete or an abbreviation of) one of the strings "everything", "all.obs", "complete.obs", "na.or.complete", or "pairwise.complete.obs". If use is "everything", NAs will propagate conceptually, i.e., a resulting value will be NA whenever one of its contributing observations is NA. If use is "all.obs", then the presence of missing observations will produce an error. If use is "complete.obs", then missing values are handled by casewise deletion (and if there are no complete cases, that gives an error). "na.or.complete" is the same, except that when there are no complete cases the result is NA instead of an error. |
normality_test_method |
method for the normality test for a variable. Values can be "ks", "anderson" or "shapiro" |
normality_test_pval |
significance level for normality tests. Default is 0.05 |
... |
other parameters passed to |
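The wording of the use argument matches the argument of the same name in stats::cor. As a rough illustration, and assuming that association applies the same semantics (this is not verified here), the options behave as follows on a small pair of vectors with one missing value:

```r
# Illustration with stats::cor only. That association() handles `use`
# identically is an assumption, not something this page confirms.
x <- c(1, 2, 3, 4, NA)
y <- c(2, 4, 5, 9, 10)

# "everything": the NA propagates, so the result is NA
r_everything <- cor(x, y, use = "everything")

# "complete.obs": the incomplete pair (position 5) is dropped first,
# so the result equals the correlation of the first four pairs
r_complete <- cor(x, y, use = "complete.obs")
r_manual <- cor(x[1:4], y[1:4])

# "all.obs": any missing observation raises an error
r_all_obs <- tryCatch(cor(x, y, use = "all.obs"),
                      error = function(e) "error")
```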
Details
This function calculates association values in three categories:
- between continuous variables (using the CCassociation function)
- between categorical variables (using the QQassociation function)
- between continuous and categorical variables (using the CQassociation function)
For more details, look at the individual documentation of CCassociation, QQassociation and CQassociation.
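The three categories can be pictured with base R alone. The sketch below is not the package's implementation. Pearson correlation and Cramer's V follow from the method1 values and the Value section, while the one-way ANOVA and Kruskal-Wallis tests are only assumed stand-ins for the "parametric" and "non-parametric" options of method3.

```r
# Base R sketch of the three pair types; the tests chosen for the
# continuous-categorical case are assumptions for illustration.
tb <- mtcars
tb$cyl <- as.factor(tb$cyl)
tb$vs <- as.factor(tb$vs)

# 1. continuous-continuous: correlation coefficient
cc_corr <- cor(tb$mpg, tb$disp, method = "pearson")

# 2. categorical-categorical: Cramer's V derived from the chi-squared statistic
tab <- table(tb$cyl, tb$vs)
chi <- suppressWarnings(chisq.test(tab, correct = FALSE))
cramers_v <- sqrt(unname(chi$statistic) / (sum(tab) * (min(dim(tab)) - 1)))
qq_pvalue <- chi$p.value

# 3. continuous-categorical: p-value of a group comparison
cq_parametric <- oneway.test(mpg ~ cyl, data = tb, var.equal = TRUE)$p.value
cq_nonparametric <- kruskal.test(mpg ~ cyl, data = tb)$p.value

rm(tb)
```

In this sketch the first two categories yield a strength of association plus a p-value, which mirrors the continuous_corr / continuous_pvalue and categorical_cramers / categorical_pvalue pairs listed under Value.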
Value
A list containing the following tables:
- continuous_corr
correlation among all the continuous variables
- continuous_pvalue
Table containing p-value for the correlation test
- categorical_cramers
Cramer's V value among all the categorical variables
- categorical_pvalue
Chi Sq test p-value
- continuous_categorical
association value among continuous and categorical variables
- method_used
A data.frame showing the method used for all pairs of variables
See Also
CCassociation for Correlation between Continuous variables,
QQassociation for Association between Categorical variables,
CQassociation for Association between Continuous-Categorical variables
Examples
tb <- mtcars
tb$cyl <- as.factor(tb$cyl)
tb$vs <- as.factor(tb$vs)
out <- association(tb, categorical = c("cyl", "vs"))
# To use the methodMats parameter, create a matrix like this
methodMats <- out$method_used
# the values can be changed as per requirement
# NOTE: in addition to the values from parameters method1 and method3,
# the values in methodMats can also be the values returned by the
# association function. But it's advisable to use the options from the
# method1 and method3 arguments
methodMats["mpg", "disp"] <- methodMats["disp", "mpg"] <- "spearman"
out <- association(tb, categorical = c("cyl", "vs"), methodMats = methodMats)
rm(tb)
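The methodMats argument is described as a square data.frame with one row and one column per column of tb. A minimal base R sketch of that shape is given below. The name method_tbl is hypothetical, and whether association accepts a table built from scratch like this (in particular "auto" for categorical-categorical pairs) has not been verified here. Starting from out$method_used, as in the example above, remains the documented route.

```r
# Hypothetical construction of a methodMats-like table with base R only.
tb <- mtcars
tb$cyl <- as.factor(tb$cyl)
tb$vs <- as.factor(tb$vs)

vars <- names(tb)

# Square table of method names, rows and columns labelled by variable
method_tbl <- as.data.frame(
  matrix("auto", nrow = length(vars), ncol = length(vars),
         dimnames = list(vars, vars)),
  stringsAsFactors = FALSE
)

# Override single pairs symmetrically, using values from method1 and method3
method_tbl["mpg", "disp"] <- method_tbl["disp", "mpg"] <- "spearman"
method_tbl["mpg", "cyl"] <- method_tbl["cyl", "mpg"] <- "non-parametric"

dim(method_tbl)
rm(tb)
```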