TangledFeatures {TangledFeatures} | R Documentation |
The main TangledFeatures function
Description
The main TangledFeatures function
Usage
TangledFeatures(
Data,
Y_var,
Focus_variables = list(),
corr_cutoff = 0.7,
RF_coverage = 0.95,
plot = FALSE,
fast_calculation = FALSE,
cor1 = "pearson",
cor2 = "polychoric",
cor3 = "spearman"
)
Arguments
Data |
The imported Data Frame |
Y_var |
The dependent variable |
Focus_variables |
The list of variables that you wish to give a certain bias to in the correlation matrix |
corr_cutoff |
The correlation cutoff. Defaults to 0.7 |
RF_coverage |
The Random Forest coverage of explainability. Defaults to 0.95 (95 percent) |
plot |
Whether plots should be produced. Logical, TRUE or FALSE; defaults to FALSE |
fast_calculation |
Returns variable list without many Random Forest iterations by simply picking a variable from a correlated group |
cor1 |
The correlation metric between two continuous features. Defaults to pearson correlation |
cor2 |
The correlation metric between one categorical feature and one continuous feature. Described as biserial correlation; the default argument value is "polychoric" |
cor3 |
The correlation metric between two categorical features. Described as Cramer's V; the default argument value is "spearman" |
Value
Returns a list of variables that are ready for future modelling, along with other metrics
Examples
TangledFeatures(Data = TangledFeatures::Advertisement, Y_var = 'Sales')
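A slightly fuller call may help show how the optional arguments fit together. The sketch below is illustrative only: it assumes the TangledFeatures package is installed, spells out arguments taken from the Usage section, and inspects the result with str() because the exact names of the returned list elements are not documented here. Whether fast_calculation = TRUE suits a given data set is a judgment call.

```r
# Hedged sketch: runs only if the TangledFeatures package is available.
if (requireNamespace("TangledFeatures", quietly = TRUE)) {
  result <- TangledFeatures::TangledFeatures(
    Data = TangledFeatures::Advertisement,  # bundled example data
    Y_var = "Sales",                        # dependent variable
    corr_cutoff = 0.7,                      # correlation cutoff (signature default)
    RF_coverage = 0.95,                     # Random Forest coverage (signature default)
    plot = FALSE,                           # no plotting
    fast_calculation = TRUE                 # skip the repeated Random Forest iterations
  )

  # The Value section only promises a list; look at its top-level structure
  # rather than assuming particular element names.
  str(result, max.level = 1)
}
```

Setting fast_calculation = TRUE trades the Random Forest iterations for a simple pick from each correlated group, as described under Arguments, so it is mainly useful for a quick first pass.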