ExpCatStat {SmartEDA}R Documentation

Function provides summary statistics for all character or categorical columns in the dataframe

Description

This function combines results from weight of evidence, information value and summary statistics.
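
As a rough illustration of the two quantities mentioned above, the sketch below computes weight of evidence (WoE) and information value (IV) for a single categorical predictor in base R. The helper woe_iv is hypothetical and not part of SmartEDA. It assumes one common definition, WoE = log(share of reference-class rows in a level / share of other rows in that level), with IV as the sum of (difference in shares) * WoE. The package's internal sign convention and its handling of empty cells may differ.

```r
# Hypothetical helper (not a SmartEDA function): WoE and IV for one predictor.
# Assumed definition: WoE = log(p_ref / p_other), IV = sum((p_ref - p_other) * WoE).
woe_iv <- function(x, y, pclass = 1) {
  # Flag rows that belong to the reference class of the target
  is_ref <- factor(y == pclass, levels = c(FALSE, TRUE))
  counts <- table(x, is_ref)

  # Share of reference / non-reference rows that fall in each level of x
  p_ref   <- counts[, "TRUE"]  / sum(counts[, "TRUE"])
  p_other <- counts[, "FALSE"] / sum(counts[, "FALSE"])

  woe     <- log(p_ref / p_other)
  iv_part <- (p_ref - p_other) * woe

  # Levels with a zero count give a non-finite term; they are skipped here
  list(woe = woe, iv = sum(iv_part[is.finite(iv_part)]))
}

# Predictor "vs" against target "am" (reference class 1) in mtcars
res <- woe_iv(mtcars$vs, mtcars$am, pclass = 1)
res$woe
res$iv
```

Because each term (p_ref - p_other) * WoE is non-negative, the IV computed this way does not depend on which class is taken as the reference. Only the sign of the WoE values flips.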

Usage

ExpCatStat(
  data,
  Target = NULL,
  result = "Stat",
  clim = 10,
  nlim = 10,
  bins = 10,
  Pclass = NULL,
  plot = FALSE,
  top = 20,
  Round = 2
)

Arguments

data

dataframe or matrix

Target

target variable

result

"Stat" - summary statistics, "IV" - information value

clim

maximum number of unique levels for a categorical variable. Variables of class factor or character are dropped if their number of unique levels is higher than clim

nlim

maximum unique values for numeric variable.

bins

number of bins (default is 10)

Pclass

reference category of target variable

plot

Information value barplot (default FALSE)

top

for plotting top information values (default value is 20)

Round

number of decimal places to round values to
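
The clim rule described above (factor or character variables with more unique levels than clim are dropped) can be sketched in base R as follows. This is only an illustration of the stated rule, not the package's actual code. The data frame df and the names is_cat, n_levels and kept are made up for the example.

```r
# Illustrative data: "id" has 12 unique levels, "grade" has 3
df <- data.frame(
  id    = as.character(1:12),
  grade = rep(c("a", "b", "c"), 4),
  stringsAsFactors = FALSE
)
clim <- 10

# Identify factor/character columns
is_cat <- vapply(df, function(col) is.factor(col) || is.character(col), logical(1))

# Count unique levels per categorical column
n_levels <- vapply(df[is_cat], function(col) length(unique(col)), integer(1))

# Keep only the columns whose level count does not exceed clim
kept <- names(n_levels)[n_levels <= clim]
kept
```

Here "id" exceeds the limit and would be excluded, while "grade" is retained.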

Details

Criteria used for categorical variable predictive power classification are

Value

This function provides summary statistics for categorical variables

Columns description:

Author(s)

dubrangala

Examples

# Example 1
## Read mtcars data
library(SmartEDA)
# Target variable "am" - Transmission (0 = automatic, 1 = manual)
# Summary statistics
ExpCatStat(mtcars, Target = "am", result = "Stat", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = FALSE, top = 20, Round = 2)
# Information value plot
ExpCatStat(mtcars, Target = "am", result = "Stat", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = TRUE, top = 20, Round = 2)
# Information value for categorical independent variables
ExpCatStat(mtcars, Target = "am", result = "IV", clim = 10, nlim = 10, bins = 10,
           Pclass = 1, plot = FALSE, top = 20, Round = 2)

[Package SmartEDA version 0.3.10 Index]