ExpInfoValue {SmartEDA} | R Documentation
Information value
Description
Provides the information value of each categorical variable (X) against the target variable (Y).
Usage
ExpInfoValue(X, Y, valueOfGood = NULL)
Arguments
X            Independent categorical variable.
Y            Binary response variable; it can take values of either 1 or 0.
valueOfGood  Value of Y that is used as the reference category.
Details
Information value (IV) is one of the most useful techniques for selecting important variables in a predictive model. It helps to rank variables on the basis of their importance. The IV is calculated using the following formula:
- IV = sum over all categories of X of (Percentage of Good event - Percentage of Bad event) * WOE, where WOE is the weight of evidence
- WOE = log(Percentage of Good event / Percentage of Bad event)
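As an illustration of the calculation, here is a minimal base-R sketch that computes the IV by hand, taking WOE as the log of the ratio of the good and bad percentages within each category. The helper name manual_iv and the toy data are hypothetical and not part of SmartEDA. Dropping categories with a zero good or bad count is an assumption of this sketch; ExpInfoValue may handle such categories differently.

```r
# Hypothetical helper (not part of SmartEDA): information value of a
# categorical x against a binary y, computed directly from the definition.
manual_iv <- function(x, y, good = 1) {
  # Cross-tabulate categories of x against good (TRUE) / bad (FALSE) outcomes.
  is_good <- factor(y == good, levels = c(FALSE, TRUE))
  tab <- table(x, is_good)

  # Share of all good (bad) events that falls in each category of x.
  pct_good <- tab[, "TRUE"] / sum(tab[, "TRUE"])
  pct_bad <- tab[, "FALSE"] / sum(tab[, "FALSE"])

  # Weight of evidence per category.
  woe <- log(pct_good / pct_bad)

  # Assumption: skip categories where WOE is undefined (zero counts).
  keep <- is.finite(woe)
  sum((pct_good[keep] - pct_bad[keep]) * woe[keep])
}

# Toy data: category "a" has 3 good / 1 bad, category "b" has 1 good / 3 bad.
x <- c("a", "a", "a", "a", "b", "b", "b", "b")
y <- c(1, 1, 1, 0, 1, 0, 0, 0)
iv <- manual_iv(x, y, good = 1)
```

For the toy data, each category contributes 0.5 * log(3) to the total, so the IV equals log(3).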
Here is what the values of IV mean according to Siddiqi (2006):
- If information value is < 0.03, then predictive power = "Not Predictive"
- If information value is 0.03 to 0.1, then predictive power = "Somewhat Predictive"
- If information value is 0.1 to 0.3, then predictive power = "Medium Predictive"
- If information value is > 0.3, then predictive power = "Highly Predictive"
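The thresholds listed above can be sketched as a small lookup in base R. The function name iv_class is hypothetical and not part of SmartEDA. Treating each lower bound as inclusive (so that exactly 0.03 counts as "Somewhat Predictive") is an assumption, since the text does not say how boundary values are assigned.

```r
# Hypothetical helper (not part of SmartEDA): map an information value
# to the predictive power classes listed above.
iv_class <- function(iv) {
  labels <- c("Not Predictive", "Somewhat Predictive",
              "Medium Predictive", "Highly Predictive")
  # right = FALSE makes the intervals [lower, upper), an assumption
  # about how boundary values are assigned.
  as.character(cut(iv,
                   breaks = c(-Inf, 0.03, 0.1, 0.3, Inf),
                   labels = labels,
                   right = FALSE))
}

classes <- iv_class(c(0.01, 0.05, 0.2, 0.5))
```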
Value
Information value (iv) and predictive power class
- information value
- predictive class
Examples
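library(SmartEDA)
# Number of forward gears as the categorical predictor,
# transmission type (am: 0 = automatic, 1 = manual) as the binary target
X <- mtcars$gear
Y <- mtcars$am
ExpInfoValue(X, Y, valueOfGood = 1)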