WOEclust_kmeans {Rprofet} | R Documentation |
Kmeans Variable Clustering
Description
Implements k-means variable clustering, which can be used as a form of variable selection.
Usage
WOEclust_kmeans(object, id, target, num_clusts)
Arguments
object |
A WOEProfet object containing dataframes with binned and WOE values. |
id |
ID variable. |
target |
A binary target variable. |
num_clusts |
Number of desired clusters. |
Value
A dataframe listing the name of each clustered variable, the cluster it was assigned to, and its information value.
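Since the result pairs each variable with a cluster and an information value, one common way to use it for variable selection is to keep the variable with the highest information value in each cluster. The sketch below illustrates that idea on a hand-made stand-in data frame. The column names `Variable`, `Group` and `IV`, the variable names and all values are assumptions made for illustration and are not taken from the package; check the names of the actual returned data frame before adapting it.

```r
# Hypothetical stand-in for the returned data frame; the column names
# (Variable, Group, IV) and all values are assumptions for illustration.
clusters_demo <- data.frame(
  Variable = c("var_a", "var_b", "var_c", "var_d"),
  Group    = c(1, 1, 2, 3),
  IV       = c(0.40, 0.10, 0.25, 0.05),
  stringsAsFactors = FALSE
)

# Sort by cluster, then by decreasing information value within each cluster.
by_iv <- clusters_demo[order(clusters_demo$Group, -clusters_demo$IV), ]

# Keep the first (highest-IV) row of each cluster.
selected <- by_iv[!duplicated(by_iv$Group), ]
selected$Variable
```

Keeping a single representative per cluster is only one possible rule; a minimum information value threshold could be applied as well.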
Examples
mydata <- ISLR::Default
mydata$ID <- seq_len(nrow(mydata)) ## make the ID variable
mydata$default <- ifelse(mydata$default == "Yes", 1, 0) ## Creating numeric binary target variable
## create two new variables from bivariate normal
sigma <- matrix(c(45000,-3000,-3000, 55000), nrow = 2)
set.seed(10)
newvars <- MASS::mvrnorm(nrow(mydata),
                         mu = c(1000, 200), Sigma = sigma)
mydata$newvar1 <- newvars[,1]
mydata$newvar2 <- newvars[,2]
binned <- BinProfet(mydata, id = "ID", target = "default", num.bins = 5) ## Binning variables
WOE_dat <- WOEProfet(binned, "ID","default")
## Cluster variables with WOEclust_kmeans
clusters <- WOEclust_kmeans(WOE_dat, id="ID", target="default", num_clusts=3)
clusters
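To give a feel for what clustering variables (rather than observations) with k-means means, here is a minimal base R sketch. It is not necessarily how `WOEclust_kmeans` is implemented: it assumes that variables can be grouped by standardising each column, transposing the matrix so that variables become rows, and calling `stats::kmeans`. The matrix `woe_demo` is simulated and only stands in for a table of WOE values.

```r
# Simulated stand-in for a matrix of WOE values (rows = observations,
# columns = variables); this is not output from WOEProfet.
set.seed(1)
base1 <- rnorm(200)
base2 <- rnorm(200)
woe_demo <- cbind(
  v1 = base1,
  v2 = base1 + rnorm(200, sd = 0.05),  # nearly a copy of v1
  v3 = base2,
  v4 = base2 + rnorm(200, sd = 0.05)   # nearly a copy of v3
)

# Standardise each variable, then transpose so that each row is a variable.
vars_as_rows <- t(scale(woe_demo))

# Cluster the variables; nstart reruns k-means from several random starts.
fit <- stats::kmeans(vars_as_rows, centers = 2, nstart = 10)

# One row per variable with its assigned cluster.
data.frame(Variable = names(fit$cluster), Group = unname(fit$cluster))
```

With standardised columns, the Euclidean distance between two variables shrinks as their correlation approaches 1, so strongly positively correlated variables tend to share a cluster. Strongly negatively correlated variables end up far apart under this particular distance, which is one reason a dedicated variable clustering method may behave differently from this sketch.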
[Package Rprofet version 3.1.1 Index]