R: Create Logit Leaf Model

llm {LLM}

R Documentation

Create Logit Leaf Model

Description

This function fits a logit leaf model (LLM). It takes a data frame of numeric independent variables and a corresponding vector holding the dependent variable. Two parameters of the underlying decision tree can be set: the confidence threshold for pruning and the minimum number of observations per leaf.
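
The two-stage idea behind the model (a decision tree defines segments, then a logistic regression is fitted within each segment) can be sketched in a few lines. This is a conceptual illustration only, not the implementation used by `llm`: the `rpart` package stands in for the tree learner, the data are synthetic, and all variable names are invented for the example.

```r
library(rpart)  # assumption: rpart stands in for the tree learner used by llm()

set.seed(42)
n <- 600
X <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
# Synthetic binary outcome whose relation to x2 flips with the sign of x1
Y <- rbinom(n, 1, plogis(ifelse(X$x1 > 0, 1 + 1.5 * X$x2, -1 - 1.5 * X$x2)))
dat <- cbind(X, Y = Y)

# Step 1: a decision tree splits the observations into segments (leaves)
tree <- rpart(factor(Y) ~ x1 + x2, data = dat, method = "class",
              control = rpart.control(minbucket = 100))
segment <- tree$where  # leaf id for every training row

# Step 2: a separate logistic regression is fitted inside each segment
models <- lapply(split(dat, segment), function(d) {
  glm(Y ~ x1 + x2, data = d, family = binomial())
})

seg_sizes <- table(segment)        # observations per segment
seg_coefs <- lapply(models, coef)  # segment-specific coefficients
```

Here `minbucket = 100` plays a role comparable to `nbr_obs_leaf`; the pruning threshold has no direct counterpart in this sketch.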

Usage

llm(X, Y, threshold_pruning = 0.25, nbr_obs_leaf = 100)

Arguments

`X`	Data frame containing the numerical independent variables.
`Y`	Numerical vector containing the dependent variable. Currently only binary classification is supported.
`threshold_pruning`	Confidence threshold used for pruning the decision tree. Default 0.25.
`nbr_obs_leaf`	The minimum number of observations in a leaf node. Default 100.
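
Before calling `llm`, it can help to confirm that the inputs match the argument descriptions above. The following is a minimal sketch in base R. `check_llm_inputs` is a hypothetical helper, not part of the LLM package, and it encodes only what the descriptions state: numeric columns in `X`, and a two-valued `Y` with one entry per row of `X`.

```r
# Hypothetical helper (not part of the LLM package): reports whether X and Y
# look like what the argument descriptions ask for.
check_llm_inputs <- function(X, Y) {
  stopifnot(is.data.frame(X))
  c(
    all_numeric = all(vapply(X, is.numeric, logical(1))),  # every column numeric
    same_length = nrow(X) == length(Y),                    # one Y value per row
    binary      = length(unique(Y[!is.na(Y)])) == 2        # exactly two classes
  )
}

# Toy data, invented for illustration only
X_demo <- data.frame(age = c(23, 45, 31, 52), income = c(1.2, 3.4, 2.2, 4.1))
Y_demo <- c(0, 1, 0, 1)
checks <- check_llm_inputs(X_demo, Y_demo)
print(checks)
```

Returning a named logical vector rather than stopping at the first failure makes it easier to see which requirement is not met.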

Value

An object of class logitleafmodel, which is a list with the following components:

`Segment Rules`	The decision rules that define segments. Use `table.llm.html` to visualize.
`Coefficients`	The segment-specific logistic regression coefficients. Use `table.llm.html` to visualize.
`Full decision tree for segmentation`	The raw decision tree. Use `table.llm.html` to visualize.
`Observations per segment`	The number of observations in each segment. Use `table.llm.html` to visualize.
`Incidence of dependent per segment`	The incidence of the dependent variable in each segment. Use `table.llm.html` to visualize.
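
Because the component names contain spaces, they have to be quoted when they are extracted from the list. The sketch below uses a stand-in list with placeholder values rather than a fitted model, and it assumes that the component names are exactly the labels listed above.

```r
# Stand-in for a fitted model: a plain list carrying the documented component
# names. The values are placeholders, not output from llm().
fit_demo <- list(
  "Segment Rules" = "placeholder",
  "Coefficients" = "placeholder",
  "Full decision tree for segmentation" = "placeholder",
  "Observations per segment" = "placeholder",
  "Incidence of dependent per segment" = "placeholder"
)

# Names with spaces need [[ ]] with a string, or $ with backticks
rules_a <- fit_demo[["Segment Rules"]]
rules_b <- fit_demo$`Segment Rules`

# List the available components
component_names <- names(fit_demo)
```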

Author(s)

Arno De Caigny, a.de-caigny@ieseg.fr, Kristof Coussement, k.coussement@ieseg.fr and Koen W. De Bock, kdebock@audencia.com

References

Arno De Caigny, Kristof Coussement, Koen W. De Bock, A New Hybrid Classification Algorithm for Customer Churn Prediction Based on Logistic Regression and Decision Trees, European Journal of Operational Research (2018), doi: 10.1016/j.ejor.2018.02.009.

Examples

## Load PimaIndiansDiabetes dataset from mlbench package
## (the whole example is skipped if mlbench is not installed)
if (requireNamespace("mlbench", quietly = TRUE)) {
  data("PimaIndiansDiabetes", package = "mlbench")
  ## Split in training and test (2/3 - 1/3)
  idtrain <- sample(nrow(PimaIndiansDiabetes), 512)
  PimaTrain <- PimaIndiansDiabetes[idtrain, ]
  Pimatest <- PimaIndiansDiabetes[-idtrain, ]
  ## Create the LLM
  Pima.llm <- llm(X = PimaTrain[, -c(9)], Y = PimaTrain$diabetes,
                  threshold_pruning = 0.25, nbr_obs_leaf = 100)
}

[Package LLM version 1.1.0 Index]