SSLRDecisionTree {SSLR} | R Documentation
General Interface for the Decision Tree model
Description
Decision Tree is a simple and effective semi-supervised learning method, based on the article "Semi-supervised classification trees". It offers several parameters to modify the behavior of the method. The algorithm is the same as the traditional decision tree; the difference lies in how the Gini coefficient is computed (classification). In regression, the SSE metric is used (a departure from the original investigation). The model can be used for classification or regression: if Y is numeric, regression is performed; otherwise, classification.
Usage
SSLRDecisionTree(
  max_depth = 30,
  w = 0.5,
  min_samples_split = 20,
  min_samples_leaf = ceiling(min_samples_split/3)
)
Arguments
max_depth
A number from 1 to Inf. The maximum depth of the decision tree. Default is 30.
w
Weight parameter ranging from 0 to 1. Default is 0.5.
min_samples_split
The minimum number of observations required to attempt a split. Default is 20.
min_samples_leaf
The minimum number of observations in any terminal (leaf) node. Default is ceiling(min_samples_split/3).
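With the defaults, the leaf-size default works out as follows:

min_samples_split <- 20
ceiling(min_samples_split / 3)  # default min_samples_leaf: 7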
Details
This model can return class probability predictions with predict(object, new_data, type = "prob").
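To illustrate the role of w, below is a minimal sketch of a w-weighted semi-supervised impurity in the spirit of the cited article: the supervised Gini computed on labeled rows is combined with an unsupervised variance term computed over all rows. The exact formula implemented by SSLR may differ; ss_impurity and its unsupervised term are assumptions for exposition only.

gini <- function(y) {
  p <- prop.table(table(y))
  1 - sum(p^2)
}

# Hypothetical w-weighted impurity (not necessarily SSLR's exact formula):
# supervised Gini on labeled responses plus mean feature variance on all rows.
ss_impurity <- function(y, x, w = 0.5) {
  labeled <- !is.na(y)
  w * gini(y[labeled]) + (1 - w) * mean(apply(x, 2, var))
}

y <- factor(c("a", "a", "b", NA, NA))    # NA = unlabeled
x <- scale(matrix(rnorm(10), ncol = 2))  # features scaled so variances are comparable
ss_impurity(y, x, w = 0.3)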
References
Jurica Levatić, Michelangelo Ceci, Dragi Kocev, Sašo Džeroski.
Semi-supervised classification trees.
Published online: 25 March 2017.
© Springer Science+Business Media New York 2017.
Examples
library(tidyverse)
library(caret)
library(SSLR)
library(tidymodels)
data(wine)
set.seed(1)
train.index <- createDataPartition(wine$Wine, p = .7, list = FALSE)
train <- wine[ train.index,]
test <- wine[-train.index,]
cls <- which(colnames(wine) == "Wine")
# Keep labels for 20% of the training rows; mark the rest unlabeled (NA)
labeled.index <- createDataPartition(train$Wine, p = .2, list = FALSE)
train[-labeled.index, cls] <- NA
m <- SSLRDecisionTree(min_samples_split = round(nrow(labeled.index) * 0.25),
                      w = 0.3) %>%
  fit(Wine ~ ., data = train)
# Accuracy and other classification metrics
predict(m, test) %>%
  bind_cols(test) %>%
  metrics(truth = "Wine", estimate = .pred_class)
# Class probabilities
predict(m, test, type = "prob")
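Since a numeric response switches the model to regression (SSE criterion), a regression fit follows the same pattern. A minimal sketch, assuming the built-in airquality dataset and an arbitrary 70% unlabeled split purely for illustration:

# Regression sketch: a numeric response triggers the SSE-based regression tree
airq <- na.omit(airquality)
set.seed(1)
unlabeled <- sample(nrow(airq), size = floor(0.7 * nrow(airq)))
airq_train <- airq
airq_train$Ozone[unlabeled] <- NA  # NA in the response marks unlabeled rows

m_reg <- SSLRDecisionTree(max_depth = 10, w = 0.5) %>%
  fit(Ozone ~ ., data = airq_train)

predict(m_reg, airq)  # numeric predictions in .pred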