PrInDTAllparts {PrInDT}    R Documentation

Conditional inference trees (ctrees) based on consecutive parts of the full sample

Description

Fits ctrees on the full sample of the smaller class combined with consecutive parts of the larger class of the nesting variable 'nesvar'. The variable 'nesvar' must be a column of the data frame 'datain'.
Interpretability is checked (see 'ctestv'), and a probability threshold can be specified (see 'thres').

Reference
Weihs, C., Buschfeld, S. 2021b. NesPrInDT: Nested undersampling in PrInDT. arXiv:2103.14931

Usage

PrInDTAllparts(datain, classname, ctestv=NA, conf.level=0.95, thres=0.5,
       nesvar, divt)

Arguments

datain

Input data frame with the class factor variable 'classname' and the
influential variables, which must be factors or numeric (convert logical and character variables to factors)

classname

Name of class variable (character)

ctestv

Vector of character strings of forbidden split results;
see function PrInDT for details.
If there are no restrictions, use the default, NA.

conf.level

(1 - significance level) in function ctree (numerical, > 0 and <= 1); default = 0.95

thres

Probability threshold for prediction of smaller class (numerical, >= 0 and < 1); default = 0.5

nesvar

Name of nesting variable (character)

divt

Number of parts of the nesting variable 'nesvar' for which models are determined individually
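
How 'divt' partitions the nesting variable can be pictured with a short base R sketch. This is an illustration under stated assumptions, not the package's internal code: the helper name split_consecutive is hypothetical, and the sketch assumes that the levels of 'nesvar' are cut, in order of appearance, into 'divt' consecutive groups of near-equal size.

```r
# Hypothetical helper (not part of PrInDT): cut a vector of nesting-variable
# levels into 'divt' consecutive groups of near-equal size.
split_consecutive <- function(levels_in, divt) {
  n <- length(levels_in)
  stopifnot(divt >= 1, divt <= n)
  # Integer arithmetic assigns each position a group index in 1..divt
  part <- (seq_len(n) * divt - 1L) %/% n + 1L
  unname(split(levels_in, part))
}

# Toy example: 8 made-up speaker IDs split into 4 consecutive parts
speakers <- paste0("S", 1:8)
parts <- split_consecutive(speakers, divt = 4)
```

Each element of 'parts' would then be combined with the full sample of the smaller class before a tree is fitted; how PrInDTAllparts orders and sizes the parts internally is not specified here.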

Details

Standard output can be produced with print(name), or just name, where 'name' is the object returned by the function.

Value

baAll

balanced accuracy of tree on full sample

nesvar

name of nesting variable

divt

number of consecutive parts of the sample

badiv

balanced accuracy of trees on 'divt' consecutive parts of the sample
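
The balanced accuracies reported in 'baAll' and 'badiv' can be understood through a minimal base R sketch. The helper balanced_accuracy and the toy data are hypothetical, and the rule "predict the smaller class when its probability exceeds 'thres'" is an assumption about how the threshold is applied, not a statement about the package's implementation.

```r
# Hypothetical helper (not part of PrInDT): balanced accuracy, i.e. the mean
# of the per-class proportions of correct predictions.
balanced_accuracy <- function(truth, pred) {
  classes <- unique(truth)
  recall <- sapply(classes, function(cl) mean(pred[truth == cl] == cl))
  mean(recall)
}

# Toy data: "small" is the smaller class; probs are made-up predicted
# probabilities of the smaller class.
truth <- c("small", "small", "large", "large", "large", "large")
probs <- c(0.9, 0.4, 0.6, 0.2, 0.1, 0.3)

# Assumption: the smaller class is predicted when its probability exceeds thres
thres <- 0.5
pred <- ifelse(probs > thres, "small", "large")

ba <- balanced_accuracy(truth, pred)
```

Because both classes contribute equally to the mean, a tree that always predicts the larger class scores only 0.5, which is why balanced accuracy is a natural measure for unbalanced classes.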

Examples

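# Load the speaker data shipped with PrInDT and drop incomplete rows
data <- PrInDT::data_speaker
data <- na.omit(data)
# Name of the nesting variable (must be a column of 'data')
nesvar <- "SPEAKER"
# Fit trees on the full sample and on 8 consecutive parts
outNesAll <- PrInDTAllparts(data, "class", ctestv = NA, conf.level = 0.95,
                            thres = 0.5, nesvar = nesvar, divt = 8)
outNesAll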


[Package PrInDT version 1.0.1 Index]