gbm.subset {gbm.auto} | R Documentation |
Subset gbm.auto input datasets to 2 groups using the partial deviance plots
Description
Set your working directory to the output folder of a gbm.auto or gbm.loop run before calling this function. For each variable, the function returns the variable value at which the lineplot crosses 0. This should be the optimal place to split the dataset into 2 subsets, low and high, provided the relationship does not cross 0 more than once. The function is also a quick way to get the 0-point value in such cases, i.e. where values below it are detrimental and values above it are beneficial (check the plots, though).
Usage
gbm.subset(x, fams = c("Bin", "Gaus"), loop = FALSE)
Arguments
x |
Vector of variable names. |
fams |
Vector of statistical data distribution family names to be modelled by gbm. |
loop |
Is the folder a gbm.loop output? |
Details
For gbm.loop output, the line files are named BinLineLoop_VAR.csv and GausLineLoop_VAR.csv. For normal gbm.auto output, they are named Bin_Best_line_VAR.csv and Gaus_Best_line_VAR.csv. VAR stands for the variable name.
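As an illustration of the naming patterns above, the following minimal sketch builds the expected file name for one variable and family. The helper `line_file()` and the variable name "Depth" are hypothetical and are not part of gbm.auto; only the file name patterns come from this page.

```r
# Hypothetical helper (not part of gbm.auto): build the line file name
# for a variable, following the patterns described in Details.
line_file <- function(var, fam = c("Bin", "Gaus"), loop = FALSE) {
  fam <- match.arg(fam)
  if (loop) {
    paste0(fam, "LineLoop_", var, ".csv")   # gbm.loop output
  } else {
    paste0(fam, "_Best_line_", var, ".csv") # normal gbm.auto output
  }
}

# "Depth" is a made-up variable name used for illustration only
normal_name <- line_file("Depth", "Bin")
loop_name <- line_file("Depth", "Gaus", loop = TRUE)
```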
The breakpoint is the average of the last negative and the first positive point, unless any points fall exactly on zero.
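The averaging rule can be sketched as follows. This is a minimal illustration, not the function's actual source: the helper `zero_crossing()` and the toy vectors are hypothetical, the curve is assumed to cross zero once from negative to positive, and returning the x value of a point that sits exactly on zero is an assumption, since this page does not say how that case is handled.

```r
# Hypothetical helper (not part of gbm.auto): estimate the x value at
# which a line-plot curve crosses zero.
zero_crossing <- function(x, y) {
  ord <- order(x)          # make sure points run from low to high x
  x <- x[ord]
  y <- y[ord]
  # Assumption: if a point lies exactly on zero, use its x value
  on_zero <- which(y == 0)
  if (length(on_zero) > 0) return(x[on_zero[1]])
  # No crossing at all: nothing sensible to return
  if (!any(y < 0) || !any(y > 0)) return(NA_real_)
  last_neg <- max(which(y < 0))   # last negative point
  first_pos <- min(which(y > 0))  # first positive point
  mean(c(x[last_neg], x[first_pos]))
}

# Toy data: the curve changes sign between x = 2 and x = 3
x_toy <- c(1, 2, 3, 4, 5)
y_toy <- c(-0.4, -0.1, 0.2, 0.5, 0.7)
bp <- zero_crossing(x_toy, y_toy)
```

If the curve crosses zero more than once, the last negative and first positive points are no longer adjacent, so the result of this sketch would not be a meaningful breakpoint; check the plots in that case.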
Value
A list of breakpoint values which can be used to subset the datasets.
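A returned breakpoint could then be used to split a dataset into low and high subsets, roughly as in the sketch below. The data frame, the column names and the breakpoint value are all made up for illustration, and assigning rows equal to the breakpoint to the high subset is an arbitrary choice.

```r
# Toy data; "Depth" and "CPUE" are made-up column names
samples_toy <- data.frame(
  Depth = c(10, 25, 40, 60, 80),
  CPUE  = c(0, 1, 3, 2, 5)
)

# Hypothetical breakpoint, standing in for one element of the returned list
breakpoint <- 45

# Rows below the breakpoint go to the low subset, the rest to the high subset
low  <- samples_toy[samples_toy$Depth <  breakpoint, ]
high <- samples_toy[samples_toy$Depth >= breakpoint, ]
```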
Author(s)
Simon Dedman, simondedman@gmail.com
Examples
# Not run: requires completed gbm.auto run.
# having run gbm.auto (with linesfiles=TRUE), set working directory there
data(samples)
gbm.subset(x = names(samples[c(4:8, 10)]), fams = c("Bin", "Gaus"))