process_outliers {creditmodel} | R Documentation |
Outliers Treatment
Description
outliers_kmeans_lof detects and treats outliers using K-means clustering and the Local Outlier Factor (LOF).
process_outliers is a simpler wrapper for outliers_kmeans_lof.
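The two techniques named above can be illustrated with a minimal base-R sketch. This is not the package's implementation: lof_scores is a hypothetical helper written for this illustration, and how creditmodel combines the K-means and LOF results internally is not shown here.

```r
# Local Outlier Factor for the rows of a numeric matrix, using k neighbours.
# Hypothetical helper for illustration only; assumes distinct rows
# (zero distances between duplicates would give infinite densities).
lof_scores <- function(m, k) {
  m <- as.matrix(m)
  n <- nrow(m)
  stopifnot(k >= 1, k < n)
  d <- as.matrix(dist(m))
  diag(d) <- Inf                      # a point is not its own neighbour
  # Indices of the k nearest neighbours of each point.
  nn <- matrix(0L, n, k)
  for (i in seq_len(n)) nn[i, ] <- order(d[i, ])[seq_len(k)]
  # Distance from each point to its k-th nearest neighbour.
  k_dist <- d[cbind(seq_len(n), nn[, k])]
  # Local reachability density: inverse of the mean reachability distance.
  lrd <- numeric(n)
  for (i in seq_len(n)) {
    reach <- pmax(k_dist[nn[i, ]], d[i, nn[i, ]])
    lrd[i] <- 1 / mean(reach)
  }
  # LOF: mean density of the neighbours relative to the point's own density.
  vapply(seq_len(n), function(i) mean(lrd[nn[i, ]]) / lrd[i], numeric(1))
}

x <- c(10, 11, 12, 13, 14, 15, 16, 17, 100)
scores <- lof_scores(x, k = 3)
which.max(scores)   # 9: the isolated value 100 has by far the largest score

# K-means on the same values; a very small cluster hints at outliers.
set.seed(1)
km <- kmeans(x, centers = 2, nstart = 10)
table(km$cluster)
```

Scores close to 1 indicate a point about as dense as its neighbours; scores well above 1 indicate a point in a much sparser region. Here k plays the role of the kn argument and centers the role of kc.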
Usage
process_outliers(
dat,
target,
ex_cols = NULL,
kc = 3,
kn = 5,
x_list = NULL,
parallel = FALSE,
note = FALSE,
process = TRUE,
save_data = FALSE,
file_name = NULL,
dir_path = tempdir()
)
outliers_kmeans_lof(
dat,
x,
target = NULL,
kc = 3,
kn = 5,
note = FALSE,
process = TRUE,
save_data = FALSE,
file_name = NULL,
dir_path = tempdir()
)
Arguments
dat |
Dataset with independent variables and target variable. |
target |
The name of target variable. |
ex_cols |
A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL. |
kc |
Number of cluster centers for K-means. Default is 3. |
kn |
Number of neighbors for LOF. Default is 5. |
x_list |
Names of independent variables. |
parallel |
Logical. If TRUE, use parallel computing. Default is FALSE. |
note |
Logical. If TRUE, print progress information. Default is FALSE. |
process |
Logical. If TRUE, treat the outliers rather than only analyzing them. Default is TRUE. |
save_data |
Logical. If TRUE, save the outlier analysis file to the folder specified by dir_path. Default is FALSE. |
file_name |
The file name for periodically saved outliers analysis file. Default is NULL. |
dir_path |
The path for periodically saved outliers analysis file. Default is tempdir(). |
x |
The name of variable to process. |
Value
A data frame in which outlier treatment has been applied to all the variables.
Examples
library(creditmodel)
# Treat outliers in columns 18 to 21; column 26 is the target variable.
dat_out = process_outliers(UCICreditCard[1:10000, c(18:21, 26)],
                           target = "default.payment.next.month",
                           ex_cols = "date$", kc = 3, kn = 10,
                           parallel = FALSE, note = TRUE)
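outliers_kmeans_lof can also be called directly on a single variable. The following is a sketch based only on the signature in the Usage section; the argument values are illustrative, and it assumes the return value is a data frame that still contains the processed column.

```r
library(creditmodel)

# Same subset as the example above: four numeric columns plus the target.
dat <- UCICreditCard[1:10000, c(18:21, 26)]

# Pick the first column by position rather than hard-coding its name.
x_name <- names(dat)[1]

# Argument names follow the Usage section; values are illustrative.
dat_x <- outliers_kmeans_lof(dat, x = x_name,
                             target = "default.payment.next.month",
                             kc = 3, kn = 10, note = TRUE, process = TRUE)

# Compare the variable before and after treatment
# (assumes dat_x is a data frame that keeps the column).
summary(dat[[x_name]])
summary(dat_x[[x_name]])
```

Setting process = FALSE should, per the argument description, restrict the call to analysis without modifying the variable.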