process_outliers {creditmodel}    R Documentation

Outliers Treatment

Description

outliers_kmeans_lof detects and treats outliers in a single variable using Kmeans clustering and the Local Outlier Factor (LOF). process_outliers is a simpler wrapper that applies outliers_kmeans_lof to the variables of a dataset.
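The two techniques named above can be illustrated with a minimal base-R sketch. This is not the code used inside creditmodel, and it does not show how the package combines the two steps. The helper lof_scores is a hypothetical name introduced here, and its handling of tied distances is simplified (exactly k neighbours are used). The arguments kc and kn of the documented functions correspond to centers and k below.

```r
# Hypothetical helper (not part of creditmodel): plain LOF scores in base R.
lof_scores <- function(x, k = 5) {
  x <- as.matrix(x)
  n <- nrow(x)
  stopifnot(k >= 2, k < n)
  d <- as.matrix(dist(x))   # pairwise Euclidean distances
  diag(d) <- Inf            # a point is not its own neighbour
  # Row i holds the indices of the k nearest neighbours of point i
  nn <- t(apply(d, 1, function(r) order(r)[seq_len(k)]))
  # k-distance: distance from each point to its k-th nearest neighbour
  kdist <- d[cbind(seq_len(n), nn[, k])]
  # Local reachability density: inverse of the mean reachability distance
  lrd <- vapply(seq_len(n), function(i) {
    reach <- pmax(d[i, nn[i, ]], kdist[nn[i, ]])
    1 / mean(reach)
  }, numeric(1))
  # LOF: mean density of the neighbours relative to the point's own density
  vapply(seq_len(n), function(i) mean(lrd[nn[i, ]]) / lrd[i], numeric(1))
}

set.seed(42)
x <- c(rnorm(50), 25)          # 50 ordinary values plus one extreme value
km <- kmeans(x, centers = 2)   # stats::kmeans; 'centers' plays the role of kc
table(km$cluster)              # cluster sizes; very small clusters are suspicious
scores <- lof_scores(x, k = 5) # 'k' plays the role of kn
which.max(scores)              # index of the most outlying point by LOF
```

Scores close to 1 indicate a point about as dense as its neighbours, and scores well above 1 indicate a point that is isolated relative to them.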

Usage

process_outliers(
  dat,
  target,
  ex_cols = NULL,
  kc = 3,
  kn = 5,
  x_list = NULL,
  parallel = FALSE,
  note = FALSE,
  process = TRUE,
  save_data = FALSE,
  file_name = NULL,
  dir_path = tempdir()
)

outliers_kmeans_lof(
  dat,
  x,
  target = NULL,
  kc = 3,
  kn = 5,
  note = FALSE,
  process = TRUE,
  save_data = FALSE,
  file_name = NULL,
  dir_path = tempdir()
)

Arguments

dat

Dataset with independent variables and target variable.

target

The name of the target variable.

ex_cols

A list of excluded variables. Regular expressions can also be used to match variable names. Default is NULL.

kc

Number of cluster centers for Kmeans. Default is 3.

kn

Number of neighbors for LOF. Default is 5.

x_list

Names of independent variables. Default is NULL.

parallel

Logical. If TRUE, use parallel computing. Default is FALSE.

note

Logical. If TRUE, print progress information. Default is FALSE.

process

Logical. If TRUE, outliers are treated rather than only analysed. Default is TRUE.

save_data

Logical. If TRUE, save the outliers analysis file to the folder given by dir_path. Default is FALSE.

file_name

The file name for the saved outliers analysis file. Default is NULL.

dir_path

The directory in which the outliers analysis file is saved. Default is tempdir().

x

The name of the variable to process.

Value

A data frame in which outliers have been processed for all the variables.

Examples

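# Treat outliers in the first 10,000 rows of UCICreditCard
# (columns 18 to 21 plus the target in column 26)
dat_out = process_outliers(UCICreditCard[1:10000, c(18:21, 26)],
                           target = "default.payment.next.month",
                           ex_cols = "date$", kc = 3, kn = 10,
                           parallel = FALSE, note = TRUE)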

[Package creditmodel version 1.3.1 Index]