missing_val {scorecardModelUtils}    R Documentation

Missing value imputation

Description

The function imputes missing values in the input dataset. For numerical variables, missing values can be replaced using one of four methods: 1. "mean" - the mean (simple average) of the non-missing values; 2. "median" - the median (50th percentile) of the non-missing values; 3. "mode" - the value with the highest frequency among the non-missing values; 4. a special extreme value of the user's choice, passed as an argument (-99999 is the default). For categorical variables, missing values can be replaced using one of two methods: 1. "mode" - the class with the highest frequency among the non-missing values; 2. a special class of the user's choice, passed as an argument ("missing_value" is the default). The target column remains unchanged.
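The replacement rules described above can be sketched in base R. This is only an illustration of the four numerical and two categorical strategies, not the package's internal code; the helper names mode_value, impute_numeric and impute_categorical are hypothetical, and ties for the mode are resolved here by first appearance, which is an assumption.

```r
# Illustrative base-R helpers; these are NOT the package's internals.

# Most frequent non-missing value (ties resolved by first appearance).
mode_value <- function(x) {
  x <- x[!is.na(x)]
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Replace NAs in a numeric vector using a named method or a fixed value.
impute_numeric <- function(x, method = -99999) {
  fill <- if (identical(method, "mean")) {
    mean(x, na.rm = TRUE)
  } else if (identical(method, "median")) {
    median(x, na.rm = TRUE)
  } else if (identical(method, "mode")) {
    mode_value(x)
  } else {
    method  # anything else is treated as the special value itself
  }
  x[is.na(x)] <- fill
  x
}

# Replace NAs in a character vector with the mode or a fixed class label.
impute_categorical <- function(x, method = "missing_value") {
  fill <- if (identical(method, "mode")) mode_value(x) else method
  x[is.na(x)] <- fill
  x
}

num <- c(1, 2, 2, NA, 10)
cls <- c("a", "b", "b", NA)

impute_numeric(num, "mean")      # NA replaced by 3.75
impute_numeric(num, "median")    # NA replaced by 2
impute_numeric(num, "mode")      # NA replaced by 2
impute_numeric(num)              # NA replaced by -99999
impute_categorical(cls, "mode")  # NA replaced by "b"
impute_categorical(cls)          # NA replaced by "missing_value"
```

Note that a sentinel such as -99999 keeps the "was missing" information visible to a downstream binning step, whereas mean, median or mode imputation blends the imputed rows into the observed distribution.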

Usage

missing_val(base, target, num_missing = -99999,
  cat_missing = "missing_value")

Arguments

base

input dataframe

target

column/field name of the target variable, to be passed as a string

num_missing

(optional) method for replacing missing values in numerical fields - one of "mean", "median", "mode", or a value of the user's choice (default value is -99999)

cat_missing

(optional) method for replacing missing values in categorical fields - either "mode" or a class of the user's choice (default value is "missing_value")

Value

The function returns an object of class "missing_val" which is a list containing the following components:

base

a dataframe after imputing missing values

mapping_table

a dataframe mapping each original variable to the value used to impute its missing entries (if any)
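As a rough picture of the returned object, the sketch below assembles a list with the two documented components and the class "missing_val" by hand. It does not call the package; the toy data and the mapping_table column names (variable, imputed_value) are assumptions made for illustration, and the actual layout may differ.

```r
# Hypothetical illustration of the documented return shape; the column
# names of mapping_table are assumptions, not the package's actual layout.
df <- data.frame(
  x = c(1, NA, 3),
  g = c("a", NA, "a"),
  Y = c(0, 1, 0),
  stringsAsFactors = FALSE
)

imputed <- df
imputed$x[is.na(imputed$x)] <- -99999           # numeric default
imputed$g[is.na(imputed$g)] <- "missing_value"  # categorical default

result <- structure(
  list(
    base = imputed,
    mapping_table = data.frame(
      variable = c("x", "g"),
      imputed_value = c("-99999", "missing_value"),
      stringsAsFactors = FALSE
    )
  ),
  class = "missing_val"
)

names(result)              # "base" "mapping_table"
any(is.na(result$base))    # FALSE: no missing values remain
identical(result$base$Y, df$Y)  # TRUE: the target column is untouched
```

Keeping a mapping of this kind alongside the imputed data makes it possible to apply the same replacements to new data at scoring time.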

Author(s)

Arya Poddar <aryapoddar290990@gmail.com>

Examples

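library(scorecardModelUtils)

# Build a sample dataset with a random binary target and some missing values
data <- iris
data$Species <- as.character(data$Species)
data$Y <- sample(0:1, size = nrow(data), replace = TRUE)
data[sample(seq_len(nrow(data)), size = 25), "Sepal.Length"] <- NA
data[sample(seq_len(nrow(data)), size = 10), "Species"] <- NA

# Impute with the defaults (-99999 for numeric, "missing_value" for categorical)
missing_list <- missing_val(base = data, target = "Y")
missing_list$base           # dataframe after imputation
missing_list$mapping_table  # mapping between variables and imputed values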

[Package scorecardModelUtils version 0.0.1.0 Index]