impute {cleandata}    R Documentation

Impute Missing Values

Description

impute_mode: Replaces each NA with the mode of its column.

impute_median: Replaces each NA with the median of its column.

impute_mean: Replaces each NA with the mean of its column.
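Base R has no built-in statistical mode (the base function mode() returns an object's storage mode), so the following minimal sketch shows what the three fill values look like for a single vector. The helpers stat_mode and fill_na are hypothetical names introduced for illustration; they are not part of cleandata and do not claim to reproduce its implementation. Ties for the most frequent value are resolved arbitrarily here.

```r
# Hypothetical helper, not part of cleandata: most frequent non-NA value.
stat_mode <- function(v) {
  counts <- table(v, useNA = "no")
  # names(counts) are character, so map the winner back to an original value
  winner <- names(counts)[which.max(counts)]
  v[!is.na(v) & as.character(v) == winner][1]
}

# Hypothetical helper: replace every NA in v with a single fill value.
fill_na <- function(v, value) {
  v[is.na(v)] <- value
  v
}

b <- c(6, NA, 4, 5, 6)
fill_na(b, stat_mode(b))             # fill with the mode
fill_na(b, median(b, na.rm = TRUE))  # fill with the median
fill_na(b, mean(b, na.rm = TRUE))    # fill with the mean
```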

Usage

impute_mode(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

impute_median(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

impute_mean(x, cols = colnames(x), idx = row.names(x), log = eval.parent(in_log_default))

Arguments

x

The data frame to be imputed.

cols

Names or indices of the columns of x to impute. Defaults to all columns.

idx

Names or indices of the rows of x used to compute the imputation values. Defaults to all rows. Restrict it, for example to the training rows, to prevent data leakage.
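The idea behind idx can be sketched in base R: the fill value is computed from a subset of rows only, so the remaining rows do not influence it. This is an illustration, not the package's code. The train_rows split and the toy data are assumptions, as is the choice to fill NAs in every row rather than only in the subset.

```r
# Toy data: rows 1-3 play the role of training rows, rows 4-5 of held-out rows.
d <- data.frame(B = c(2, NA, 4, 100, NA))
train_rows <- 1:3  # hypothetical split

# Compute the fill value from the training rows only, so the held-out
# value 100 does not influence it.
fill <- mean(d$B[train_rows], na.rm = TRUE)

# Apply that value to every NA in the column (an assumption of this sketch).
d$B[is.na(d$B)] <- fill
d
```

Had the mean been taken over all rows, the held-out value 100 would have pulled the fill value far away from the training data.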

log

Controls log files. To produce log files, assign to this argument, or to the variable log_arg in the parent environment (dynamic scope), a list of arguments for sink(), such as file, append, and split.
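As a rough illustration of what a list of sink() arguments does, the sketch below passes such a list to sink() directly with do.call(). How cleandata consumes the list internally is not shown on this page, so the direct call is an assumption made for demonstration, and the temporary file path is hypothetical.

```r
# A list of sink() arguments of the kind described above.
log_arg <- list(file = tempfile(fileext = ".log"), append = FALSE, split = FALSE)

do.call(sink, log_arg)        # start diverting printed output to the file
print("imputation log line")
sink()                        # stop diverting and restore console output

readLines(log_arg$file)       # inspect what was written
```

Setting split = TRUE would send the output to both the file and the console.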

Value

The data frame x with the NAs in the selected columns replaced by the imputed values.

See Also

inspect_map, sink

Examples

# refer to vignettes if you want to use log files
message('refer to vignettes if you want to use log files')

# building a data frame
A <- as.factor(c('y', 'x', 'x', 'y', 'z'))
B <- c(6, 3:6)
C <- 1:5
df <- data.frame(A, B, C)
df[3, 1] <- NA; df[2, 2] <- NA; df[5, 3] <- NA
print(df)

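# imputation
# mode of each column, computed from all rows
df0 <- impute_mode(df, cols = 1:3)
print(df0)
# mode of each column, computed from rows 1 to 3 only
df0 <- impute_mode(df, cols = 1:3, idx = 1:3)
print(df0)
# median of the numeric columns B and C
df0 <- impute_median(df, cols = 2:3)
print(df0)
# mean of the numeric columns B and C
df0 <- impute_mean(df, cols = 2:3)
print(df0)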

[Package cleandata version 0.3.0 Index]