process_truncate_by_iqr {simdata}    R Documentation

Truncate columns of a data matrix at data-matrix-specific thresholds

Description

Truncates the columns of a dataset at thresholds derived from the interquartile range.

Usage

process_truncate_by_iqr(x, truncate_multipliers = NA, only_numeric = TRUE)

Arguments

x

Matrix or data.frame.

truncate_multipliers

Vector of truncation multipliers. Either a single value, which is recycled for all columns, or a vector of length ncol(x). If an entry is NA, the corresponding column is not truncated. If the vector is named, the names must correspond to column names in x, and only the named columns are processed. See Details.

only_numeric

If TRUE and x is a data.frame, only columns of type numeric are processed. Otherwise (including whenever x is a matrix) all columns are processed.

Details

Truncation is processed as follows:

  1. Compute the 1st and 3rd quartiles, q1 and q3, of each variable in x.

  2. Multiply these quantities by the values in truncate_multipliers to obtain the lower and upper thresholds L and U. If a multiplier is NA, the corresponding variable is not truncated.

  3. Set any value smaller than L to L and any value larger than U to U.
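
The three steps above can be sketched in base R for a single column. This is only an illustration, not the package's implementation: the helper truncate_iqr_sketch is hypothetical, and the way the multiplier k enters the thresholds (L = q1 - k * IQR and U = q3 + k * IQR, with IQR = q3 - q1) is an assumption based on the usual IQR fence rule. The simdata source should be consulted for the exact formula.

```r
# Hypothetical helper, NOT part of simdata: truncate one numeric vector.
# ASSUMPTION: L = q1 - k * IQR and U = q3 + k * IQR (usual IQR fence rule).
truncate_iqr_sketch <- function(v, k) {
    # An NA multiplier means the variable is left untouched
    if (is.na(k)) return(v)

    # Step 1: 1st and 3rd quartile (default quantile type of R)
    q <- stats::quantile(v, probs = c(0.25, 0.75), na.rm = TRUE, names = FALSE)
    iqr <- q[2] - q[1]

    # Step 2: lower and upper thresholds (assumed form, see above)
    lower <- q[1] - k * iqr
    upper <- q[2] + k * iqr

    # Step 3: values below L become L, values above U become U
    pmin(pmax(v, lower), upper)
}

v <- c(1, 2, 3, 4, 100)
truncated <- truncate_iqr_sketch(v, k = 1.5)
print(truncated)
```

With k = 1.5 the quartiles of v are 2 and 4, so the assumed thresholds are -1 and 7, and only the value 100 is replaced (by 7).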

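Truncation multipliers can be specified in three ways (note that whenever only_numeric is set to TRUE, only numeric columns are affected):

  1. A single value, which is applied to all columns.

  2. An unnamed vector with one entry per column of x; each column is processed with the corresponding entry.

  3. A named vector, whose names must correspond to column names in x; only the named columns are processed.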
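
The three forms of truncate_multipliers described under Arguments (a single value, an unnamed vector with one entry per column, or a named vector) could be expanded into one multiplier per column roughly as follows. The helper resolve_multipliers_sketch is hypothetical and only illustrates the documented behaviour; it is not the package's code and ignores only_numeric for brevity.

```r
# Hypothetical helper, NOT part of simdata: expand `truncate_multipliers`
# into one multiplier per column of `x` (NA = column is not truncated).
resolve_multipliers_sketch <- function(x, truncate_multipliers) {
    cols <- colnames(x)
    out <- stats::setNames(rep(NA_real_, ncol(x)), cols)
    nms <- names(truncate_multipliers)

    if (!is.null(nms)) {
        # Named vector: names must be column names, other columns stay NA
        stopifnot(all(nms %in% cols))
        out[nms] <- truncate_multipliers
    } else if (length(truncate_multipliers) == 1) {
        # Single value: recycled for every column
        out[] <- truncate_multipliers
    } else {
        # Unnamed vector: one entry per column, in column order
        stopifnot(length(truncate_multipliers) == ncol(x))
        out[] <- truncate_multipliers
    }
    out
}

x <- data.frame(a = 1:5, b = 6:10, c = 11:15)
r_single  <- resolve_multipliers_sketch(x, 1.5)
r_unnamed <- resolve_multipliers_sketch(x, c(1.5, NA, 3))
r_named   <- resolve_multipliers_sketch(x, c(c = 2))
```

In this sketch, r_named holds NA for columns a and b, so only column c would be truncated.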

Value

Matrix or data.frame of same dimensions as input.


[Package simdata version 0.4.0 Index]