univariate {PDtoolkit}    R Documentation

Univariate analysis

Description

univariate returns univariate statistics for the risk factors supplied in the data frame db.
For numeric risk factors univariate report includes:

For categorical risk factors univariate report includes:

Usage

univariate(
  db,
  sc = c(NA, NaN, Inf, -Inf),
  sc.method = "together",
  sc.threshold = 0.2
)

Arguments

db

Data frame of risk factors supplied for univariate analysis.

sc

Vector of special case elements. Default values are c(NA, NaN, Inf, -Inf).

sc.method

Defines how special cases are treated: all together in one bin or in separate bins. Possible values are "together" (the default) and "separately".

sc.threshold

Threshold for special cases, expressed as a percentage of the total number of observations. If sc.method is set to "separately", the percentages of the individual special cases are summed up.
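
The interplay of sc, sc.method and sc.threshold can be illustrated with a small base R sketch. It does not reproduce the internals of univariate; sc_share is a hypothetical helper, and comparing the summed share against sc.threshold with > is an assumption made for illustration only.

```r
# Hypothetical helper (not part of PDtoolkit): share of observations in x
# that are special cases, either pooled or per special-case value.
sc_share <- function(x, sc = c(NA, NaN, Inf, -Inf), sc.method = "together") {
  if (sc.method == "together") {
    # %in% relies on match(), which pairs NA with NA and NaN with NaN
    return(c(SC = mean(x %in% sc)))
  }
  # one share per special-case value
  shares <- vapply(sc, function(s) mean(x %in% s), numeric(1))
  names(shares) <- paste(sc)
  shares
}

# Toy risk factor: 10 observations, 4 of them special cases
x <- c(25, 40, NA, NA, Inf, 33, NaN, 51, 62, 47)

together   <- sc_share(x)
separately <- sc_share(x, sc.method = "separately")

# Assumption: the summed share is compared against the threshold
sc.threshold <- 0.2
exceeds <- sum(separately) > sc.threshold
```

In this toy vector the pooled share and the sum of the separate shares coincide, which mirrors the statement that the per-case percentages are summed up when sc.method is "separately".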

Value

The command univariate returns a data frame with the univariate metrics described above for risk factors of class numeric, character, factor, and logical.
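
To preview which columns of a data frame fall into the supported classes before calling univariate, a check along the following lines may help. This is a sketch in base R on a made-up data frame; the exact class test that univariate performs internally is an assumption here.

```r
# Toy data frame (invented for illustration) with one column per class
db <- data.frame(age = c(23, 45, NA),
                 purpose = c("car", "tv", "car"),
                 stringsAsFactors = FALSE)
db$grade <- factor(c("a", "b", "a"))
db$tf    <- c(TRUE, FALSE, TRUE)
db$dates <- Sys.Date()  # Date column, expected to be left out

# Assumed class test: numeric, character, factor or logical
supported <- vapply(
  db,
  function(col) {
    is.numeric(col) || is.character(col) || is.factor(col) || is.logical(col)
  },
  logical(1)
)

# Names of the columns that pass the assumed test
names(db)[supported]
```

Note that is.numeric returns FALSE for Date objects, so the dates column is excluded even though dates are stored as numbers.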

Examples

suppressMessages(library(PDtoolkit))
data(gcd)
gcd$age[100:120] <- NA
gcd$age.bin <- ndr.bin(x = gcd$age, y = gcd$qual, y.type = "bina")[[2]]
gcd$age.bin <- as.factor(gcd$age.bin)
gcd$maturity.bin <- ndr.bin(x = gcd$maturity, y = gcd$qual, y.type = "bina")[[2]]
gcd$amount.bin <- ndr.bin(x = gcd$amount, y = gcd$qual, y.type = "bina")[[2]]
gcd$all.miss1 <- NaN
gcd$all.miss2 <- NA
gcd$tf <- sample(c(TRUE, FALSE), nrow(gcd), replace = TRUE)
# create a date variable to confirm that the function does not process it
gcd$dates <- Sys.Date()
str(gcd)
univariate(db = gcd)

[Package PDtoolkit version 1.2.0 Index]