R: Descriptive statistics

desc.stat {monobinShiny}

R Documentation

Descriptive statistics

Description

desc.stat returns the descriptive statistics of a numeric risk factor. The reported metrics cover mainly univariate and some bivariate analysis, which are standard steps in credit rating model development. Metrics are reported separately for the special case group (if any special cases exist) and the complete case group. The report includes:

risk.factor: Risk factor name.
type: Special case or complete case group.
bin: When sc.method is "together", bin is the same as type; otherwise each special case is reported separately.
cnt: Number of observations.
pct: Percentage of observations.
min: Minimum value.
p1, p5, p25, p50, p75, p95, p99: Percentile values.
avg: Mean value.
avg.se: Standard error of mean.
max: Maximum value.
neg: Number of negative values.
pos: Number of positive values.
cnt.outliers: Number of outliers. Records above Q75 + 1.5 * IQR or below Q25 - 1.5 * IQR, where IQR = Q75 - Q25 is the interquartile range.
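
As an illustration of a few of the metrics listed above, the following base R sketch computes them for a vector of complete cases. It is not the package's implementation: summarise_cc is a hypothetical helper, the outlier bounds assume the usual 1.5 * IQR fences on both sides, avg.se is assumed to be sd(x) / sqrt(n), and quantile() is used with its default type, which may differ from what desc.stat uses internally.

```r
# Hypothetical helper (not part of monobinShiny): a few of the reported
# metrics, computed on complete cases only.
summarise_cc <- function(x) {
  q <- quantile(x, probs = c(0.25, 0.75), names = FALSE)
  iqr <- q[2] - q[1]                 # interquartile range, Q75 - Q25
  lower <- q[1] - 1.5 * iqr          # assumed lower outlier bound
  upper <- q[2] + 1.5 * iqr          # assumed upper outlier bound
  data.frame(
    cnt = length(x),
    avg = mean(x),
    avg.se = sd(x) / sqrt(length(x)),  # assumed definition of the standard error
    neg = sum(x < 0),
    pos = sum(x > 0),
    cnt.outliers = sum(x < lower | x > upper)
  )
}

x <- c(-2, 1, 2, 3, 4, 5, 6, 7, 8, 50)
res <- summarise_cc(x)
print(res)
```

In this toy vector only the value 50 lies outside the bounds, so it is the single record counted as an outlier.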

Usage

desc.stat(x, y, sc = c(NA, NaN, Inf), sc.method = "together")

Arguments

`x`	Numeric risk factor.
`y`	Numeric target vector (binary or continuous).
`sc`	Numeric vector with special case elements. Default values are c(NA, NaN, Inf). The recommendation is to always keep the default values and add new ones if needed. If any of these values exist in x but are not defined in the sc vector, the function will report an error.
`sc.method`	Defines how special cases are treated: all together or in separate bins. Possible values are `"together"` and `"separately"`.
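
The role of sc and sc.method can be sketched in base R as follows. This is only an illustration of the idea, not the package's code: the bin labels ("special cases", "complete cases", "SC ...") are made up for this example and are not necessarily the labels desc.stat reports.

```r
# Toy risk factor with three special case values
x <- c(25, NA, 40, Inf, 33, NaN, 51)
sc <- c(NA, NaN, Inf)

# %in% matches NA with NA and NaN with NaN, so all defaults are caught
is.sc <- x %in% sc
x.cc <- x[!is.sc]   # complete cases
x.sc <- x[is.sc]    # special cases

# "together": all special cases share one bin (hypothetical label)
bin.together <- ifelse(is.sc, "special cases", "complete cases")

# "separately": one bin per distinct special case value (hypothetical labels)
bin.separately <- ifelse(is.sc, paste0("SC ", x), "complete cases")

print(table(bin.together))
print(table(bin.separately))
```

With the second labelling, NA, NaN and Inf each end up in their own bin, which corresponds to the intent of sc.method = "separately".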

Value

Data frame of descriptive statistics metrics, separately for complete and special case groups.

Examples

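suppressMessages(library(monobinShiny))
data(gcd)
# baseline report on the unmodified risk factor
desc.stat(x = gcd$age, y = gcd$qual)
# introduce special case values (NA and Inf)
gcd$age[1:10] <- NA
gcd$age[50:75] <- Inf
# report all special cases together
desc.stat(x = gcd$age, y = gcd$qual, sc.method = "together")
# report each special case separately
desc.stat(x = gcd$age, y = gcd$qual, sc.method = "separately")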

[Package monobinShiny version 0.1.0 Index]