fit_dist {SEI}    R Documentation

Fit a distribution to data

Description

Function to fit a specified distribution to a vector of data. Returns the estimated distribution and relevant goodness-of-fit statistics.

Usage

fit_dist(data, dist, n_thres = 20)

Arguments

data

numeric vector of data

dist

character string specifying the distribution, see details

n_thres

number of data points required to estimate the distribution

Details

This has been adapted from code available at https://github.com/WillemMaetens/standaRdized.

data is a numeric vector of data from which the distribution is to be estimated.

dist is the distribution to be fit to data. This must be one of 'empirical' (the empirical distribution given data), 'kde' (kernel density estimation), 'norm' (normal), 'lnorm' (log-normal), 'logis' (logistic), 'llogis' (log-logistic), 'exp' (exponential), 'gamma', or 'weibull'.

By default, dist = "empirical", in which case the distribution is estimated empirically from data. This is only recommended if there are at least 100 values in data; otherwise a warning is issued.
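
As an illustration of what an empirical distribution estimate looks like, the following minimal sketch uses base R's ecdf(). It is not taken from the package source, and fit_dist may construct its empirical distribution differently; the sample x is simulated purely for the example.

```r
# Illustration only: base R's ecdf(), not necessarily what fit_dist() does internally.
set.seed(1)
x <- rnorm(200)      # simulated sample, for illustration

F_hat <- ecdf(x)     # step function: F_hat(q) = proportion of x <= q

F_hat(0)             # estimated P(X <= 0)
F_hat(max(x))        # equals 1 at the largest observation
```

The estimate is a step function that only changes at observed values, which is one reason a reasonably large sample is advisable.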

n_thres is the minimum number of observations required to fit the distribution. The default is n_thres = 20. If the number of values in data is smaller than n_thres, an error is returned. This guards against over-fitting, which can result in distributions that do not generalise well out-of-sample.
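
A sample-size guard of this kind could be sketched as below. The helper name check_n and the error message are hypothetical, and counting values with length() is an assumption; the package's own check may differ, for example in how missing values are treated.

```r
# Hypothetical helper (not part of SEI): stop if there are too few observations.
check_n <- function(data, n_thres = 20) {
  n <- length(data)  # assumption: every element of data counts as one value
  if (n < n_thres) {
    stop("data has ", n, " values, but at least ", n_thres, " are required")
  }
  invisible(n)
}

check_n(rnorm(50))                        # passes silently
res <- try(check_n(1:5), silent = TRUE)   # fails: 5 < 20
inherits(res, "try-error")
```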

Where relevant, parameter estimation is performed using maximum likelihood estimation.
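
To make the idea of maximum likelihood estimation concrete, here is a minimal sketch for the gamma distribution using base R's optim(). This is an assumption-laden illustration, not the routine fit_dist calls internally: the log-scale parametrisation, the starting values, and the use of the default Nelder-Mead optimiser are choices made for this example only.

```r
# Illustration only: maximum likelihood for a gamma distribution with optim().
set.seed(42)
x <- rgamma(1000, shape = 3, rate = 2)   # simulated data with known parameters

# Negative log-likelihood. Parameters are optimised on the log scale so that
# shape and rate remain positive throughout the search.
nll <- function(log_par) {
  -sum(dgamma(x, shape = exp(log_par[1]), rate = exp(log_par[2]), log = TRUE))
}

fit <- optim(c(0, 0), nll)               # start at shape = 1, rate = 1
mle <- exp(fit$par)                      # back-transform to the original scale
names(mle) <- c("shape", "rate")
mle                                      # should be close to shape = 3, rate = 2
```

Minimising the negative log-likelihood is equivalent to maximising the likelihood; the log transform simply avoids the need for box constraints.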

Value

A list containing the estimated distribution function, its parameters, and Kolmogorov-Smirnov goodness-of-fit statistics.
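
For orientation, Kolmogorov-Smirnov statistics of the kind mentioned above can be computed with base R's ks.test(). The sketch below is an illustration under the assumption of a gamma model with known parameters; it does not show how fit_dist stores or names these statistics in its output.

```r
# Illustration only: Kolmogorov-Smirnov test of a sample against a gamma distribution.
set.seed(1)
x <- rgamma(500, shape = 3, rate = 2)    # simulated data

ks <- ks.test(x, "pgamma", shape = 3, rate = 2)

ks$statistic   # D: largest distance between empirical and hypothesised CDFs
ks$p.value     # p-value of the test
```

Note that when the parameters are estimated from the same data being tested, the standard Kolmogorov-Smirnov p-value is no longer exact and tends to be conservative.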

Examples

N <- 1000
shape <- 3
rate <- 2


# gamma distribution
data <- rgamma(N, shape, rate)
out <- fit_dist(data, dist = "gamma")
hist(data, breaks = 30, probability = TRUE)
lines(seq(0, 10, 0.01), dgamma(seq(0, 10, 0.01), out$params[1], out$params[2]), col = "blue")


# Weibull distribution
data <- rweibull(N, shape, 1/rate)
out <- fit_dist(data, dist = "weibull")
hist(data, breaks = 30, probability = TRUE)
lines(seq(0, 10, 0.01), dweibull(seq(0, 10, 0.01), out$params[1], out$params[2]), col = "blue")


[Package SEI version 0.1.1 Index]