tidy_distribution_comparison {TidyDensity} | R Documentation |
Compare Empirical Data to Distributions
Description
Compare an empirical data set against several candidate distributions to help identify the one that may fit best.
Usage
tidy_distribution_comparison(
.x,
.distribution_type = "continuous",
.round_to_place = 3
)
Arguments
.x |
The data set being passed to the function |
.distribution_type |
What kind of data it is; one of "continuous" or "discrete". The default is "continuous" |
.round_to_place |
How many decimal places the parameter estimates should be rounded to for distribution construction. The default is 3 |
Details
The purpose of this function is to take a provided data set and try to find the distribution that may fit it best. The .distribution_type parameter must be set to either continuous or discrete in order for the function to try the appropriate types of distributions.
The following distributions are used:
Continuous:
tidy_beta
tidy_cauchy
tidy_chisquare
tidy_exponential
tidy_gamma
tidy_logistic
tidy_lognormal
tidy_normal
tidy_pareto
tidy_uniform
tidy_weibull
Discrete:
tidy_binomial
tidy_geometric
tidy_hypergeometric
tidy_poisson
The function itself returns a list output of tibbles. Here are the tibbles that are returned:
comparison_tbl
deviance_tbl
total_deviance_tbl
aic_tbl
kolmogorov_smirnov_tbl
multi_metric_tbl
The comparison_tbl is a long tibble that lists the values of the density function against the given data.
The deviance_tbl and the total_deviance_tbl give the simple difference between the actual density and the estimated density for the given estimated distribution.
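As a rough illustration of what a "simple difference" between an empirical and an estimated density can look like, the base R sketch below compares a kernel density estimate with a fitted normal density. It is an assumption-laden example, not the TidyDensity implementation: the grid size, the normal fit and the variable names are all hypothetical choices made for illustration.

```r
# Illustrative sketch only; the package may compute its deviance differently.
x <- mtcars$mpg

# Empirical density from a kernel density estimate on a 50-point grid.
emp <- density(x, n = 50)

# Density of a normal distribution with the sample mean and standard
# deviation, evaluated on the same grid.
est <- dnorm(emp$x, mean = mean(x), sd = sd(x))

# Pointwise difference and its total over the grid.
deviance_by_point <- emp$y - est
total_deviance <- sum(deviance_by_point)

# Sum of absolute differences, so positive and negative gaps do not cancel.
total_abs_deviance <- sum(abs(deviance_by_point))
```

A total close to zero can hide large pointwise gaps of opposite sign, which is why the absolute version is shown alongside it.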
The aic_tbl provides the AIC for the likelihood of the distribution.
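For readers unfamiliar with AIC, the sketch below computes it by hand in base R as 2k - 2 log L for two candidate distributions, where k is the number of estimated parameters. The choice of a normal and an exponential fit, and the use of maximum likelihood estimates, are assumptions made for illustration; the package may estimate parameters differently.

```r
# Illustrative sketch only; not the TidyDensity implementation.
x <- mtcars$mpg
n <- length(x)

# Normal distribution: maximum likelihood estimates of mean and sd
# (the ML estimate of sd divides by n, not n - 1).
mu_hat <- mean(x)
sigma_hat <- sqrt(sum((x - mu_hat)^2) / n)
loglik_normal <- sum(dnorm(x, mean = mu_hat, sd = sigma_hat, log = TRUE))
aic_normal <- 2 * 2 - 2 * loglik_normal  # k = 2 parameters

# Exponential distribution: the ML estimate of the rate is 1 / mean.
rate_hat <- 1 / mean(x)
loglik_exp <- sum(dexp(x, rate = rate_hat, log = TRUE))
aic_exp <- 2 * 1 - 2 * loglik_exp  # k = 1 parameter

# Lower AIC indicates the better trade-off between fit and complexity.
c(normal = aic_normal, exponential = aic_exp)
```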
The kolmogorov_smirnov_tbl currently provides a two.sided ks.test of the estimated density against the empirical density.
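A two-sided Kolmogorov-Smirnov test can be run directly with ks.test() from the stats package. The sketch below tests the data against a normal distribution with parameters estimated from the same data; that particular setup is an assumption for illustration and may not match how the package builds its table.

```r
# Illustrative sketch only; not the TidyDensity implementation.
x <- mtcars$mpg

# mpg contains tied values, so ks.test() may warn about ties; the warning
# is suppressed here to keep the output short.
ks_res <- suppressWarnings(
  ks.test(x, "pnorm", mean = mean(x), sd = sd(x), alternative = "two.sided")
)

# Test statistic (largest gap between the two cumulative distributions)
# and the associated p-value.
ks_res$statistic
ks_res$p.value
```

Because the parameters are estimated from the sample being tested, the reported p-value should be read as a rough guide rather than an exact probability.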
The multi_metric_tbl summarises all of these metrics in a single tibble.
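Since the return value is a list of tibbles, individual tables can be pulled out by name. The sketch below assumes the TidyDensity package is installed and uses only the element names listed above; the object name out is hypothetical, and no column names are assumed.

```r
# Runs only if TidyDensity is available; otherwise the block is skipped.
if (requireNamespace("TidyDensity", quietly = TRUE)) {
  out <- TidyDensity::tidy_distribution_comparison(mtcars$mpg, "continuous")

  # List the tibbles contained in the result.
  print(names(out))

  # Extract single tables by name.
  print(out$aic_tbl)
  print(out$multi_metric_tbl)
}
```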
Value
An invisible list object. A tibble is printed.
Author(s)
Steven P. Sanderson II, MPH
Examples
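# Continuous case: miles per gallon from mtcars
xc <- mtcars$mpg
output_c <- tidy_distribution_comparison(xc, "continuous")
# Discrete case: the same values truncated to whole numbers
xd <- trunc(xc)
output_d <- tidy_distribution_comparison(xd, "discrete")
# Print the results
output_c
output_d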