Distribution.df {EnvStats} | R Documentation
Data Frame Summarizing Available Probability Distributions and Estimation Methods
Description
Data frame summarizing information about available probability distributions in R and the EnvStats package, and which distributions have associated functions for estimating distribution parameters.
Usage
Distribution.df
Format
A data frame with 35 rows corresponding to 35 different available probability distributions, and 25 columns containing information associated with these probability distributions.
Name
a character vector containing the name of the probability distribution (see the column labeled Name in the table below).
Type
a character vector indicating the type of distribution (see the column labeled Type in the table below). Possible values are "Finite Discrete", "Discrete", "Continuous", and "Mixed".
Support.Min
a character vector indicating the minimum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have a lower bound that depends on the value of a distribution parameter. For example, the minimum value for a Uniform distribution is given by the value of the parameter min.
Support.Max
a character vector indicating the maximum value the random variable can assume (see the column labeled Range in the table below). This is a character vector rather than a numeric vector because some distributions have an upper bound that depends on the value of a distribution parameter. For example, the maximum value for a Uniform distribution is given by the value of the parameter max.
Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) (see the column labeled Estimation Method(s) in the table below). Possible values include "mle" (maximum likelihood), "mme" (method of moments), "mmue" (method of moments based on the unbiased estimate of variance), "mvue" (minimum variance unbiased), "qmle" (quasi-mle), etc., or some combination of these. When a single estimator qualifies as more than one kind, a slash (/) joins all the methods it covers. For example, for the Binomial distribution, the sample proportion is the maximum likelihood, method of moments, and minimum variance unbiased estimator, so this method is denoted as "mle/mme/mvue". See the help files for the specific function listed under Estimating Distribution Parameters for an explanation of each of these estimation methods.
Quantile.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution quantiles. For many distributions, these are the same as Estimation.Method(s). See the help files for the specific function listed under Estimating Distribution Quantiles for an explanation of each of these estimation methods.
Prediction.Interval.Method(s)
a character vector indicating the names of the methods available to create prediction intervals. See the help files for the specific function listed under Prediction Intervals for an explanation of each of these estimation methods.
Singly.Censored.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I singly-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.
Multiply.Censored.Estimation.Method(s)
a character vector indicating the names of the methods available to estimate the distribution parameter(s) for Type I multiply-censored data. See the help files for the specific function listed under Estimating Distribution Parameters in the help file for Censored Data for an explanation of each of these estimation methods.
Number.parameters
a numeric vector indicating the number of parameters associated with the distribution (see the column labeled Parameters in the table below).
Parameter.1
the columns labeled Parameter.1, Parameter.2, ..., Parameter.5 are character vectors containing the names of the distribution parameters (see the column labeled Parameters in the table below). If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1, ..., Parameter.5 are empty. For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3, Parameter.4, and Parameter.5 are empty.
Parameter.2
see Parameter.1
Parameter.3
see Parameter.1
Parameter.4
see Parameter.1
Parameter.5
see Parameter.1
Parameter.1.Min
the columns labeled Parameter.1.Min, Parameter.2.Min, ..., Parameter.5.Min are character vectors containing the minimum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).
These are character vectors rather than numeric vectors because some parameters have a lower bound of 0 but must be strictly greater than 0 (e.g., the parameter sd for the Normal distribution), in which case the lower bound is .Machine$double.eps, which may vary from machine to machine. Also, some parameters have a lower bound that depends on the value of another parameter. For example, the parameter max for a Uniform distribution is bounded below by the value of the parameter min.
If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1.Min, ..., Parameter.5.Min have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Min, Parameter.4.Min, and Parameter.5.Min contain NAs.
Parameter.2.Min
see Parameter.1.Min
Parameter.3.Min
see Parameter.1.Min
Parameter.4.Min
see Parameter.1.Min
Parameter.5.Min
see Parameter.1.Min
Parameter.1.Max
the columns labeled Parameter.1.Max, Parameter.2.Max, ..., Parameter.5.Max are character vectors containing the maximum values that can be assumed by the distribution parameters (see the column labeled Parameter Range(s) in the table below).
These are character vectors rather than numeric vectors because some parameters have an upper bound that depends on the value of another parameter. For example, the parameter min for a Uniform distribution is bounded above by the value of the parameter max.
If a distribution has n parameters and n < 5, then the columns labeled Parameter.n+1.Max, ..., Parameter.5.Max have the missing value code (NA). For example, the Normal distribution has only two parameters associated with it (mean and sd), so the fields in Parameter.3.Max, Parameter.4.Max, and Parameter.5.Max contain NAs.
Parameter.2.Max
see Parameter.1.Max
Parameter.3.Max
see Parameter.1.Max
Parameter.4.Max
see Parameter.1.Max
Parameter.5.Max
see Parameter.1.Max
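The character-valued bounds described above can be turned into numbers once parameter values are known. The following is a minimal sketch, not part of EnvStats: resolve_bound is a hypothetical helper, and it assumes a stored bound is either a parameter name (such as "min"), a numeric literal (such as "-Inf"), or an R expression (such as ".Machine$double.eps"). The exact strings stored in Distribution.df are not confirmed here.

```r
# Hypothetical helper (not an EnvStats function): evaluate a bound that is
# stored as text, looking up parameter names in the list 'params'.
resolve_bound <- function(bound, params = list()) {
  # Unused parameter slots are documented as NA; pass them through.
  if (is.na(bound)) return(NA_real_)
  # Parse the text as an R expression and evaluate it against 'params'.
  as.numeric(eval(parse(text = bound), envir = params))
}

# Assumed example: a Uniform distribution with min = 2 and max = 5.
unif_params <- list(min = 2, max = 5)

# The lower bound of 'max' is the value of 'min'.
lower_of_max <- resolve_bound("min", unif_params)

# A strictly positive parameter is bounded below by machine epsilon.
sd_lower <- resolve_bound(".Machine$double.eps")

# An unused slot stays missing.
unused <- resolve_bound(NA_character_)
```

Because eval(parse(text = ...)) executes arbitrary text, a helper like this is only appropriate for trusted strings such as those shipped with a package.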
Details
The table below summarizes the probability distributions available in
R and EnvStats. For each distribution, there are four associated
functions for computing density values, cumulative probabilities,
quantiles, and random numbers. These functions are named dabb, pabb,
qabb, and rabb, where abb is the abbreviated name of the distribution
(see table below). They are described in the help file with the name
of the distribution (see the first column of the table below). For
example, the help file for Beta describes the behavior of dbeta, pbeta,
qbeta, and rbeta.
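As a brief illustration of this naming scheme, the sketch below calls the four Beta functions from base R (abb = "beta"). The shape values 2 and 5 and the evaluation point 0.3 are arbitrary choices made for this example.

```r
# Arbitrary example parameters for a Beta distribution.
shape1 <- 2
shape2 <- 5

# d: density evaluated at x = 0.3
dens <- dbeta(0.3, shape1 = shape1, shape2 = shape2)

# p: cumulative probability P(X <= 0.3)
cum_prob <- pbeta(0.3, shape1 = shape1, shape2 = shape2)

# q: quantile function, the inverse of pbeta, so this recovers 0.3
quant <- qbeta(cum_prob, shape1 = shape1, shape2 = shape2)

# r: five random draws (seed set only for reproducibility)
set.seed(42)
draws <- rbeta(5, shape1 = shape1, shape2 = shape2)
```

The same pattern applies to the other abbreviations in the table, for example dnorm, pnorm, qnorm, and rnorm for the Normal distribution.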
For most distributions, there is also an associated function for
estimating the distribution parameters, and the form of the names of
these functions is eabb, where abb is the abbreviated name of the
distribution (see table below). All of these functions are listed in
the help file Estimating Distribution Parameters. For example, the
function ebeta estimates the shape parameters of a Beta distribution
based on a random sample of observations from this distribution.
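A minimal sketch of the ebeta example follows. It assumes the EnvStats package is installed and that the fitted object exposes its estimates in a parameters component; the simulated sample and its true shape values are arbitrary choices for illustration.

```r
# Simulate a sample from a Beta distribution with arbitrary shapes.
set.seed(1)
x <- rbeta(50, shape1 = 2, shape2 = 5)

# Run only if EnvStats is available, so the sketch is safe to source.
if (requireNamespace("EnvStats", quietly = TRUE)) {
  # "mle" is one of the methods listed for Beta in Table 1b.
  fit <- EnvStats::ebeta(x, method = "mle")
  # Assumed structure: estimates of shape1 and shape2 in fit$parameters.
  print(fit$parameters)
}
```

See the help file for ebeta for the full set of arguments and the structure of the returned object.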
For some distributions, there are functions to estimate distribution
parameters based on Type I censored data. The form of the names of
these functions is eabbSinglyCensored for singly censored data and
eabbMultiplyCensored for multiply censored data. All of these functions
are listed under the heading Estimating Distribution Parameters in the
help file Censored Data.
Table 1a. Available Distributions: Name, Abbreviation, Type, and Range
Name | Abbreviation | Type | Range |
Beta | beta | Continuous | [0, 1] |
Binomial | binom | Finite Discrete | [0, size] (integer) |
Cauchy | cauchy | Continuous | (-\infty, \infty) |
Chi | chi | Continuous | [0, \infty) |
Chi-square | chisq | Continuous | [0, \infty) |
Exponential | exp | Continuous | [0, \infty) |
Extreme Value | evd | Continuous | (-\infty, \infty) |
F | f | Continuous | [0, \infty) |
Gamma | gamma | Continuous | [0, \infty) |
Gamma (Alternative) | gammaAlt | Continuous | [0, \infty) |
Generalized Extreme Value | gevd | Continuous | (-\infty, \infty) for shape = 0; (-\infty, location + \frac{scale}{shape}] for shape > 0; [location + \frac{scale}{shape}, \infty) for shape < 0 |
Geometric | geom | Discrete | [0, \infty) (integer) |
Hypergeometric | hyper | Finite Discrete | [0, min(k,m)] (integer) |
Logistic | logis | Continuous | (-\infty, \infty) |
Lognormal | lnorm | Continuous | [0, \infty) |
Lognormal (Alternative) | lnormAlt | Continuous | [0, \infty) |
Lognormal Mixture | lnormMix | Continuous | [0, \infty) |
Lognormal Mixture (Alternative) | lnormMixAlt | Continuous | [0, \infty) |
Three-Parameter Lognormal | lnorm3 | Continuous | [threshold, \infty) |
Truncated Lognormal | lnormTrunc | Continuous | [min, max] |
Truncated Lognormal (Alternative) | lnormTruncAlt | Continuous | [min, max] |
Negative Binomial | nbinom | Discrete | [0, \infty) (integer) |
Normal | norm | Continuous | (-\infty, \infty) |
Normal Mixture | normMix | Continuous | (-\infty, \infty) |
Truncated Normal | normTrunc | Continuous | [min, max] |
Pareto | pareto | Continuous | [location, \infty) |
Poisson | pois | Discrete | [0, \infty) (integer) |
Student's t | t | Continuous | (-\infty, \infty) |
Triangular | tri | Continuous | [min, max] |
Uniform | unif | Continuous | [min, max] |
Weibull | weibull | Continuous | [0, \infty) |
Wilcoxon Rank Sum | wilcox | Finite Discrete | [0, m n] (integer) |
Zero-Modified Lognormal (Delta) | zmlnorm | Mixed | [0, \infty) |
Zero-Modified Lognormal (Delta) (Alternative) | zmlnormAlt | Mixed | [0, \infty) |
Zero-Modified Normal | zmnorm | Mixed | (-\infty, \infty) |
Table 1b. Available Distributions: Name, Parameters, Parameter Default Values, Parameter Ranges, Estimation Method(s)
Name | Parameter(s) | Default Value(s) | Parameter Range(s) | Estimation Method(s) |
Beta | shape1 | | (0, \infty) | mle, mme, mmue |
 | shape2 | | (0, \infty) | |
 | ncp | 0 | (0, \infty) | |
Binomial | size | | [0, \infty) | mle/mme/mvue |
 | prob | | [0, 1] | |
Cauchy | location | 0 | (-\infty, \infty) | |
 | scale | 1 | (0, \infty) | |
Chi | df | | (0, \infty) | |
Chi-square | df | | (0, \infty) | |
 | ncp | 0 | (-\infty, \infty) | |
Exponential | rate | 1 | (0, \infty) | mle/mme |
Extreme Value | location | 0 | (-\infty, \infty) | mle, mme, mmue, pwme |
 | scale | 1 | (0, \infty) | |
F | df1 | | (0, \infty) | |
 | df2 | | (0, \infty) | |
 | ncp | 0 | (0, \infty) | |
Gamma | shape | | (0, \infty) | mle, bcmle, mme, mmue |
 | scale | 1 | (0, \infty) | |
Gamma (Alternative) | mean | | (0, \infty) | mle, bcmle, mme, mmue |
 | cv | 1 | (0, \infty) | |
Generalized Extreme Value | location | 0 | (-\infty, \infty) | mle, pwme, tsoe |
 | scale | 1 | (0, \infty) | |
 | shape | 0 | (-\infty, \infty) | |
Geometric | prob | | (0, 1) | mle/mme, mvue |
Hypergeometric | m | | [0, \infty) | mle, mvue |
 | n | | [0, \infty) | |
 | k | | [1, m+n] | |
Logistic | location | 0 | (-\infty, \infty) | mle, mme, mmue |
 | scale | 1 | (0, \infty) | |
Lognormal | meanlog | 0 | (-\infty, \infty) | mle/mme, mvue |
 | sdlog | 1 | (0, \infty) | |
Lognormal (Alternative) | mean | exp(1/2) | (0, \infty) | mle, mme, mmue, mvue, qmle |
 | cv | sqrt(exp(1)-1) | (0, \infty) | |
Lognormal Mixture | meanlog1 | 0 | (-\infty, \infty) | |
 | sdlog1 | 1 | (0, \infty) | |
 | meanlog2 | 0 | (-\infty, \infty) | |
 | sdlog2 | 1 | (0, \infty) | |
 | p.mix | 0.5 | [0, 1] | |
Lognormal Mixture (Alternative) | mean1 | exp(1/2) | (0, \infty) | |
 | cv1 | sqrt(exp(1)-1) | (0, \infty) | |
 | mean2 | exp(1/2) | (0, \infty) | |
 | cv2 | sqrt(exp(1)-1) | (0, \infty) | |
 | p.mix | 0.5 | [0, 1] | |
Three-Parameter Lognormal | meanlog | 0 | (-\infty, \infty) | lmle, mme, mmue, mmme, royston.skew, zero.skew |
 | sdlog | 1 | (0, \infty) | |
 | threshold | 0 | (-\infty, \infty) | |
Truncated Lognormal | meanlog | 0 | (-\infty, \infty) | |
 | sdlog | 1 | (0, \infty) | |
 | min | 0 | [0, max) | |
 | max | Inf | (min, \infty) | |
Truncated Lognormal (Alternative) | mean | exp(1/2) | (0, \infty) | |
 | cv | sqrt(exp(1)-1) | (0, \infty) | |
 | min | 0 | [0, max) | |
 | max | Inf | (min, \infty) | |
Negative Binomial | size | | [1, \infty) | mle/mme, mvue |
 | prob | | (0, 1] | |
 | mu | | (0, \infty) | |
Normal | mean | 0 | (-\infty, \infty) | mle/mme, mvue |
 | sd | 1 | (0, \infty) | |
Normal Mixture | mean1 | 0 | (-\infty, \infty) | |
 | sd1 | 1 | (0, \infty) | |
 | mean2 | 0 | (-\infty, \infty) | |
 | sd2 | 1 | (0, \infty) | |
 | p.mix | 0.5 | [0, 1] | |
Truncated Normal | mean | 0 | (-\infty, \infty) | |
 | sd | 1 | (0, \infty) | |
 | min | -Inf | (-\infty, max) | |
 | max | Inf | (min, \infty) | |
Pareto | location | | (0, \infty) | lse, mle |
 | shape | 1 | (0, \infty) | |
Poisson | lambda | | (0, \infty) | mle/mme/mvue |
Student's t | df | | (0, \infty) | |
 | ncp | 0 | (-\infty, \infty) | |
Triangular | min | 0 | (-\infty, max) | |
 | max | 1 | (min, \infty) | |
 | mode | 0.5 | (min, max) | |
Uniform | min | 0 | (-\infty, max) | mle, mme, mmue |
 | max | 1 | (min, \infty) | |
Weibull | shape | | (0, \infty) | mle, mme, mmue |
 | scale | 1 | (0, \infty) | |
Wilcoxon Rank Sum | m | | [1, \infty) | |
 | n | | [1, \infty) | |
Zero-Modified Lognormal (Delta) | meanlog | 0 | (-\infty, \infty) | mvue |
 | sdlog | 1 | (0, \infty) | |
 | p.zero | 0.5 | [0, 1] | |
Zero-Modified Lognormal (Delta) (Alternative) | mean | exp(1/2) | (0, \infty) | mvue |
 | cv | sqrt(exp(1)-1) | (0, \infty) | |
 | p.zero | 0.5 | [0, 1] | |
Zero-Modified Normal | mean | 0 | (-\infty, \infty) | mvue |
 | sd | 1 | (0, \infty) | |
 | p.zero | 0.5 | [0, 1] | |
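The Estimation Method(s) column can also be queried programmatically. The sketch below is illustrative only: dists_with_estimators is a hypothetical helper, the two-row toy data frame copies the Beta and Cauchy entries from Table 1b, and treating both NA and the empty string as "no method available" is an assumption about how Distribution.df stores blank cells.

```r
# Hypothetical helper (not an EnvStats function): return the names of the
# distributions that list at least one parameter-estimation method.
dists_with_estimators <- function(dist_df) {
  # The column name contains parentheses, so it is accessed with [[ ]].
  methods <- as.character(dist_df[["Estimation.Method(s)"]])
  # Assumption: a blank cell is stored as either NA or an empty string.
  has_method <- !is.na(methods) & nzchar(trimws(methods))
  as.character(dist_df[["Name"]][has_method])
}

# Toy input with two rows copied from Table 1b, for illustration only.
toy <- data.frame(
  Name = c("Beta", "Cauchy"),
  est = c("mle, mme, mmue", ""),
  stringsAsFactors = FALSE
)
names(toy)[2] <- "Estimation.Method(s)"
toy_result <- dists_with_estimators(toy)

# With EnvStats installed, the same helper can be applied to the real data.
if (requireNamespace("EnvStats", quietly = TRUE)) {
  suppressPackageStartupMessages(library(EnvStats))
  print(dists_with_estimators(Distribution.df))
}
```

The same pattern should carry over to the other method columns, such as Quantile.Estimation.Method(s), by changing the column name.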
Source
The EnvStats package.
References
Millard, S.P. (2013). EnvStats: An R Package for Environmental Statistics. Springer, New York. https://link.springer.com/book/10.1007/978-1-4614-8456-1.