dmixnorm {KScorrect} | R Documentation |
The Normal Mixture Distribution
Description
Density, distribution function, quantile function, and random generation for
a univariate (one-dimensional) distribution composed of a mixture of normal
distributions with means equal to mean
, standard deviations equal to
sd
, and mixing proportion of the components equal to pro
.
Usage
dmixnorm(x, mean, sd, pro)
pmixnorm(q, mean, sd, pro)
qmixnorm(p, mean, sd, pro, expand = 1)
rmixnorm(n, mean, sd, pro)
Arguments
x |
Vector of quantiles. |
mean |
Vector of means, one for each component. |
sd |
Vector of standard deviations, one for each component. If a single value is provided, an equal-variance mixture model is implemented. Must be non-negative. |
pro |
Vector of mixing proportions, one for each component. If missing, an equal-proportion model is implemented, with a warning. If proportions do not sum to unity, they are rescaled to do so. Must be non-negative. |
q |
Vector of quantiles. |
p |
Vector of probabilities. |
expand |
Value to expand the range of probabilities for quantile
approximation. |
n |
Number of observations. |
Details
These functions use, modify, and wrap around those from the
mclust
package, especially dens
, and
sim
. Functions are slightly faster than the
corresponding mclust
functions when used with univariate
distributions.
Unlike mclust
, which primarily focuses on parameter estimation based
on mixture samples, the functions here are modified to calculate PDFs,
CDFs, approximate quantiles, and random numbers for mixture distributions
with user-specified parameters. The functions are written to emulate the
syntax of other R distribution functions (e.g.,
Normal
).
The number of mixture components (argument G
in mclust
) is
specified from the length of the mean
vector. If a single sd
value is provided, an equal-variance mixture model (modelNames="E"
in mclust
) is implemented; if multiple values are provided, a
variable-variance model (modelNames="V"
in mclust
) is
implemented. If mixing proportion pro
is missing, all components are
assigned equal mixing proportions, with a warning. Mixing proportions are
rescaled to sum to unity. If the lengths of supplied means, standard
deviations, and mixing proportions conflict, an error is called.
Analytical solutions are not available to calculate a quantile function for
all combinations of mixture parameters. qmixnorm
approximates the
quantile function using a spline function calculated from cumulative
density functions for the specified mixture distribution. Quantile values
for probabilities near zero and one are approximated by taking a randomly
generated sample (with sample size equal to the product of 1000 and the
number of mixture components), and expanding that range positively and
negatively by a multiple (specified by (default) expand = 1
) of the
observed range in the random sample. In cases where the distribution range
is large (such as when mixture components are discrete or there are large
distances between components), resulting extreme probability values will be
very close to zero or one and can result in non-calculable (NaN
)
quantiles (and a warning). Use of other expand
values (especially
expand < 1.0
that expand the ranges by smaller multiples) often will
yield improved approximations. Note that expand
values equal to or
close to 0 may result in inaccurate approximation of extreme quantiles. In
situations requiring extreme quantile values, it is recommended that the
largest expand
value that does not result in a non-calculable
quantile (i.e., no warning called) be used. See examples
for
confirmation that approximations are accurate, comparing the approximate
quantiles from a single 'mixture' distribution to those calculated for the
same distribution using qnorm
, and demonstrating cases in which
using non-default expand
values will allow correct approximation of
quantiles.
Value
dmixnorm
gives the density, pmixnorm
gives the
distribution function, qmixnorm
approximates the quantile function,
and rmixnorm
generates random numbers.
Author(s)
Phil Novack-Gottshall pnovack-gottshall@ben.edu and Steve Wang scwang@swarthmore.edu, based on functions written by Luca Scrucca.
See Also
Distributions
for other standard distributions,
and mclust::dens
, sim
, and
cdfMclust
for alternative density, quantile, and
random number functions for multivariate mixture distributions.
Examples
# Mixture of two normal distributions
mean <- c(3, 6)
pro <- c(.25, .75)
sd <- c(.5, 1)
x <- rmixnorm(n=5000, mean=mean, pro=pro, sd=sd)
hist(x, n=20, main="random bimodal sample")
## Not run:
# Requires functions from the 'mclust' package
require(mclust)
# Confirm 'rmixnorm' above produced specified model
mod <- mclust::Mclust(x)
mod # Best model (correctly) has two-components with unequal variances
mod$parameters # and approximately same parameters as specified above
sd^2 # Note reports var (sigma-squared) instead of sd used above
## End(Not run)
# Density, distribution, and quantile functions
plot(seq(0, 10, .1), dmixnorm(seq(0, 10, .1), mean=mean, sd=sd, pro=pro),
type="l", main="Normal mixture density")
plot(seq(0, 10, .1), pmixnorm(seq(0, 10, .1), mean=mean, sd=sd, pro=pro),
type="l", main="Normal mixture cumulative")
plot(stats::ppoints(100), qmixnorm(stats::ppoints(100), mean=mean, sd=sd, pro=pro),
type="l", main="Normal mixture quantile")
# Any number of mixture components are allowed
plot(seq(0, 50, .01), pmixnorm(seq(0, 50, .01), mean=1:50, sd=.05, pro=rep(1, 50)),
type="l", main="50-component normal mixture cumulative")
# 'expand' can be specified to prevent non-calculable quantiles:
q1 <- qmixnorm(stats::ppoints(30), mean=c(1, 20), sd=c(1, 1), pro=c(1, 1))
q1 # Calls a warning because of NaNs
# Reduce 'expand'. (Values < 0.8 allow correct approximation)
q2 <- qmixnorm(stats::ppoints(30), mean=c(1, 20), sd=c(1, 1), pro=c(1, 1), expand=.5)
plot(stats::ppoints(30), q2, type="l", main="Quantile with reduced range")
## Not run:
# Requires functions from the 'mclust' package
# Confirmation that qmixnorm approximates correct solution
# (single component 'mixture' should mimic qnorm):
x <- rmixnorm(n=5000, mean=0, pro=1, sd=1)
mpar <- mclust::Mclust(x)$param
approx <- qmixnorm(p=ppoints(100), mean=mpar$mean, pro=mpar$pro,
sd=sqrt(mpar$variance$sigmasq))
known <- qnorm(p=ppoints(100), mean=mpar$mean, sd=sqrt(mpar$variance$sigmasq))
cor(approx, known) # Approximately the same
plot(approx, main="Quantiles for (unimodal) normal")
lines(known)
legend("topleft", legend=c("known", "approximation"), pch=c(NA,1),
lty=c(1, NA), bty="n")
## End(Not run)