ezmnorm {EnvStats}  R Documentation 
Estimate the mean and standard deviation of a zero-modified normal distribution, and optionally construct a confidence interval for the mean.
ezmnorm(x, method = "mvue", ci = FALSE, ci.type = "two-sided",
ci.method = "normal.approx", conf.level = 0.95)
x 
numeric vector of observations. 
method 
character string specifying the method of estimation. Currently, the only possible
value is "mvue" (minimum variance unbiased; the default).
ci 
logical scalar indicating whether to compute a confidence interval for the
mean. The default value is ci=FALSE.
ci.type 
character string indicating what kind of confidence interval to compute. The
possible values are "two-sided" (the default), "lower", and "upper".
ci.method 
character string indicating what method to use to construct the confidence interval
for the mean. Currently the only possible value is "normal.approx" (the default).
conf.level 
a scalar between 0 and 1 indicating the confidence level of the confidence interval.
The default value is conf.level=0.95.
If x contains any missing (NA), undefined (NaN) or infinite (Inf, -Inf) values, they will be removed prior to performing the estimation.
Let \underline{x} = (x_1, x_2, \ldots, x_n) be a vector of n observations from a zero-modified normal distribution with parameters mean=\mu, sd=\sigma, and p.zero=p.
Let r denote the number of observations in \underline{x} that are equal to 0, and order the observations so that x_1, x_2, \ldots, x_r denote the r zero observations, and x_{r+1}, x_{r+2}, \ldots, x_n denote the n-r nonzero observations.
Note that \mu is not the mean of the zero-modified normal distribution; it is the mean of the normal part of the distribution. Similarly, \sigma is not the standard deviation of the zero-modified normal distribution; it is the standard deviation of the normal part of the distribution.
Let \gamma and \delta denote the mean and standard deviation of the overall zero-modified normal distribution. Aitchison (1955) shows that:
\gamma = (1 - p) \mu \;\;\;\; (1)
\delta^2 = (1 - p) \sigma^2 + p (1 - p) \mu^2 \;\;\;\; (2)
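As a quick numerical check of equations (1) and (2), take the parameter values used in the first example on this page (\mu = 4, \sigma = 2, p = 0.5). The arithmetic below is only an illustration of the two formulas:

```latex
\gamma = (1 - 0.5)(4) = 2
\delta^2 = (1 - 0.5)(2^2) + (0.5)(1 - 0.5)(4^2) = 2 + 4 = 6, \quad \delta = \sqrt{6} \approx 2.449
```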
Estimation
Minimum Variance Unbiased Estimation (method="mvue")
Aitchison (1955) shows that the minimum variance unbiased estimators (mvue's) of \gamma and \delta^2 are:
\hat{\gamma}_{mvue} = \bar{x} \;\;\;\; (3)
\hat{\delta}^2_{mvue} = \begin{cases}
\frac{n-r-1}{n-1} (s^*)^2 + \frac{r}{n} \left( \frac{n-r}{n-1} \right) (\bar{x}^*)^2 & \text{if } r < n - 1, \\
x_n^2 / n & \text{if } r = n - 1, \\
0 & \text{if } r = n
\end{cases} \;\;\;\; (4)

where
\bar{x} = \frac{1}{n} \sum_{i=1}^n x_i \;\;\;\; (5)
\bar{x}^* = \frac{1}{n-r} \sum_{i=r+1}^n x_i \;\;\;\; (6)
(s^*)^2 = \frac{1}{n-r-1} \sum_{i=r+1}^n (x_i - \bar{x}^*)^2 \;\;\;\; (7)
Note that the quantity in equation (5) is the sample mean of all observations
(including 0 values), the quantity in equation (6) is the sample mean of all nonzero
observations, and the quantity in equation (7) is the sample variance of all
nonzero observations. Also note that for r=n-1 or r=n, the estimator of \delta^2 is the sample variance for all observations (including 0 values).
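The estimators in equations (3) through (7) can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the EnvStats implementation: the helper name `zmnorm_mvue` and the data vector are hypothetical, and the helper requires at least two finite observations.

```r
# Hypothetical helper (not part of EnvStats): MVUEs from equations (3)-(7).
zmnorm_mvue <- function(x) {
  x <- x[is.finite(x)]            # drop NA, NaN, Inf and -Inf
  n <- length(x)
  stopifnot(n >= 2)
  r <- sum(x == 0)                # number of zero observations
  nonzero <- x[x != 0]
  gamma_hat <- mean(x)            # equations (3) and (5)
  if (r < n - 1) {
    xbar_star <- mean(nonzero)    # equation (6)
    s2_star <- var(nonzero)       # equation (7)
    delta2_hat <- (n - r - 1) / (n - 1) * s2_star +
      (r / n) * ((n - r) / (n - 1)) * xbar_star^2
  } else if (r == n - 1) {
    delta2_hat <- nonzero^2 / n   # single nonzero observation
  } else {
    delta2_hat <- 0               # all observations are zero
  }
  c(mean.zmnorm = gamma_hat, sd.zmnorm = sqrt(delta2_hat))
}

x <- c(0, 0, 3.1, 4.7, 0, 5.2, 2.9, 0)   # made-up data for illustration
est <- zmnorm_mvue(x)
est
```

Expanding the sums shows that the first case of equation (4) is algebraically the same as the ordinary sample variance of all observations, so `est["sd.zmnorm"]` should agree with `sd(x)` up to rounding.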
Confidence Intervals
Based on Normal Approximation (ci.method="normal.approx")
An approximate (1-\alpha)100\% confidence interval for \gamma is constructed based on the assumption that the estimator of \gamma is approximately normally distributed. Aitchison (1955) shows that
Var(\hat{\gamma}_{mvue}) = Var(\bar{x}) = \frac{\delta^2}{n} \;\;\;\; (8)
Thus, an approximate two-sided (1-\alpha)100\% confidence interval for \gamma is constructed as:
[ \hat{\gamma}_{mvue} - t_{n-2, 1-\alpha/2} \frac{\hat{\delta}_{mvue}}{\sqrt{n}}, \; \hat{\gamma}_{mvue} + t_{n-2, 1-\alpha/2} \frac{\hat{\delta}_{mvue}}{\sqrt{n}} ] \;\;\;\; (9)
where t_{\nu, p} is the p'th quantile of Student's t-distribution with \nu degrees of freedom.
One-sided confidence intervals are computed in a similar fashion.
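The two-sided interval in equation (9) can be sketched in base R as below. This is a minimal illustration rather than the package's own code: the helper name `zmnorm_ci` is hypothetical, and it takes the estimates as inputs instead of computing them. The values plugged in are the mean.zmnorm and sd.zmnorm estimates reported for the zinc example in the examples on this page.

```r
# Hypothetical helper (not part of EnvStats): two-sided interval from equation (9).
zmnorm_ci <- function(gamma_hat, delta_hat, n, conf.level = 0.95) {
  alpha <- 1 - conf.level
  t_crit <- qt(1 - alpha / 2, df = n - 2)      # t quantile with n - 2 degrees of freedom
  half_width <- t_crit * delta_hat / sqrt(n)   # estimated standard error from equation (8)
  c(LCL = gamma_hat - half_width, UCL = gamma_hat + half_width)
}

# Estimates reported for the zinc example (sample size 40)
ci <- zmnorm_ci(gamma_hat = 5.9455, delta_hat = 6.123235, n = 40)
ci
```

Because the inputs are the rounded values printed in the example output, the limits should match the reported LCL and UCL only up to rounding.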
a list of class "estimate" containing the estimated parameters and other information. See estimate.object for details.
The component called parameters is a numeric vector with the following estimated parameters:
Parameter Name  Explanation
mean            mean of the normal (Gaussian) part of the distribution.
sd              standard deviation of the normal (Gaussian) part of the distribution.
p.zero          probability that an observation will be 0.
mean.zmnorm     mean of the overall zero-modified normal distribution.
sd.zmnorm       standard deviation of the overall zero-modified normal distribution.
The zero-modified normal distribution is sometimes used to model chemical concentrations for which some observations are reported as “Below Detection Limit”. See, for example, USEPA (1992c, pp. 27–34). In most cases, however, the zero-modified lognormal (delta) distribution will be more appropriate, since chemical concentrations are bounded below at 0 (e.g., Gilliom and Helsel, 1986; Owen and DeRouen, 1980).
Once you estimate the parameters of the zero-modified normal distribution, it is often useful to characterize the uncertainty in the estimate of the mean. This is done with a confidence interval.
One way to try to assess whether a zero-modified lognormal (delta), zero-modified normal, censored normal, or censored lognormal is the best model for the data is to construct both censored and detects-only probability plots (see qqPlotCensored).
Steven P. Millard (EnvStats@ProbStatInfo.com)
Aitchison, J. (1955). On the Distribution of a Positive Random Variable Having a Discrete Probability Mass at the Origin. Journal of the American Statistical Association 50, 901–908.
Gilliom, R.J., and D.R. Helsel. (1986). Estimation of Distributional Parameters for Censored Trace Level Water Quality Data: 1. Estimation Techniques. Water Resources Research 22, 135–146.
Owen, W., and T. DeRouen. (1980). Estimation of the Mean for Lognormal Data Containing Zeros and Left-Censored Values, with Applications to the Measurement of Worker Exposure to Air Contaminants. Biometrics 36, 707–719.
USEPA (1992c). Statistical Analysis of Ground-Water Monitoring Data at RCRA Facilities: Addendum to Interim Final Guidance. Office of Solid Waste, Permits and State Programs Division, US Environmental Protection Agency, Washington, D.C.
ZeroModifiedNormal, Normal, ezmlnorm, ZeroModifiedLognormal, estimate.object.
# Generate 100 observations from a zero-modified normal distribution
# with mean=4, sd=2, and p.zero=0.5, then estimate the parameters.
# According to equations (1) and (2) above, the overall mean is
# mean.zmnorm=2 and the overall standard deviation is sd.zmnorm=sqrt(6).
# (Note: the call to set.seed simply allows you to reproduce this example.)
set.seed(250)
dat <- rzmnorm(100, mean = 4, sd = 2, p.zero = 0.5)
ezmnorm(dat, ci = TRUE)
#Results of Distribution Parameter Estimation
#
#
#Assumed Distribution: Zero-Modified Normal
#
#Estimated Parameter(s): mean = 4.037732
# sd = 1.917004
# p.zero = 0.450000
# mean.zmnorm = 2.220753
# sd.zmnorm = 2.465829
#
#Estimation Method: mvue
#
#Data: dat
#
#Sample Size: 100
#
#Confidence Interval for: mean.zmnorm
#
#Confidence Interval Method: Normal Approximation
# (t Distribution)
#
#Confidence Interval Type: two-sided
#
#Confidence Level: 95%
#
#Confidence Interval: LCL = 1.731417
# UCL = 2.710088
#
# Following Example 9 on page 34 of USEPA (1992c), compute an
# estimate of the mean of the zinc data, assuming a
# zero-modified normal distribution. The data are stored in
# EPA.92c.zinc.df.
head(EPA.92c.zinc.df)
# Zinc.orig Zinc Censored Sample Well
#1 <7 7.00 TRUE 1 1
#2 11.41 11.41 FALSE 2 1
#3 <7 7.00 TRUE 3 1
#4 <7 7.00 TRUE 4 1
#5 <7 7.00 TRUE 5 1
#6 10.00 10.00 FALSE 6 1
New.Zinc <- EPA.92c.zinc.df$Zinc
New.Zinc[EPA.92c.zinc.df$Censored] <- 0
ezmnorm(New.Zinc, ci = TRUE)
#Results of Distribution Parameter Estimation
#
#
#Assumed Distribution: Zero-Modified Normal
#
#Estimated Parameter(s): mean = 11.891000
# sd = 1.594523
# p.zero = 0.500000
# mean.zmnorm = 5.945500
# sd.zmnorm = 6.123235
#
#Estimation Method: mvue
#
#Data: New.Zinc
#
#Sample Size: 40
#
#Confidence Interval for: mean.zmnorm
#
#Confidence Interval Method: Normal Approximation
# (t Distribution)
#
#Confidence Interval Type: two-sided
#
#Confidence Level: 95%
#
#Confidence Interval: LCL = 3.985545
# UCL = 7.905455
#
# Clean up
rm(dat, New.Zinc)