dpareto {PtProcess} | R Documentation |
Pareto and Tapered Pareto Distributions
Description
Density, cumulative probability, quantiles and random number generation for the Pareto and tapered Pareto distributions with shape parameter \lambda
, tapering parameter \theta
and range a \le x < \infty
; and log-likelihood of the tapered Pareto distribution.
Usage
dpareto(x, lambda, a, log=FALSE)
ppareto(q, lambda, a, lower.tail=TRUE, log.p=FALSE)
qpareto(p, lambda, a, lower.tail=TRUE, log.p=FALSE)
rpareto(n, lambda, a)
dtappareto(x, lambda, theta, a, log=FALSE)
ltappareto(data, lambda, theta, a)
ptappareto(q, lambda, theta, a, lower.tail=TRUE, log.p=FALSE)
qtappareto(p, lambda, theta, a, lower.tail=TRUE, log.p=FALSE,
tol=1e-8)
rtappareto(n, lambda, theta, a)
ltappareto(data, lambda, theta, a)
Arguments
x , q |
vector of quantiles. |
p |
vector of probabilities. |
data |
vector of sample data. |
n |
number of observations to simulate. |
lambda |
shape parameter, see Details below. |
theta |
tapering parameter, see Details below.. |
a |
the random variable takes values on the interval |
log , log.p |
logical; if |
lower.tail |
logical; if |
tol |
convergence criteria for the Newton Raphson algorithm for solving the quantiles of the tapered Pareto distribution. |
Details
For all functions except ltappareto
, arguments lambda
and theta
can either be scalars or vectors of the same length as x
, p
, or q
. If a scalar, then this value is assumed to hold over all cases. If a vector, then the values are assumed to have a one to one relationship with the values in x
, p
, or q
. The argument a
is a scalar.
In the case of ltappareto
, all data
are assumed to be drawn from the same distribution and hence lambda
, theta
and a
are all scalars.
Let Y
be an exponential random variable with parameter \lambda > 0
. Then the distribution function of Y
is
F_Y(y) = \Pr\{Y < y \} = 1 - \exp(-\lambda y),
and the density function is
f_Y(y) = \lambda \exp(-\lambda y).
Further, the mean and variance of the distribution of Y
is 1/\lambda
and 1/\lambda^2
, respectively.
Now transform Y
as
X = a \exp(Y),
where a>0
. Then X
is a Pareto random variable with shape parameter \lambda
and distribution function
F_X(x) = \Pr\{X < x \} = 1 - \left( \frac{a}{x} \right)^\lambda,
where a \le x < \infty
, and density function
f_X(x) = \frac{\lambda}{a} \left( \frac{a}{x} \right)^{\lambda+1}.
We simulate the Pareto deviates by generating exponential deviates, and then transforming as described above.
As above, let X
be Pareto with shape parameter \lambda
, and define W - a
to be exponential with parameter 1/\theta
, i.e.
\Pr\{X > x\} = \left( \frac{a}{x} \right)^\lambda
and
\Pr\{W > w\} = \exp\left( \frac{a - w}{\theta} \right),
where a \le w < \infty
. Say we sample one independent value from each of the distributions X
and W
, then
\Pr\{X > z\ \&\ W > z\} = \Pr\{X > z\} \Pr\{ W > z\} = \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
We say that Z
has a tapered Pareto distribution if it has the above distribution, i.e.
F_Z(z) = \Pr\{Z < z\} = 1- \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
The above relationship shows that a tapered Pareto deviate can be simulated by generating independent values of X
and W
, and then letting Z = \min(X, W)
. This minimum has the effect of “tapering” the tail of the Pareto distribution.
The tapered Pareto variable Z
has density
f_Z(z) = \left( \frac{\lambda}{z} + \frac{1}{\theta} \right) \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).
Given a sample of data z_1, z_2, \cdots, z_n
, we write the log-likelihood as
\log L = \sum_{i=1}^n \log f_Z(z_i).
Hence the gradients are calculated as
\frac{\partial \log L}{\partial \lambda} = \theta \sum_{i=1}^n \frac{1}{\lambda \theta + z_i} - \sum_{i=1}^n \log(z_i/a)
and
\frac{\partial \log L}{\partial \theta} = \frac{-1}{\theta} \sum_{i=1}^n \frac{z_i}{\lambda \theta + z_i} - \frac{1}{\theta^2} \sum_{i=1}^n (a - z_i).
Further, the Hessian is calculated using
\frac{\partial^2 \log L}{\partial \lambda^2} = -\theta^2 \sum_{i=1}^n \frac{1}{(\lambda \theta + z_i)^2},
\frac{\partial^2 \log L}{\partial \theta^2} = \frac{1}{\theta^2} \sum_{i=1}^n \frac{z_i(2\lambda\theta + z_i)}{(\lambda \theta + z_i)^2} - \frac{2}{\theta^3} \sum_{i=1}^n (a - z_i),
and
\frac{\partial^2 \log L}{\partial \theta \, \partial \lambda} = \frac{\partial^2 \log L}{\partial \lambda \, \partial \theta} = \sum_{i=1}^n \frac{z_i}{(\lambda \theta + z_i)^2}.
See the section “Seismological Context” (below), which outlines its application in Seismology.
Value
dpareto
and dtappareto
give the densities; ppareto
and ptappareto
give the distribution functions; qpareto
and qtappareto
give the quantile functions; and rpareto
and rtappareto
generate random deviates.
ltappareto
returns the log-likelihood of a sample using the tapered Pareto distribution. It also calculates, using analytic expressions (see “Details”), the derivatives and Hessian which are attached to the log-likelihood value as the attributes "gradient"
and "hessian"
, respectively.
Seismological Context
The Gutenberg-Richter (GR) Law says that if we plot the base 10 logarithm of the number of events with magnitude greater than M
(vertical axis) against M
(horizontal axis), there should be a straight line. This is equivalent to magnitudes having an exponential distribution.
Assume that the magnitude cutoff is M_0
, and let Y = M - M_0
. Given that Y
has an exponential distribution with parameter \lambda
, it follows that
\log_{10} \left( 1 - F_Y(y) \right) = \frac{-\lambda y}{\log_e 10}.
The coefficient \lambda/(\log_e 10)
is often referred to as the b
-value, and its negative value is the slope of the line in the GR plot.
Now define S
as
S = 10^{\gamma (M - M_0)} = 10^{\gamma Y}.
When \gamma = 0.75
, S
is the “stress”; and when \gamma = 1.5
, S
is the “seismic moment”. Still assuming that Y
is exponential with parameter \lambda
, then Y \gamma \log_e 10
is also exponential with parameter \lambda/(\gamma \log_e 10)
. Hence, by noting that S
can be rewritten as
S = \exp\{ Y \gamma \log_e 10 \},
it is seen that S
is Pareto with parameter \lambda/(\gamma \log_e 10)
, and 1 \le S < \infty
.
While the empirical distribution of magnitudes appears to follow an exponential distribution for smaller events, it provides a poor approximation for larger events. This is because it is not physically possible to have events with magnitudes much greater than about 9.5. Consequently, the tail of the Pareto distribution will also be too long. Hence the tapered Pareto distribution provides a more realistic description.
See Also
See dexp
for the exponential distribution. Generalisations of the exponential distribution are the gamma distribution dgamma
and the Weibull distribution dweibull
.
See the topic distribution
for examples of estimating parameters.
Examples
# Simulate and plot histogram with density for Pareto Distribution
a0 <- 2
lambda0 <- 2
x <- rpareto(1000, lambda=lambda0, a=a0)
x0 <- seq(a0, max(x)+0.1, length=100)
hist(x, freq=FALSE, breaks=x0, xlim=range(x0),
main="Pareto Distribution")
points(x0, dpareto(x0, lambda0, a0), type="l", col="red")
#-----------------------------------------------
# Calculate probabilities and quantiles for Pareto Distribution
a0 <- 2
lambda0 <- 2
prob <- ppareto(seq(a0, 8), lambda0, a0)
quan <- qpareto(prob, lambda0, a0)
print(quan)
#-----------------------------------------------
# Simulate and plot histogram with density for tapered Pareto Distribution
a0 <- 2
lambda0 <- 2
theta0 <- 3
x <- rtappareto(1000, lambda=lambda0, theta=theta0, a=a0)
x0 <- seq(a0, max(x)+0.1, length=100)
hist(x, freq=FALSE, breaks=x0, xlim=range(x0),
main="Tapered Pareto Distribution")
points(x0, dtappareto(x0, lambda0, theta0, a0), type="l", col="red")
#-----------------------------------------------
# Calculate probabilities and quantiles for tapered Pareto Distribution
a0 <- 2
lambda0 <- 2
theta0 <- 3
prob <- ptappareto(seq(a0, 8), lambda0, theta0, a0)
quan <- qtappareto(prob, lambda0, theta0, a0)
print(quan)
#-----------------------------------------------
# Calculate log-likelihood for tapered Pareto Distribution
# note the Hessian and gradient attributes
a0 <- 2
lambda0 <- 2
theta0 <- 3
x <- rtappareto(1000, lambda=lambda0, theta=theta0, a=a0)
LL <- ltappareto(x, lambda=lambda0, theta=theta0, a=a0)
print(LL)