dpareto {PtProcess}R Documentation

Pareto and Tapered Pareto Distributions

Description

Density, cumulative probability, quantiles and random number generation for the Pareto and tapered Pareto distributions with shape parameter λ\lambda, tapering parameter θ\theta and range ax<a \le x < \infty; and log-likelihood of the tapered Pareto distribution.

Usage

dpareto(x, lambda, a, log=FALSE)
ppareto(q, lambda, a, lower.tail=TRUE, log.p=FALSE)
qpareto(p, lambda, a, lower.tail=TRUE, log.p=FALSE)
rpareto(n, lambda, a)

dtappareto(x, lambda, theta, a, log=FALSE)
ltappareto(data, lambda, theta, a)
ptappareto(q, lambda, theta, a, lower.tail=TRUE, log.p=FALSE)
qtappareto(p, lambda, theta, a, lower.tail=TRUE, log.p=FALSE,
           tol=1e-8)
rtappareto(n, lambda, theta, a)

ltappareto(data, lambda, theta, a)

Arguments

x, q

vector of quantiles.

p

vector of probabilities.

data

vector of sample data.

n

number of observations to simulate.

lambda

shape parameter, see Details below.

theta

tapering parameter, see Details below..

a

the random variable takes values on the interval ax<a \le x < \infty. This is a scalar and is assumed to be a constant for all values in a given function call.

log, log.p

logical; if TRUE, probabilities p are given as log(p).

lower.tail

logical; if TRUE (default), probabilities are Pr{Xx}\Pr\{X \le x\}, otherwise, Pr{X>x}\Pr\{X > x\}.

tol

convergence criteria for the Newton Raphson algorithm for solving the quantiles of the tapered Pareto distribution.

Details

For all functions except ltappareto, arguments lambda and theta can either be scalars or vectors of the same length as x, p, or q. If a scalar, then this value is assumed to hold over all cases. If a vector, then the values are assumed to have a one to one relationship with the values in x, p, or q. The argument a is a scalar.

In the case of ltappareto, all data are assumed to be drawn from the same distribution and hence lambda, theta and a are all scalars.

Let YY be an exponential random variable with parameter λ>0\lambda > 0. Then the distribution function of YY is

FY(y)=Pr{Y<y}=1exp(λy), F_Y(y) = \Pr\{Y < y \} = 1 - \exp(-\lambda y),

and the density function is

fY(y)=λexp(λy). f_Y(y) = \lambda \exp(-\lambda y).

Further, the mean and variance of the distribution of YY is 1/λ1/\lambda and 1/λ21/\lambda^2, respectively.

Now transform YY as

X=aexp(Y), X = a \exp(Y),

where a>0a>0. Then XX is a Pareto random variable with shape parameter λ\lambda and distribution function

FX(x)=Pr{X<x}=1(ax)λ, F_X(x) = \Pr\{X < x \} = 1 - \left( \frac{a}{x} \right)^\lambda,

where ax<a \le x < \infty, and density function

fX(x)=λa(ax)λ+1. f_X(x) = \frac{\lambda}{a} \left( \frac{a}{x} \right)^{\lambda+1}.

We simulate the Pareto deviates by generating exponential deviates, and then transforming as described above.

As above, let XX be Pareto with shape parameter λ\lambda, and define WaW - a to be exponential with parameter 1/θ1/\theta, i.e.

Pr{X>x}=(ax)λ \Pr\{X > x\} = \left( \frac{a}{x} \right)^\lambda

and

Pr{W>w}=exp(awθ), \Pr\{W > w\} = \exp\left( \frac{a - w}{\theta} \right),

where aw<a \le w < \infty. Say we sample one independent value from each of the distributions XX and WW, then

Pr{X>z & W>z}=Pr{X>z}Pr{W>z}=(az)λexp(azθ). \Pr\{X > z\ \&\ W > z\} = \Pr\{X > z\} \Pr\{ W > z\} = \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).

We say that ZZ has a tapered Pareto distribution if it has the above distribution, i.e.

FZ(z)=Pr{Z<z}=1(az)λexp(azθ). F_Z(z) = \Pr\{Z < z\} = 1- \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).

The above relationship shows that a tapered Pareto deviate can be simulated by generating independent values of XX and WW, and then letting Z=min(X,W)Z = \min(X, W). This minimum has the effect of “tapering” the tail of the Pareto distribution.

The tapered Pareto variable ZZ has density

fZ(z)=(λz+1θ)(az)λexp(azθ). f_Z(z) = \left( \frac{\lambda}{z} + \frac{1}{\theta} \right) \left( \frac{a}{z} \right)^\lambda \exp\left( \frac{a - z}{\theta} \right).

Given a sample of data z1,z2,,znz_1, z_2, \cdots, z_n, we write the log-likelihood as

logL=i=1nlogfZ(zi). \log L = \sum_{i=1}^n \log f_Z(z_i).

Hence the gradients are calculated as

logLλ=θi=1n1λθ+zii=1nlog(zi/a) \frac{\partial \log L}{\partial \lambda} = \theta \sum_{i=1}^n \frac{1}{\lambda \theta + z_i} - \sum_{i=1}^n \log(z_i/a)

and

logLθ=1θi=1nziλθ+zi1θ2i=1n(azi). \frac{\partial \log L}{\partial \theta} = \frac{-1}{\theta} \sum_{i=1}^n \frac{z_i}{\lambda \theta + z_i} - \frac{1}{\theta^2} \sum_{i=1}^n (a - z_i).

Further, the Hessian is calculated using

2logLλ2=θ2i=1n1(λθ+zi)2, \frac{\partial^2 \log L}{\partial \lambda^2} = -\theta^2 \sum_{i=1}^n \frac{1}{(\lambda \theta + z_i)^2},

2logLθ2=1θ2i=1nzi(2λθ+zi)(λθ+zi)22θ3i=1n(azi), \frac{\partial^2 \log L}{\partial \theta^2} = \frac{1}{\theta^2} \sum_{i=1}^n \frac{z_i(2\lambda\theta + z_i)}{(\lambda \theta + z_i)^2} - \frac{2}{\theta^3} \sum_{i=1}^n (a - z_i),

and

2logLθλ=2logLλθ=i=1nzi(λθ+zi)2. \frac{\partial^2 \log L}{\partial \theta \, \partial \lambda} = \frac{\partial^2 \log L}{\partial \lambda \, \partial \theta} = \sum_{i=1}^n \frac{z_i}{(\lambda \theta + z_i)^2}.

See the section “Seismological Context” (below), which outlines its application in Seismology.

Value

dpareto and dtappareto give the densities; ppareto and ptappareto give the distribution functions; qpareto and qtappareto give the quantile functions; and rpareto and rtappareto generate random deviates.

ltappareto returns the log-likelihood of a sample using the tapered Pareto distribution. It also calculates, using analytic expressions (see “Details”), the derivatives and Hessian which are attached to the log-likelihood value as the attributes "gradient" and "hessian", respectively.

Seismological Context

The Gutenberg-Richter (GR) Law says that if we plot the base 10 logarithm of the number of events with magnitude greater than MM (vertical axis) against MM (horizontal axis), there should be a straight line. This is equivalent to magnitudes having an exponential distribution.

Assume that the magnitude cutoff is M0M_0, and let Y=MM0Y = M - M_0. Given that YY has an exponential distribution with parameter λ\lambda, it follows that

log10(1FY(y))=λyloge10. \log_{10} \left( 1 - F_Y(y) \right) = \frac{-\lambda y}{\log_e 10}.

The coefficient λ/(loge10)\lambda/(\log_e 10) is often referred to as the bb-value, and its negative value is the slope of the line in the GR plot.

Now define SS as

S=10γ(MM0)=10γY. S = 10^{\gamma (M - M_0)} = 10^{\gamma Y}.

When γ=0.75\gamma = 0.75, SS is the “stress”; and when γ=1.5\gamma = 1.5, SS is the “seismic moment”. Still assuming that YY is exponential with parameter λ\lambda, then Yγloge10Y \gamma \log_e 10 is also exponential with parameter λ/(γloge10)\lambda/(\gamma \log_e 10). Hence, by noting that SS can be rewritten as

S=exp{Yγloge10}, S = \exp\{ Y \gamma \log_e 10 \},

it is seen that SS is Pareto with parameter λ/(γloge10)\lambda/(\gamma \log_e 10), and 1S<1 \le S < \infty.

While the empirical distribution of magnitudes appears to follow an exponential distribution for smaller events, it provides a poor approximation for larger events. This is because it is not physically possible to have events with magnitudes much greater than about 9.5. Consequently, the tail of the Pareto distribution will also be too long. Hence the tapered Pareto distribution provides a more realistic description.

See Also

See dexp for the exponential distribution. Generalisations of the exponential distribution are the gamma distribution dgamma and the Weibull distribution dweibull.

See the topic distribution for examples of estimating parameters.

Examples

#    Simulate and plot histogram with density for Pareto Distribution

a0 <- 2
lambda0 <- 2
x <- rpareto(1000, lambda=lambda0, a=a0)
x0 <- seq(a0, max(x)+0.1, length=100)
hist(x, freq=FALSE, breaks=x0, xlim=range(x0),
     main="Pareto Distribution")
points(x0, dpareto(x0, lambda0, a0), type="l", col="red")

#-----------------------------------------------
#    Calculate probabilities and quantiles for Pareto Distribution

a0 <- 2
lambda0 <- 2
prob <- ppareto(seq(a0, 8), lambda0, a0)
quan <- qpareto(prob, lambda0, a0)
print(quan)

#-----------------------------------------------
#    Simulate and plot histogram with density for tapered Pareto Distribution

a0 <- 2
lambda0 <- 2
theta0 <- 3
x <- rtappareto(1000, lambda=lambda0, theta=theta0, a=a0)
x0 <- seq(a0, max(x)+0.1, length=100)
hist(x, freq=FALSE, breaks=x0, xlim=range(x0),
     main="Tapered Pareto Distribution")
points(x0, dtappareto(x0, lambda0, theta0, a0), type="l", col="red")

#-----------------------------------------------
#    Calculate probabilities and quantiles for tapered Pareto Distribution

a0 <- 2
lambda0 <- 2
theta0 <- 3
prob <- ptappareto(seq(a0, 8), lambda0, theta0, a0)
quan <- qtappareto(prob, lambda0, theta0, a0)
print(quan)

#-----------------------------------------------
#    Calculate log-likelihood for tapered Pareto Distribution
#    note the Hessian and gradient attributes

a0 <- 2
lambda0 <- 2
theta0 <- 3
x <- rtappareto(1000, lambda=lambda0, theta=theta0, a=a0)
LL <- ltappareto(x, lambda=lambda0, theta=theta0, a=a0)
print(LL)


[Package PtProcess version 3.3-16 Index]