families {aster2} R Documentation

Families for Aster Models

Description

Families known to the package. These functions construct simple family specifications used in specifying aster models. Statistical properties of these families are described in the Details section.

Usage

fam.bernoulli()
fam.poisson()
fam.zero.truncated.poisson()
fam.normal.location.scale()
fam.multinomial(dimension)

Arguments

dimension

the dimension (number of categories) for the multinomial distribution.

Details

Currently implemented families are

"bernoulli"

Bernoulli (binomial with sample size one). The distribution of any zero-or-one-valued random variable $Y$, which is the canonical statistic. The mean value parameter is

$$\mu = E(Y) = \Pr(Y = 1).$$

The canonical parameter is $\theta = \log(\mu) - \log(1 - \mu)$, also called logit of $\mu$. The cumulant function is

$$c(\theta) = \log(1 + e^\theta).$$

This distribution has degenerate limiting distributions. The lower limit as $\theta \to - \infty$ is the distribution concentrated at zero, having cumulant function which is the constant function everywhere equal to zero. The upper limit as $\theta \to + \infty$ is the distribution concentrated at one, having cumulant function which is the identity function satisfying $c(\theta) = \theta$ for all $\theta$.

For predecessor (sample size) $n$, the successor is the sum of $n$ independent and identically distributed (IID) Bernoulli random variables, that is, binomial with sample size $n$. The mean value parameter is $n$ times the mean value parameter for sample size one; the cumulant function is $n$ times the cumulant function for sample size one; the canonical parameter is the same for all sample sizes.
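
The Bernoulli formulas can be checked numerically. The following is a minimal sketch in base R; the helper names `bernoulli_cumulant` and `logit` are illustrative assumptions and are not functions of this package. It relies on the standard exponential family fact that the derivative of the cumulant function is the mean value parameter, and it checks that the sample size $n$ density agrees with `dbinom`.

```r
# Base R only. The helper names below are illustrative, not aster2 functions.
bernoulli_cumulant <- function(theta) log1p(exp(theta))  # c(theta) = log(1 + e^theta)
logit <- function(mu) log(mu) - log(1 - mu)              # canonical parameter

mu <- 0.3
theta <- logit(mu)

# Central-difference approximation to the derivative c'(theta),
# which should reproduce the mean value parameter mu.
h <- 1e-6
dc <- (bernoulli_cumulant(theta + h) - bernoulli_cumulant(theta - h)) / (2 * h)
mean_matches <- isTRUE(all.equal(dc, mu, tolerance = 1e-6))

# Sample size n: the log density is log(choose(n, y)) + y * theta - n * c(theta),
# which should agree with the binomial probabilities from dbinom().
n <- 5
y <- 0:n
p <- exp(lchoose(n, y) + y * theta - n * bernoulli_cumulant(theta))
binomial_matches <- isTRUE(all.equal(p, dbinom(y, size = n, prob = mu)))
```

Both `mean_matches` and `binomial_matches` should be `TRUE` if the formulas above are transcribed correctly.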

"poisson"

Poisson. The mean value parameter $\mu$ is the mean of the Poisson distribution. The canonical parameter is $\theta = \log(\mu)$. The cumulant function is

$$c(\theta) = e^\theta.$$

This distribution has a degenerate limiting distribution. The lower limit as $\theta \to - \infty$ is the distribution concentrated at zero, having cumulant function which is the constant function everywhere equal to zero. There is no upper limit because the canonical statistic is unbounded above.

For predecessor (sample size) $n$, the successor is the sum of $n$ IID Poisson random variables, that is, Poisson with mean $n \mu$. The mean value parameter is $n$ times the mean value parameter for sample size one; the cumulant function is $n$ times the cumulant function for sample size one; the canonical parameter is the same for all sample sizes.

"zero.truncated.poisson"

Poisson conditioned on being greater than zero. Let $m$ be the mean of the corresponding untruncated Poisson distribution. Then the canonical parameters for both truncated and untruncated distributions are the same $\theta = \log(m)$. The mean value parameter for the zero-truncated Poisson distribution is

$$\mu = \frac{m}{1 - e^{- m}}$$

and the cumulant function is

$$c(\theta) = m + \log(1 - e^{- m}),$$

where $m$ is as defined above, so $m = e^\theta$.

This distribution has a degenerate limiting distribution. The lower limit as $\theta \to - \infty$ is the distribution concentrated at one, having cumulant function which is the identity function satisfying $c(\theta) = \theta$ for all $\theta$. There is no upper limit because the canonical statistic is unbounded above.

For predecessor (sample size) $n$, the successor is the sum of $n$ IID zero-truncated Poisson random variables, which is not a brand-name distribution. The mean value parameter is $n$ times the mean value parameter for sample size one; the cumulant function is $n$ times the cumulant function for sample size one; the canonical parameter is the same for all sample sizes.
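
As a rough check of the zero-truncated Poisson formulas, the sketch below (base R; the helper names `ztp_cumulant` and `ztp_mean` are hypothetical and not part of this package) builds the probability mass function from the exponential family form $\exp(y \theta - c(\theta)) / y!$ and compares it with a Poisson distribution conditioned on being positive. It also looks at the lower limit, where the mean should approach one and $c(\theta) - \theta$ should approach zero. The sum over $y$ is truncated at 200, an arbitrary cutoff that is ample for the mean used here.

```r
# Base R only. Helper names are illustrative, not aster2 functions.
# expm1() is used so that 1 - exp(-m) stays accurate when m is tiny.
ztp_cumulant <- function(theta) {
  m <- exp(theta)
  m + log(-expm1(-m))        # c(theta) = m + log(1 - e^(-m))
}
ztp_mean <- function(theta) {
  m <- exp(theta)
  m / (-expm1(-m))           # mu = m / (1 - e^(-m))
}

m <- 2.5
theta <- log(m)

# Probabilities on y = 1, ..., 200 from the exponential family form.
y <- 1:200
p <- exp(y * theta - ztp_cumulant(theta) - lfactorial(y))

# Poisson probabilities conditioned on y > 0, for comparison.
p_cond <- dpois(y, lambda = m) / (1 - dpois(0, lambda = m))

pmf_matches  <- isTRUE(all.equal(p, p_cond))
mean_matches <- isTRUE(all.equal(sum(y * p), ztp_mean(theta)))

# Lower limit: for very negative theta the mean is close to one
# and the cumulant function is close to the identity function.
theta_low <- -20
limit_mean_gap     <- ztp_mean(theta_low) - 1
limit_cumulant_gap <- ztp_cumulant(theta_low) - theta_low
```

Both gaps in the last two lines should be tiny positive numbers, consistent with the limit distribution concentrated at one.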

"normal.location.scale"

The distribution of a normal random variable $X$ with unknown mean $m$ and unknown variance $v$. Thought of as an exponential family, this is a two-parameter family, hence must have a two-dimensional canonical statistic $Y = (X, X^2)$. The canonical parameter vector $\theta$ has components

$$\theta_1 = \frac{m}{v}$$

and

$$\theta_2 = - \frac{1}{2 v}.$$

The value of $\theta_1$ is unrestricted, but $\theta_2$ must be strictly negative. The mean value parameter vector $\mu$ has components

$$\mu_1 = m = - \frac{\theta_1}{2 \theta_2}$$

and

$$\mu_2 = v + m^2 = - \frac{1}{2 \theta_2} + \frac{\theta_1^2}{4 \theta_2^2}.$$

The cumulant function is

$$c(\theta) = - \frac{\theta_1^2}{4 \theta_2} + \frac{1}{2} \log\left(- \frac{1}{2 \theta_2}\right).$$

This distribution has no degenerate limiting distributions, because the canonical statistic is a continuous random vector so the boundary of its support has probability zero.

For predecessor (sample size) $n$, the successor is the sum of $n$ IID random vectors $(X_i, X_i^2)$, where each $X_i$ is normal with mean $m$ and variance $v$, and this is not a brand-name multivariate distribution (the first component of the sum is normal, the second component noncentral chi-square, and the components are not independent). The mean value parameter vector is $n$ times the mean value parameter vector for sample size one; the cumulant function is $n$ times the cumulant function for sample size one; the canonical parameter vector is the same for all sample sizes.
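
The maps between $(m, v)$, $\theta$, and $\mu$ for the normal location-scale family can be traced with a small example. This is a sketch in base R under the formulas stated above; the function names `normal_theta`, `normal_mu`, and `normal_cumulant` are assumptions made for illustration and do not come from this package. The numerical gradient of the cumulant function is compared with the mean value parameter vector.

```r
# Base R only. Helper names are illustrative, not aster2 functions.
normal_theta <- function(m, v) c(m / v, -1 / (2 * v))
normal_mu <- function(theta) {
  c(-theta[1] / (2 * theta[2]),
    -1 / (2 * theta[2]) + theta[1]^2 / (4 * theta[2]^2))
}
normal_cumulant <- function(theta) {
  stopifnot(theta[2] < 0)    # theta_2 must be strictly negative
  -theta[1]^2 / (4 * theta[2]) + 0.5 * log(-1 / (2 * theta[2]))
}

m <- 1.5
v <- 4
theta <- normal_theta(m, v)  # components m / v and -1 / (2 v)
mu <- normal_mu(theta)       # components m and v + m^2

# Central-difference gradient of the cumulant function at theta.
h <- 1e-6
grad <- sapply(1:2, function(i) {
  e <- replace(numeric(2), i, h)
  (normal_cumulant(theta + e) - normal_cumulant(theta - e)) / (2 * h)
})

mu_matches_moments <- isTRUE(all.equal(mu, c(m, v + m^2)))
grad_matches_mu    <- isTRUE(all.equal(grad, mu, tolerance = 1e-6))
```

With $m = 1.5$ and $v = 4$ the canonical parameter vector is $(0.375, -0.125)$ and the mean value parameter vector is $(1.5, 6.25)$.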

"multinomial"

Multinomial with sample size one. The distribution of any random vector $Y$ having all components zero except for one component which is one ($Y$ is the canonical statistic vector). The mean value parameter is the vector $\mu = E(Y)$ having components

$$\mu_i = E(Y_i) = \Pr(Y_i = 1).$$

The mean value parameter vector $\mu$ is given as a function of the canonical parameter vector $\theta$ by

$$\mu_i = \frac{e^{\theta_i}}{\sum_{j = 1}^d e^{\theta_j}},$$

where $d$ is the dimension of $Y$ and $\theta$ and $\mu$. This transformation is not one-to-one; adding the same number to each component of $\theta$ does not change the value of $\mu$. The cumulant function is

$$c(\theta) = \log\left(\sum_{j = 1}^d e^{\theta_j}\right).$$

This distribution is degenerate. The sum of the components of the canonical statistic is equal to one with probability one, which implies the nonidentifiability of the $d$-dimensional canonical parameter vector mentioned above. Hence one parameter (at least) is always constrained to be zero in fitting an aster model with a multinomial family.

This distribution has many degenerate limiting distributions. For any vector $\delta$ the limit of distributions having canonical parameter vectors $\theta + s \delta$ as $s \to \infty$ exists and is another multinomial distribution (the limit distribution in the direction $\delta$). Let $A$ be the set of $i$ such that $\delta_i = \max(\delta)$, where $\max(\delta)$ denotes the maximum over the components of $\delta$. Then the limit distribution in the direction $\delta$ has components $Y_i$ of the canonical statistic for $i \notin A$ concentrated at zero. The cumulant function of this degenerate distribution is

$$c(\theta) = \log\left(\sum_{j \in A} e^{\theta_j}\right).$$

The canonical parameters $\theta_j$ for $j \notin A$ are not identifiable, and one other canonical parameter is not identifiable because of the constraint that the sum of the components of the canonical statistic is equal to one with probability one.

For predecessor (sample size) $n$, the successor is the sum of $n$ IID multinomial-sample-size-one random vectors, that is, multinomial with sample size $n$. The mean value parameter is $n$ times the mean value parameter for sample size one; the cumulant function is $n$ times the cumulant function for sample size one; the canonical parameter is the same for all sample sizes.
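
The nonidentifiability and the limit distributions of the multinomial family can be seen in a short numerical sketch. The code below is base R only; `multinomial_mu` and `multinomial_cumulant` are hypothetical helper names, not functions of this package, and the step size 50 along $\delta$ is an arbitrary stand-in for $s \to \infty$. It checks that shifting every component of $\theta$ leaves $\mu$ unchanged, and that moving far in a direction $\delta$ gives a mean vector that is zero outside $A$ and matches the formula for $\mu$ restricted to $A$.

```r
# Base R only. Helper names are illustrative, not aster2 functions.
multinomial_mu <- function(theta) {
  w <- exp(theta - max(theta))   # subtracting the max avoids overflow
  w / sum(w)
}
multinomial_cumulant <- function(theta) {
  mx <- max(theta)
  mx + log(sum(exp(theta - mx))) # log of the sum of e^theta_j
}

theta <- c(0.2, -1.0, 0.5, 0.0)
mu <- multinomial_mu(theta)

# Adding the same number to each component of theta does not change mu;
# the cumulant function shifts by that same number.
shift <- 3.7
shift_invariant <- isTRUE(all.equal(multinomial_mu(theta + shift), mu))
cumulant_shift  <- multinomial_cumulant(theta + shift) - multinomial_cumulant(theta)

# Limit in the direction delta: A is the set of indices where delta is maximal.
delta <- c(1, 0, 1, -2)
A <- which(delta == max(delta))
mu_far <- multinomial_mu(theta + 50 * delta)

# Mean vector of the limit distribution: zero outside A,
# and the formula for mu applied to the components in A.
mu_limit <- numeric(length(theta))
mu_limit[A] <- multinomial_mu(theta[A])
limit_matches <- isTRUE(all.equal(mu_far, mu_limit))
```

Here $A = \{1, 3\}$, so the second and fourth components of `mu_far` are negligible and the remaining mass is split according to $\theta_1$ and $\theta_3$ alone.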

Value

a list of class "astfam" giving the name of the family and the values of any hyperparameters.

Examples

fam.bernoulli()
fam.multinomial(4)

[Package aster2 version 0.3 Index]