prodist {distributions3} | R Documentation |

Generic function with methods for various model classes for extracting
fitted (in-sample) or predicted (out-of-sample) probability `distributions3`
objects.

```
prodist(object, ...)
## S3 method for class 'lm'
prodist(object, ..., sigma = "ML")
## S3 method for class 'glm'
prodist(object, ..., dispersion = NULL)
```

`object` |
A model object. |

`...` |
Arguments passed on to methods, typically for calling the
underlying |

`sigma` |
character or numeric or |

`dispersion` |
character or numeric or |

To facilitate making probabilistic forecasts based on regression and time
series model objects, the function `prodist` extracts fitted or
predicted probability `distribution` objects. Currently, methods are
provided for objects fitted by `lm`, `glm`, and `arima` in base R as
well as `glm.nb` from the MASS package and
`hurdle`/`zeroinfl`/`zerotrunc` from the pscl or
countreg packages.

All methods essentially
proceed in two steps: First, the standard `predict` method for these model
objects is used to compute fitted (in-sample, default)
or predicted (out-of-sample) distribution parameters. Typically, this includes
the mean plus further parameters describing scale, dispersion, shape, etc.
Second, the `distributions` objects are set up using the generator
functions from distributions3.
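The two steps can be sketched by hand for the `lm` case. This is only an illustration under stated assumptions, not the package's actual implementation: the parameters are computed with base R, and the second step assumes that the `Normal()` generator from distributions3 accepts a vector of means together with a single standard deviation. That step is skipped if the package is not installed.

```r
data("cars", package = "datasets")
reg <- lm(dist ~ speed, data = cars)

## Step 1: compute the distribution parameters via predict()
mu <- predict(reg)                        # fitted means (in-sample)
sigma_ml <- sqrt(mean(residuals(reg)^2))  # ML estimate of sigma (divides by n)

## Step 2: set up the distribution objects with a generator function
## (assumption: Normal() recycles the scalar sigma along the vector mu)
if (requireNamespace("distributions3", quietly = TRUE)) {
  manual <- distributions3::Normal(mu = mu, sigma = sigma_ml)
  print(head(manual))
}

## The same parameters reproduce the model's log-likelihood in base R
ll_manual <- sum(dnorm(cars$dist, mean = mu, sd = sigma_ml, log = TRUE))
ll_model <- as.numeric(logLik(reg))
c(manual = ll_manual, model = ll_model)
```

The agreement of the two log-likelihood values reflects that `logLik` implicitly uses the ML estimate of sigma.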

Note that these probability distributions only reflect the random variation in the dependent variable based on the model employed (and its associated distributional assumption for the dependent variable). They do not capture the uncertainty in the parameter estimates.
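To illustrate what is left out, the following base R sketch compares a plug-in 95% interval, which treats the estimated mean and sigma as if they were known, with the classical prediction interval from `predict`, which also accounts for estimation uncertainty. The plug-in interval is computed by hand here and is assumed to mirror what the quantiles of a `prodist` object would give under the default ML sigma. The value `speed = 21` is an arbitrary choice for illustration.

```r
data("cars", package = "datasets")
reg <- lm(dist ~ speed, data = cars)
nd <- data.frame(speed = 21)  # arbitrary illustrative value

## Plug-in interval: normal distribution with estimated mean and ML sigma,
## ignoring that both were estimated
mu <- predict(reg, newdata = nd)
sigma_ml <- sqrt(mean(residuals(reg)^2))
plugin <- mu + qnorm(c(0.025, 0.975)) * sigma_ml

## Classical prediction interval, additionally reflecting parameter uncertainty
classical <- predict(reg, newdata = nd, interval = "prediction", level = 0.95)

plugin_width <- diff(plugin)
classical_width <- classical[, "upr"] - classical[, "lwr"]
c(plugin = plugin_width, classical = classical_width)
```

The plug-in interval is the narrower of the two, because it uses the smaller ML sigma, normal rather than t quantiles, and no standard error of the fitted mean.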

For both linear regression models and generalized linear models, estimated
by `lm` and `glm` respectively, there is some ambiguity as to which
estimate for the dispersion parameter of the model is to be used. While the
`logLik` methods use the maximum-likelihood (ML) estimate
implicitly, the `summary` methods report an estimate that is standardized
with the residual degrees of freedom, n - k (rather than the number of
observations, n). The `prodist` methods for these objects follow
the `logLik` method by default but the `summary` behavior can be
mimicked by setting the `sigma` or `dispersion` arguments
accordingly.
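For the linear model, the two estimates differ only in the divisor applied to the residual sum of squares. The following base R sketch computes both by hand. The variable names are illustrative and not part of the package.

```r
data("cars", package = "datasets")
reg <- lm(dist ~ speed, data = cars)

rss <- sum(residuals(reg)^2)  # residual sum of squares
n <- nobs(reg)                # number of observations
k <- length(coef(reg))        # number of regression coefficients

sigma_ml <- sqrt(rss / n)         # ML estimate (implicit in logLik)
sigma_ols <- sqrt(rss / (n - k))  # estimate reported by summary()

c(ML = sigma_ml, OLS = sigma_ols, summary = summary(reg)$sigma)
```

The ML estimate is always the smaller of the two, by the factor sqrt((n - k) / n).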

An object inheriting from `distribution`.

```
## Model: Linear regression
## Fit: lm
## Data: 1920s cars data
data("cars", package = "datasets")
## Stopping distance (ft) explained by speed (mph)
reg <- lm(dist ~ speed, data = cars)
## Extract fitted normal distributions (in-sample, with constant variance)
pd <- prodist(reg)
head(pd)
## Extract log-likelihood from model object
logLik(reg)
## Replicate log-likelihood via distributions object
sum(log_pdf(pd, cars$dist))
log_likelihood(pd, cars$dist)
## Compute corresponding medians and 90% interval
qd <- quantile(pd, c(0.05, 0.5, 0.95))
head(qd)
## Visualize observations with predicted quantiles
plot(dist ~ speed, data = cars)
matplot(cars$speed, qd, add = TRUE, type = "l", col = 2, lty = 1)
## Sigma estimated by maximum-likelihood estimate (default, used in logLik)
## vs. least-squares estimate (used in summary)
nd <- data.frame(speed = 50)
prodist(reg, newdata = nd, sigma = "ML")
prodist(reg, newdata = nd, sigma = "OLS")
summary(reg)$sigma
## Model: Poisson generalized linear model
## Fit: glm
## Data: FIFA 2018 World Cup data
data("FIFA2018", package = "distributions3")
## Number of goals per team explained by ability differences
poisreg <- glm(goals ~ difference, data = FIFA2018, family = poisson)
summary(poisreg)
## Interpretation: When the ratio of abilities increases by 1 percent,
## the expected number of goals increases by around 0.4 percent
## Predict fitted Poisson distributions for teams with equal ability (out-of-sample)
nd <- data.frame(difference = 0)
prodist(poisreg, newdata = nd)
## Extract fitted Poisson distributions (in-sample)
pd <- prodist(poisreg)
head(pd)
## Extract log-likelihood from model object
logLik(poisreg)
## Replicate log-likelihood via distributions object
sum(log_pdf(pd, FIFA2018$goals))
log_likelihood(pd, FIFA2018$goals)
## Model: Autoregressive integrated moving average model
## Fit: arima
## Data: Quarterly approval ratings of U.S. presidents (1945-1974)
data("presidents", package = "datasets")
## ARMA(2,1) model
arma21 <- arima(presidents, order = c(2, 0, 1))
## Extract predicted normal distributions for next two years
p <- prodist(arma21, n.ahead = 8)
p
## Compute median (= mean) forecast along with 80% and 95% interval
quantile(p, c(0.5, 0.1, 0.9, 0.025, 0.975))
```

[Package *distributions3* version 0.2.1 Index]