PIC {picR} | R Documentation |
Predictive Information Criteria
Description
PIC is the S3 generic function for computing predictive information criteria (PIC). Depending on the class of the fitted model supplied to object, the function invokes the appropriate method for computing PIC.
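The class-based dispatch described above can be illustrated with a minimal base R sketch. The generic `criterion()` and its methods are hypothetical names introduced only for this illustration; they are not part of picR and do not compute any criterion. The sketch only shows how R selects a method from the class vector of the fitted model.

```r
# Hypothetical generic (NOT picR code): dispatches on the class of `object`
criterion <- function(object, ...) UseMethod("criterion")

# Hypothetical methods that only report which one was selected
criterion.lm      <- function(object, ...) "lm method"
criterion.mlm     <- function(object, ...) "mlm method"
criterion.default <- function(object, ...) {
  stop("no method available for class ", class(object)[1])
}

data(iris)
fit_uni   <- lm(Sepal.Length ~ Sepal.Width, data = iris)
fit_multi <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species, data = iris)

# A multi-response fit has class c("mlm", "lm"), so the mlm method is
# tried first; a single-response fit has class "lm".
res_uni   <- criterion(fit_uni)
res_multi <- criterion(fit_multi)
print(c(univariate = res_uni, multivariate = res_multi))
```

Because dispatch follows the class vector from left to right, a more specific method takes precedence over a more general one whenever both are defined.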
Usage
PIC(object, newdata, ...)
Arguments
object | A fitted model object. |
newdata | An optional dataframe to be used as validation data in computing PIC. If omitted, the training data contained within |
... | Further arguments passed to other methods. |
Details
The PIC are model selection criteria that may be used to select from among predictive models in a candidate set. The model with the minimum criterion value is preferred.
The PIC asymptotically select the candidate model that minimizes the mean squared error of prediction (MSEP), thus behaving similarly to the Akaike Information Criterion (AIC). However, in contrast to the AIC, the PIC do not assume that the validation data are independent of, and identically distributed to, the training data. This enables the PIC to accommodate training/validation data heterogeneity, where the training and validation data may differ from one another in distribution.
Data heterogeneity is arguably the more typical circumstance in practice, especially in applications where a set of covariates is used to model and predict some response. In these regression contexts, one often predicts values of the response at combinations of covariate values that were not necessarily used in training the predictive model.
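To make the idea of training/validation heterogeneity concrete, the following base R sketch simulates a response, fits two candidate models, and compares their empirical MSEP on a validation set drawn like the training data and on one with shifted covariate values. The simulated data, the helper `msep()`, and all object names are assumptions made for illustration. The sketch computes the empirical MSEP directly from known validation responses; it does not compute PIC.

```r
set.seed(1)

# Simulated mean function (an assumption for illustration only)
mean_fun <- function(x) 1 + 2 * x + 1.5 * x^2
simulate <- function(x) data.frame(x = x, y = mean_fun(x) + rnorm(length(x), sd = 0.3))

# Training covariates lie in [0, 1]
train <- simulate(runif(100, 0, 1))

# Validation sets: one distributed like the training data, one shifted
valid_same  <- simulate(runif(100, 0, 1))
valid_shift <- simulate(runif(100, 1.5, 2.5))

# Candidate models fitted to the training data
candidates <- list(
  linear    = lm(y ~ x, data = train),
  quadratic = lm(y ~ x + I(x^2), data = train)
)

# Empirical mean squared error of prediction on a validation set
msep <- function(fit, newdata) {
  mean((newdata$y - predict(fit, newdata = newdata))^2)
}

# Rows: validation set; columns: candidate model
msep_table <- sapply(candidates, function(fit) {
  c(same = msep(fit, valid_same), shifted = msep(fit, valid_shift))
})
print(msep_table)
```

A model's prediction error at covariate values outside the training range can differ substantially from its error on data resembling the training set, which is why the distribution of the validation data matters when ranking candidate models.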
Value
The form of the value returned by PIC depends on the fitted model class and its method-specific arguments. Details may be found in the documentation of each method.
Examples
data(iris)
# Fit a regression model
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
PIC(object = mod,
    newdata = data.frame(Sepal.Width = c(0.25, 1.74, 2.99),
                         Species = factor(c("setosa", "virginica", "virginica"),
                                          levels = c("setosa", "versicolor", "virginica"))))
# Fit a bivariable regression model
mod <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species + Petal.Length, data = iris)
# Note: For multivariable models, response variable columns must be included if
# newdata is specified. If the values of the validation response(s) are
# unknown, specify NA. If partially observed, specify NA only where unknown.
PIC(object = mod,
    newdata = data.frame(Sepal.Length = c(4.1, NA, NA),
                         Sepal.Width = c(NA, NA, 3.2),
                         Petal.Length = c(1.2, 3.5, 7),
                         Species = factor(c("setosa", "virginica", "virginica"),
                                          levels = c("setosa", "versicolor", "virginica"))))
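The note on partially observed validation responses can be explored with base R alone. The sketch below rebuilds the same validation data frame, marks which response values are observed, and obtains predictions with `predict()`. It uses only base R functions and does not call PIC; the object names (`observed`, `pred`, `sq_err`) are chosen for illustration.

```r
data(iris)
mod <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species + Petal.Length, data = iris)

# Validation data with partially observed responses (NA where unknown).
# Taking the factor levels from the training data keeps them consistent.
newdata <- data.frame(
  Sepal.Length = c(4.1, NA, NA),
  Sepal.Width  = c(NA, NA, 3.2),
  Petal.Length = c(1.2, 3.5, 7),
  Species      = factor(c("setosa", "virginica", "virginica"),
                        levels = levels(iris$Species))
)

responses <- c("Sepal.Length", "Sepal.Width")

# TRUE where a validation response is observed
observed <- !is.na(newdata[, responses])

# Predictions need only the covariates, so NA responses are not a problem
pred <- predict(mod, newdata = newdata)

# Squared prediction errors exist only where the response is observed
sq_err <- (as.matrix(newdata[, responses]) - pred)^2
print(observed)
print(sq_err)
```

Entries of `sq_err` remain `NA` wherever the validation response is unknown, so an empirical prediction error can be evaluated only for the observed entries.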