R: Predictive Information Criteria

PIC {picR}

R Documentation

Predictive Information Criteria

Description

PIC is the S3 generic function for computing predictive information criteria (PIC). Depending on the class of the fitted model supplied to object, the function invokes the appropriate method for computing PIC.

Usage

PIC(object, newdata, ...)

Arguments

`object`	A fitted model object.
`newdata`	An optional data frame to be used as validation data in computing PIC. If omitted, the training data contained within `object` are used.
`...`	Further arguments passed to other methods.

Details

The PIC are model selection criteria that may be used to select from among predictive models in a candidate set. The model with the minimum criterion value is preferred.
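The selection rule above can be sketched in a few lines. The helper `select_min_criterion` is hypothetical and not part of picR. `stats::AIC` stands in for the criterion only so that the sketch runs without picR installed. Passing `PIC` instead (with `newdata` forwarded through `...`) should work in the same way, assuming the method for the model class returns a single numeric value.

```r
# Hypothetical helper (not part of picR): evaluate a criterion on each
# candidate model and report the candidate with the smallest value.
select_min_criterion <- function(models, criterion, ...) {
  values <- vapply(models, function(m) criterion(m, ...), numeric(1))
  list(values = values, best = names(values)[which.min(values)])
}

data(iris)
candidates <- list(
  width_only    = lm(Sepal.Length ~ Sepal.Width, data = iris),
  width_species = lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
)

# stats::AIC is a stand-in criterion so the sketch is self-contained.
res <- select_min_criterion(candidates, stats::AIC)
res$values  # criterion value for each candidate
res$best    # name of the preferred (minimum-criterion) candidate
```

All candidates should be scored against the same validation data so that the criterion values are comparable.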

The PIC asymptotically select the candidate model that minimizes the mean squared error of prediction (MSEP), thus behaving similarly to the Akaike Information Criterion (AIC). However, in contrast to the AIC, the PIC do not assume that the validation data are independent of, and identically distributed to, the training data. This enables the PIC to accommodate training/validation data heterogeneity, where training and validation data may differ from one another in distribution.

Data heterogeneity is arguably the more typical circumstance in practice, especially in applications where a set of covariates is used to model and predict some response. In these regression contexts, one often predicts values of the response at combinations of covariate values not necessarily used in training the predictive model.
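As an illustration of this setting only (it does not compute PIC), the following base R sketch splits `iris` so that the validation covariate values lie entirely outside the training range, then compares the in-sample mean squared error with the empirical mean squared prediction error on the validation rows. The split rule and variable names are assumptions made for this example.

```r
data(iris)

# Split on Sepal.Width so training and validation covariate ranges differ.
cutoff <- median(iris$Sepal.Width)
train  <- iris[iris$Sepal.Width <= cutoff, ]
valid  <- iris[iris$Sepal.Width >  cutoff, ]

range(train$Sepal.Width)  # covariate range seen in training
range(valid$Sepal.Width)  # covariate range to be predicted at

fit  <- lm(Sepal.Length ~ Sepal.Width + Species, data = train)
pred <- predict(fit, newdata = valid)

mse_train  <- mean(residuals(fit)^2)               # in-sample error
msep_valid <- mean((valid$Sepal.Length - pred)^2)  # empirical prediction error
c(train = mse_train, validation = msep_valid)
```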

Value

The form of the value returned by PIC depends on the fitted model class and its method-specific arguments. Details may be found in the documentation of each method.
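Because PIC is an S3 generic, the method that runs (and therefore the form of the result) is chosen from the class of `object`. A minimal sketch of that dispatch mechanism follows, using a hypothetical generic `toy_criterion` that is not part of picR and a residual sum of squares as a placeholder score.

```r
# Hypothetical generic, for illustration only; it is not part of picR.
toy_criterion <- function(object, ...) UseMethod("toy_criterion")

# Method used when class(object) includes "lm".
toy_criterion.lm <- function(object, ...) {
  sum(residuals(object)^2)  # placeholder score: residual sum of squares
}

# Fallback for classes without a dedicated method.
toy_criterion.default <- function(object, ...) {
  stop("no toy_criterion method for class ", class(object)[1])
}

fit <- lm(Sepal.Length ~ Sepal.Width, data = iris)
toy_criterion(fit)  # dispatches to toy_criterion.lm
```

With picR attached, `methods("PIC")` lists the methods that are actually available for the PIC generic.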

Examples

library(picR)
data(iris)

# Fit a regression model
mod <- lm(Sepal.Length ~ Sepal.Width + Species, data = iris)
PIC(object  = mod,
    newdata = data.frame(Sepal.Width = c(0.25, 1.74, 2.99),
                         Species = factor(c("setosa", "virginica", "virginica"),
                                          levels = c("setosa", "versicolor", "virginica"))))

# Fit a bivariable regression model
mod <- lm(cbind(Sepal.Length, Sepal.Width) ~ Species + Petal.Length, data = iris)
# Note: For multivariable models, response variable columns must be included if
#       newdata is specified. If the values of the validation response(s) are
#       unknown, specify NA. If partially observed, specify NA only where unknown.
PIC(object  = mod,
    newdata = data.frame(Sepal.Length = c(4.1, NA, NA),
                         Sepal.Width  = c(NA, NA, 3.2),
                         Petal.Length = c(1.2, 3.5, 7),
                         Species = factor(c("setosa", "virginica", "virginica"),
                                          levels = c("setosa", "versicolor", "virginica"))))

[Package picR version 1.0.0 Index]