PIC.lm {picR}    R Documentation

PIC method for Linear Models

Description

Computation of predictive information criteria for linear models.

Usage

## S3 method for class 'lm'
PIC(object, newdata, group_sizes = NULL, bootstraps = NULL, ...)

Arguments

object

A fitted model object of class "lm".

newdata

An optional data frame to be used as validation data in computing PIC. If omitted, the training data contained within object are used.

group_sizes

An optional scalar or numeric vector indicating the sizes of newdata partitions. If omitted, newdata is not partitioned. See 'Details'.

bootstraps

An optional numeric value indicating the number of bootstrap samples to use for a bootstrapped PIC. See 'Details'.

...

Further arguments passed to or from other methods.

Details

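PIC.lm computes PIC values based on the supplied regression model. Candidate models with relatively smaller criterion values are preferred. Depending on the value(s) supplied to group_sizes, one of three implementations of PIC is computed: the total PIC (tPIC) when group_sizes is omitted, iPIC when group_sizes is 1 so that each observation forms its own partition, and gPIC when group_sizes defines larger partitions of newdata (see 'Examples').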
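As a rough illustration of how group_sizes might translate into partitions of newdata, the base-R sketch below assigns a group label to each row. The helper partition_labels is hypothetical and not part of picR. It assumes that partitions are consecutive blocks of rows and that a scalar is recycled as a common partition size, as the calls in 'Examples' suggest; the package itself may partition differently.

```r
# Hypothetical helper (not part of picR): one group label per row of newdata.
# Assumption: partitions are consecutive blocks of rows, taken in order.
partition_labels <- function(n_rows, group_sizes = NULL) {
  if (is.null(group_sizes)) {
    # No partitioning: all rows fall into a single group
    return(rep(1L, n_rows))
  }
  if (length(group_sizes) == 1L) {
    # A scalar is read as a common size shared by every partition
    stopifnot(n_rows %% group_sizes == 0)
    group_sizes <- rep(group_sizes, n_rows / group_sizes)
  }
  # The partition sizes must account for every row exactly once
  stopifnot(sum(group_sizes) == n_rows)
  rep(seq_along(group_sizes), times = group_sizes)
}

partition_labels(10, c(5, 3, 2))  # three groups of sizes 5, 3 and 2
partition_labels(10, 5)           # two groups of size 5
partition_labels(10, 1)           # one group per row
```

Under these assumptions, group_sizes = 1 and group_sizes = rep(1, 10) produce the same labels for ten rows, which matches the pairing of those two calls in 'Examples'.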

If a numeric value is supplied to bootstraps, the total predictive information criterion (tPIC) is computed bootstraps times, with each generated bootstrap sample serving as the validation data for one tPIC computation. The resulting tPIC values are then averaged to produce a single bootstrapped tPIC value. Model selection based on this value may favor a more generally applicable predictive model, one whose predictive accuracy is not tuned to a particular set of validation data.
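The resample-and-average scheme described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: validation_score is a stand-in criterion (minus twice the Gaussian log-likelihood of the validation responses), not the PIC formula, and bootstrap_score assumes bootstrap samples are drawn with replacement from the training data. Both function names are hypothetical.

```r
# Stand-in criterion (NOT the PIC formula): -2 * Gaussian log-likelihood of the
# validation responses under the fitted model, using the ML variance estimate.
# Assumes the response enters the model formula untransformed.
validation_score <- function(fit, newdata) {
  response <- all.vars(formula(fit))[1]
  resid_new <- newdata[[response]] - predict(fit, newdata = newdata)
  sigma2 <- mean(residuals(fit)^2)
  -2 * sum(dnorm(resid_new, mean = 0, sd = sqrt(sigma2), log = TRUE))
}

# Hypothetical helper: score the model on each bootstrap sample, then average.
# Assumption: samples are drawn with replacement from the training data.
bootstrap_score <- function(fit, bootstraps) {
  train <- model.frame(fit)
  scores <- vapply(seq_len(bootstraps), function(b) {
    idx <- sample(nrow(train), replace = TRUE)
    validation_score(fit, train[idx, , drop = FALSE])
  }, numeric(1))
  mean(scores)
}

mod <- lm(Sepal.Length ~ ., data = iris)
set.seed(1)
bootstrap_score(mod, bootstraps = 10)
```

Setting the seed before the call makes the averaged value reproducible, which is why the bootstrapped call in 'Examples' is preceded by set.seed.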

For further details, see Flores (2021), A new class of information criteria for improved prediction in the presence of training/validation data heterogeneity.

Value

If group_sizes = NULL or bootstraps > 0, a scalar is returned. Otherwise, newdata is returned with an appended column labeled 'PIC' containing either iPIC or gPIC values, depending on the value provided to group_sizes.
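Because the return type depends on the arguments, downstream code may want to handle both shapes. The sketch below is one hedged way to do so; extract_pic is a hypothetical helper, not part of picR, and the mock objects use made-up numbers purely to stand in for the two return shapes described above.

```r
# Hypothetical helper (not part of picR): return the criterion value(s)
# whether the result is a scalar or a data frame with a 'PIC' column.
extract_pic <- function(result) {
  if (is.data.frame(result)) result[["PIC"]] else as.numeric(result)
}

# Mock results with made-up values, standing in for the two return shapes
mock_scalar  <- 42
mock_grouped <- cbind(iris[1:3, ], PIC = c(1.2, 1.2, 3.4))

extract_pic(mock_scalar)
extract_pic(mock_grouped)
```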

References

Flores, J.E. (2021), A new class of information criteria for improved prediction in the presence of training/validation data heterogeneity [Unpublished PhD dissertation]. University of Iowa.

See Also

PIC, PIC.mlm, lm

Examples

data(iris)

# Fit a regression model
mod <- lm(Sepal.Length ~ ., data = iris)
class(mod)

# Hypothetical validation data
set.seed(1)
vdat <- iris[sample(seq_len(nrow(iris)), 10), ]

# tPIC, newdata not supplied
PIC(object = mod)
AIC(mod) # equivalent to PIC since training and validation data are the same above

# tPIC, newdata supplied
PIC(object = mod, newdata = vdat)
AIC(mod) # not equivalent to PIC since training and validation data differ above

# gPIC
PIC(object = mod, newdata = vdat, group_sizes = c(5,3,2))
PIC(object = mod, newdata = vdat, group_sizes = 5)

# iPIC
PIC(object = mod, newdata = vdat, group_sizes = rep(1, 10))
PIC(object = mod, newdata = vdat, group_sizes = 1)

# bootstrapped tPIC (based on 10 bootstrap samples)
set.seed(1)
PIC(object = mod, bootstraps = 10)


[Package picR version 1.0.0 Index]