R: Partial dependence plot with specific numerical precision

pdpEst_mpfr {IADT}

R Documentation

Partial dependence plot with specific numerical precision

Description

Estimates the partial dependence plot (PDP) curve at a specified numerical precision.

Usage

pdpEst_mpfr(
  colInd,
  object,
  predictfun,
  X,
  centering = FALSE,
  outputVector = TRUE,
  newX = NULL,
  nCores = 1,
  precBits = 53 * 2
)

Arguments

`colInd`	Indices of the covariate columns that specify the set s of the null hypothesis (integer vector).
`object`	Prediction model object (any class that predictfun can handle).
`predictfun`	Prediction function to be evaluated (class function). It must take two arguments, predictfun(object, X), where object is the prediction model and X is the data on which the partial dependence function is evaluated.
`X`	Data on which the partial dependence function is evaluated (class matrix or data.frame). The required structure of the data depends on the specified argument predictfun.
`centering`	Should the resulting values be mean-centered? (logical scalar). The default, FALSE, returns the original values.
`outputVector`	Should only the partial dependence function be returned? (logical scalar). Default is TRUE.
`newX`	Test data set (class "data.frame"). Default is NULL.
`nCores`	Number of cores used in the standard parallel computation setup based on the R parallel package. The default value of one uses serial processing across observations.
`precBits`	Numerical precision, in bits, used in the computations that follow the calculation of the predictions from the estimated model. The default, 53 * 2 = 106 bits, is twice the 53 bits of precision of standard double values in R.
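
The effect of a higher precBits value can be illustrated with a small sketch. It assumes the Rmpfr package, which the function name suggests but which this page does not state explicitly; the variable names are purely illustrative.

```r
# Illustrative sketch: assumes the Rmpfr package is installed.
library(Rmpfr)

big <- 2^60

# Standard double precision (53 bits): adding 1 to 2^60 is lost to rounding
diff_double <- (big + 1) - big

# 106-bit precision: the same computation keeps the added 1
big_mpfr <- mpfr(big, precBits = 53 * 2)
diff_mpfr <- (big_mpfr + 1) - big_mpfr

diff_double            # 0
as.numeric(diff_mpfr)  # 1
```

Sums and means over many predictions can accumulate rounding errors of this kind, which is presumably why the precision of the post-prediction steps is configurable.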

Value

Vector of estimated PDP curve values, one for each sample in X.
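
To make the returned quantity concrete, the following base R sketch computes a PDP curve in standard double precision, following the usual definition: for each observation, the covariates in colInd are fixed at that observation's values and the predictions are averaged over all rows of X. The helper pdp_manual is hypothetical and not part of IADT; it ignores centering, parallelization and extended precision, and it is not claimed to mirror the internal implementation of pdpEst_mpfr.

```r
# Hypothetical helper (not part of IADT): plain double-precision PDP values
pdp_manual <- function(colInd, object, predictfun, X) {
  vapply(seq_len(nrow(X)), function(i) {
    Xmod <- X
    # Fix the covariates of interest at the values of observation i
    for (k in colInd) Xmod[, k] <- X[i, k]
    # Average the predictions over the remaining covariates
    mean(predictfun(object, Xmod))
  }, numeric(1))
}

# Toy data and a linear model, used only for illustration
set.seed(1)
dat <- data.frame(X1 = rnorm(50), X2 = rnorm(50))
dat$y <- 2 * dat$X1 + dat$X2 + rnorm(50, sd = 0.1)
fit <- lm(y ~ X1 + X2, data = dat)

pd <- pdp_manual(colInd = 1, object = fit,
                 predictfun = function(object, X) predict(object, newdata = X),
                 X = dat[, c("X1", "X2")])

# For an additive linear model the PDP is linear in X1
expected <- coef(fit)[1] + coef(fit)[2] * dat$X1 + coef(fit)[3] * mean(dat$X2)
all.equal(pd, expected, check.attributes = FALSE)
```

The result has one value per row of X, in the original order of the observations, which matches the shape described above.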

Author(s)

Thomas Welchowski welchow@imbie.meb.uni-bonn.de

References

Friedman JH, Popescu BE (2008). “Predictive learning via rule ensembles.” The Annals of Applied Statistics, 2(3), 916-954.

Examples


#####################
# Simulation example

# Simulate covariates from multivariate standard normal distribution
set.seed(-72498)
library(mvnfast)
X <- mvnfast::rmvn(n=1e2, mu=rep(0, 2), sigma=diag(2))

# Response generation
y <- X[, 1]^2 + rnorm(n=1e2, mean=0, sd=0.5)
trainDat <- data.frame(X, y=y)

# Estimate generalized additive model
library(mgcv)
gamFit <- gam(formula=y~s(X1)+s(X2), data=trainDat,
              family=gaussian())

# Estimate PDP function
pdpEst1 <- pdpEst_mpfr(colInd=1, object=gamFit,
                       predictfun=function(object, X){
                         predict(object=object, newdata=X, type="response")
                       }, X=trainDat,
                       centering=FALSE, nCores=1, precBits=53*2)

# Convert to standard precision and sort by the values of the first covariate
pdpEst1 <- as.numeric(pdpEst1)
ordInd <- order(X[, 1])
pdpEst1 <- pdpEst1[ordInd]

# Plot: PDP curve vs. true effect
plot(x=X[ordInd, 1], y=pdpEst1, type="l")
lines(x=X[ordInd, 1], y=X[ordInd, 1]^2, lty=2, col="red")
# -> Both curves are similar

[Package IADT version 1.2.1 Index]