pdpEst_mpfr {IADT}    R Documentation

Partial dependence plot with specific numerical precision

Description

Estimates the partial dependence plot (PDP) curve with a specified numerical precision.

Usage

pdpEst_mpfr(
  colInd,
  object,
  predictfun,
  X,
  centering = FALSE,
  outputVector = TRUE,
  newX = NULL,
  nCores = 1,
  precBits = 53 * 2
)

Arguments

colInd

Indices of the covariate columns that define the null hypothesis set s (integer vector).

object

Prediction model object (any class that predictfun can handle).

predictfun

Prediction function to be evaluated (class function). The prediction function needs to be specified with two arguments predictfun(object, X). The argument object is the prediction model and X the data on which the partial dependence functions are evaluated.
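As an illustration of the two-argument form, the following minimal sketch wraps the generic predict() for a linear model. The name predictWrapper, the lm fit and the use of the built-in mtcars data are assumptions made for this example only and are not part of IADT.

```r
# Hypothetical wrapper: adapts stats::predict() to the two-argument
# form predictfun(object, X) described above.
predictWrapper <- function(object, X) {
  as.numeric(predict(object, newdata = X))
}

# Toy linear model on a built-in data set (illustration only)
fit <- lm(mpg ~ wt + hp, data = mtcars)

# One prediction per row of the supplied data
preds <- predictWrapper(fit, mtcars)
```

Any model class should work in the same way, as long as the wrapper returns one numeric prediction per row of X.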

X

Data on which the partial dependence function is evaluated (class matrix or data.frame). The structure of the data depends on the specified argument predictfun.

centering

Should the resulting values be mean-centered? (logical scalar). The default (FALSE) returns the original, uncentered values.

outputVector

Should only the partial dependence function be returned? (logical scalar). Default is TRUE.

newX

Test data set (class "data.frame")

nCores

Number of cores used for parallel computation, based on the R package parallel. The default value of one uses serial processing across observations.

precBits

Numerical precision (in bits) used in the computations that follow the calculation of the predictions from the estimated model. The default is twice the 53 bits usually used in R.
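To make the 53-bit figure concrete, the short base R sketch below reads the significand length of a double from .Machine and shows a small difference that is lost at that precision. It assumes a typical IEEE 754 platform; the variable names are chosen for this illustration and do not come from IADT.

```r
# Number of significand bits of an R double (53 on IEEE 754 platforms)
doubleBits <- .Machine$double.digits

# Default of precBits as given in the usage section: 53 * 2
defaultPrecBits <- doubleBits * 2

# Adding 2^-53 to 1 is rounded away in double precision,
# whereas adding 2^-52 is still representable
lostInDouble <- (1 + 2^-53) == 1
keptInDouble <- (1 + 2^-52) != 1
```

Differences of this size can only be resolved with more significand bits, which is the purpose of raising precBits above 53.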

Value

Vector of the estimated PDP curve values for each sample in X.
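For orientation, a partial dependence estimate in the sense of Friedman and Popescu (2008) can be sketched in plain double precision as follows. This is not the IADT implementation: the function pdpSketch, the simulated data and the lm fit are assumptions for illustration, and the sketch ignores the newX, nCores and precBits arguments.

```r
# Hypothetical brute-force PDP: for each observation i, fix the selected
# covariates at their values in row i for all rows, then average the
# model predictions over the data set.
pdpSketch <- function(colInd, object, predictfun, X, centering = FALSE) {
  pdpVals <- vapply(seq_len(nrow(X)), function(i) {
    Xmod <- X
    for (j in colInd) Xmod[, j] <- X[i, j]
    mean(predictfun(object, Xmod))
  }, numeric(1))
  # Optional mean centering of the resulting curve values
  if (centering) pdpVals <- pdpVals - mean(pdpVals)
  pdpVals
}

# Toy data and model (illustration only)
set.seed(1)
dat <- data.frame(X1 = rnorm(50), X2 = rnorm(50))
dat$y <- dat$X1^2 + rnorm(50, sd = 0.5)
fit <- lm(y ~ X1 + I(X1^2) + X2, data = dat)

predFun <- function(object, X) as.numeric(predict(object, newdata = X))
pdpVals <- pdpSketch(colInd = 1, object = fit, predictfun = predFun, X = dat)
pdpCentered <- pdpSketch(colInd = 1, object = fit, predictfun = predFun,
  X = dat, centering = TRUE)
```

The result has one value per row of the data, matching the shape described above.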

Author(s)

Thomas Welchowski welchow@imbie.meb.uni-bonn.de

References

Friedman JH, Popescu BE (2008). “Predictive learning via rule ensembles.” The Annals of Applied Statistics, 2(3), 916-954.

See Also

testIAD_mpfr

Examples


#####################
# Simulation example

# Simulate covariates from multivariate standard normal distribution
set.seed(-72498)
library(mvnfast)
X <- mvnfast::rmvn(n=1e2, mu=rep(0, 2), sigma=diag(2))

# Response generation
y <- X[, 1]^2 + rnorm(n=1e2, mean=0, sd=0.5)
trainDat <- data.frame(X, y=y)

# Estimate generalized additive model
library(mgcv)
gamFit <- gam(formula=y~s(X1)+s(X2), data=trainDat,
  family=gaussian())

# Estimate PDP function
pdpEst1 <- pdpEst_mpfr(colInd=1, object=gamFit,
  predictfun=function(object, X){
    predict(object=object, newdata=X, type="response")
  }, X=trainDat,
  centering=FALSE, nCores=1, precBits=53*2)

# Convert to standard (double) precision and sort by the first covariate
pdpEst1 <- as.numeric(pdpEst1)
ordInd <- order(X[, 1])
pdpEst1 <- pdpEst1[ordInd]

# Plot: PDP curve vs. true effect
plot(x=X[ordInd, 1], y=pdpEst1, type="l")
lines(x=X[ordInd, 1], y=X[ordInd, 1]^2, lty=2, col="red")
# -> Both curves are similar


[Package IADT version 1.2.1 Index]