PIT_global {recalibratiNN} | R Documentation |
Obtain the PIT-values of a Model
Description
A function to calculate the Probability Integral Transform (PIT) values for any fitted model that assumes a normal distribution of the output.
Usage
PIT_global(ycal, yhat, mse)
Arguments
ycal |
Numeric vector representing the true observations (y-values) of the response variable from the calibration dataset. |
yhat |
Numeric vector of predicted y-values on the calibration dataset. |
mse |
Mean Squared Error calculated from the calibration dataset. |
Details
This function is designed to work with models that is, even implicitly, assuming normal distribution of the response variable.
This includes, but is not limited to, linear models created using lm()
or neural networks utilizing Mean Squared Error as the loss function.
The OLS method is used to minimized residuals in these models. This mathematical optimization will also yield a probabilistic optimization
when normal distribution of the response variable is assumed, since OLS and maximum likelihood estimation are equivalent under normality.
Therefore, in order to render a probabilistic interpretation of the predictions, the model is intrinsically assuming a normal distribution of the response variable.
Value
Returns a numeric vector of PIT-values.
Examples
n <- 10000
split <- 0.8
# generating heterocedastic data
mu <- function(x1){
10 + 5*x1^2
}
sigma_v <- function(x1){
30*x1
}
x <- runif(n, 1, 10)
y <- rnorm(n, mu(x), sigma_v(x))
x_train <- x[1:(n*split)]
y_train <- y[1:(n*split)]
x_cal <- x[(n*split+1):n]
y_cal <- y[(n*split+1):n]
model <- lm(y_train ~ x_train)
y_hat <- predict(model, newdata=data.frame(x_train=x_cal))
MSE_cal <- mean((y_hat - y_cal)^2)
PIT_global(ycal=y_cal, yhat=y_hat, mse=MSE_cal)