predict.rgasp {RobustGaSP} | R Documentation |
Prediction for Robust GaSP model
Description
Function to make predictions using a robust GaSP model after the model has been constructed.
Usage
## S4 method for signature 'rgasp'
predict(object,testing_input,testing_trend= matrix(1,dim(testing_input)[1],1),
r0=NA,interval_data=T,
outasS3 = T,...)
Arguments
object |
an object of class |
testing_input |
a matrix containing the inputs where the |
testing_trend |
a matrix of mean/trend for prediction. |
r0 |
the distance between input and testing input. If the value is |
interval_data |
a boolean value. If |
outasS3 |
a boolean parameter indicating whether the output of the function should be as an |
... |
Extra arguments to be passed to the function (not implemented yet). |
Value
If the parameter outasS3=F
, then the returned value is an S4 object
of class predrgasp-class
with
call:
the call to the predict.rgasp function where the returned object has been created.
mean:
predictive mean for the testing inputs.
lower95
:lower bound of the 95% posterior credible interval.
upper95
:upper bound of the 95% posterior credible interval.
sd:
standard deviation at each testing input.
If the parameter outasS3=T
, then the returned value is a list
with
mean |
predictive mean for the testing inputs. |
lower95 |
lower bound of the 95% posterior credible interval. |
upper95 |
upper bound of the 95% posterior credible interval. |
sd |
standard deviation at each testing input. |
Author(s)
Mengyang Gu [aut, cre], Jesus Palomo [aut], James Berger [aut]
Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>
References
M. Gu. (2016). Robust Uncertainty Quantification and Scalable Computation for Computer Models with Massive Output. Ph.D. thesis. Duke University.
M. Gu. and J.O. Berger (2016). Parallel partial Gaussian process emulation for computer models with massive output. Annals of Applied Statistics, 10(3), 1317-1347.
M. Gu, X. Wang and J.O. Berger (2018), Robust Gaussian Stochastic Process Emulation, Annals of Statistics, 46(6A), 3038-3066.
M. Gu (2018), Jointly Robust Prior for Gaussian Stochastic Process in Emulation, Calibration and Variable Selection, arXiv:1804.09329.
Examples
#------------------------
# a 3 dimensional example
#------------------------
# dimension of the inputs
dim_inputs <- 3
# number of the inputs
num_obs <- 30
# uniform samples of design
input <- matrix(runif(num_obs*dim_inputs), num_obs, dim_inputs)
# The following code uses a maximin Latin Hypercube Design, which is typically
# better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) ##maximin lhd sample
# outputs from the 3 dim dettepepel.3.data function
output <- matrix(0, num_obs, 1)
for(i in seq_len(num_obs)){
  output[i] <- dettepepel.3.data(input[i,])
}
# use constant mean basis, with no constraint on optimization
m1 <- rgasp(design = input, response = output, lower_bound=FALSE)
# the following use constraints on optimization
# m1 <- rgasp(design = input, response = output, lower_bound=TRUE)
# the following use a single start on optimization
# m1 <- rgasp(design = input, response = output, lower_bound=FALSE)
# number of points to be predicted
num_testing_input <- 5000
# generate points to be predicted
testing_input <- matrix(runif(num_testing_input*dim_inputs), num_testing_input, dim_inputs)
# Perform prediction
m1.predict <- predict(m1, testing_input)
# Predictive mean
# m1.predict$mean
# The following tests how good the prediction is
testing_output <- matrix(0, num_testing_input, 1)
for(i in seq_len(num_testing_input)){
  testing_output[i] <- dettepepel.3.data(testing_input[i,])
}
# compute the MSE, average coverage and average length
# out of sample MSE
MSE_emulator <- sum((m1.predict$mean-testing_output)^2)/num_testing_input
# proportion covered by 95% posterior predictive credible interval
prop_emulator <- length(which((m1.predict$lower95<=testing_output)
                              &(m1.predict$upper95>=testing_output)))/num_testing_input
# average length of posterior predictive credible interval
length_emulator <- sum(m1.predict$upper95-m1.predict$lower95)/num_testing_input
# output of prediction
MSE_emulator
prop_emulator
length_emulator
# normalized RMSE
sqrt(MSE_emulator/mean((testing_output-mean(output))^2))
#-----------------------------------
# a 2 dimensional example with trend
#-----------------------------------
# dimension of the inputs
dim_inputs <- 2
# number of the inputs
num_obs <- 20
# uniform samples of design
input <-matrix(runif(num_obs*dim_inputs), num_obs,dim_inputs)
# The following code uses a maximin Latin Hypercube Design, which is typically
# better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) ##maximin lhd sample
# outputs from the 2 dim Lim et al. (2002) function
output <- matrix(0,num_obs,1)
for(i in 1:num_obs){
output[i]<-limetal.2.data (input[i,])
}
# mean basis (trend): an intercept column plus the two input columns
X<-cbind(rep(1,num_obs), input )
# use a linear mean basis (trend), with no constraint on optimization
m2<- rgasp(design = input, response = output,trend =X, lower_bound=FALSE)
# number of points to be predicted
num_testing_input <- 5000
# generate points to be predicted
testing_input <- matrix(runif(num_testing_input*dim_inputs),num_testing_input,dim_inputs)
# trend (mean basis) evaluated at the testing inputs
testing_X<-cbind(rep(1,num_testing_input), testing_input )
# Perform prediction
m2.predict<-predict(m2, testing_input,testing_trend=testing_X)
# Predictive mean
#m2.predict$mean
# The following tests how good the prediction is
testing_output <- matrix(0,num_testing_input,1)
for(i in 1:num_testing_input){
testing_output[i]<-limetal.2.data(testing_input[i,])
}
# compute the MSE, average coverage and average length
# out of sample MSE
MSE_emulator <- sum((m2.predict$mean-testing_output)^2)/(num_testing_input)
# proportion covered by 95% posterior predictive credible interval
prop_emulator <- length(which((m2.predict$lower95<=testing_output)
&(m2.predict$upper95>=testing_output)))/num_testing_input
# average length of posterior predictive credible interval
length_emulator <- sum(m2.predict$upper95-m2.predict$lower95)/num_testing_input
# output of prediction
MSE_emulator
prop_emulator
length_emulator
# normalized RMSE
sqrt(MSE_emulator/mean((testing_output-mean(output))^2 ))
### here try the isotropic kernel (a function of Euclidean distance)
m2_isotropic <- rgasp(design = input, response = output, trend = X,
                      lower_bound = FALSE, isotropic = TRUE)
m2_isotropic.predict <- predict(m2_isotropic, testing_input,
                                testing_trend = testing_X)
# compute the MSE, average coverage and average length
# out of sample MSE
MSE_emulator_isotropic <-
  sum((m2_isotropic.predict$mean - testing_output)^2) / num_testing_input
# proportion covered by the 95% posterior predictive credible interval
# (summing a logical vector counts its TRUE entries)
prop_emulator_isotropic <-
  sum(m2_isotropic.predict$lower95 <= testing_output &
        m2_isotropic.predict$upper95 >= testing_output) / num_testing_input
# average length of the posterior predictive credible interval
length_emulator_isotropic <-
  sum(m2_isotropic.predict$upper95 -
        m2_isotropic.predict$lower95) / num_testing_input
MSE_emulator_isotropic
prop_emulator_isotropic
length_emulator_isotropic
## the result of the isotropic kernel is not as good as the product kernel
## for this example
#--------------------------------------------------------------------------------------
# an 8 dimensional example using only a subset inputs and a noise with unknown variance
#--------------------------------------------------------------------------------------
set.seed(1)
# dimension of the inputs
dim_inputs <- 8
# number of the inputs
num_obs <- 50
# uniform samples of design
input <- matrix(runif(num_obs*dim_inputs), num_obs, dim_inputs)
# The following code uses a maximin Latin Hypercube Design, which is typically
# better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) # maximin lhd sample
# rescale the design to the domain of the Borehole function
input[,1]<-0.05+(0.15-0.05)*input[,1];
input[,2]<-100+(50000-100)*input[,2];
input[,3]<-63070+(115600-63070)*input[,3];
input[,4]<-990+(1110-990)*input[,4];
input[,5]<-63.1+(116-63.1)*input[,5];
input[,6]<-700+(820-700)*input[,6];
input[,7]<-1120+(1680-1120)*input[,7];
input[,8]<-9855+(12045-9855)*input[,8];
# outputs from the 8 dim Borehole function
output <- matrix(0, num_obs, 1)
for(i in seq_len(num_obs)){
  output[i] <- borehole(input[i,])
}
# use constant mean basis on a subset of inputs, estimate the nugget,
# with no constraint on optimization
m3 <- rgasp(design = input[,c(1,4,6,7,8)], response = output,
            nugget.est=TRUE, lower_bound=FALSE)
# number of points to be predicted
num_testing_input <- 5000
# generate points to be predicted
testing_input <- matrix(runif(num_testing_input*dim_inputs),num_testing_input,dim_inputs)
# rescale the points to the region to be predicted
testing_input[,1]<-0.05+(0.15-0.05)*testing_input[,1];
testing_input[,2]<-100+(50000-100)*testing_input[,2];
testing_input[,3]<-63070+(115600-63070)*testing_input[,3];
testing_input[,4]<-990+(1110-990)*testing_input[,4];
testing_input[,5]<-63.1+(116-63.1)*testing_input[,5];
testing_input[,6]<-700+(820-700)*testing_input[,6];
testing_input[,7]<-1120+(1680-1120)*testing_input[,7];
testing_input[,8]<-9855+(12045-9855)*testing_input[,8];
# Perform prediction on the same subset of inputs used to build the model
m3.predict <- predict(m3, testing_input[,c(1,4,6,7,8)])
# Predictive mean
#m3.predict$mean
# The following tests how good the prediction is
testing_output <- matrix(0, num_testing_input, 1)
for(i in seq_len(num_testing_input)){
  testing_output[i] <- borehole(testing_input[i,])
}
# compute the MSE, average coverage and average length
# out of sample MSE
MSE_emulator <- sum((m3.predict$mean-testing_output)^2)/num_testing_input
# proportion covered by 95% posterior predictive credible interval
prop_emulator <- length(which((m3.predict$lower95<=testing_output)
                              &(m3.predict$upper95>=testing_output)))/num_testing_input
# average length of posterior predictive credible interval
length_emulator <- sum(m3.predict$upper95-m3.predict$lower95)/num_testing_input
# output of sample prediction
MSE_emulator
prop_emulator
length_emulator
# normalized RMSE
sqrt(MSE_emulator/mean((testing_output-mean(output))^2))