rgasp {RobustGaSP} | R Documentation
Setting up the robust GaSP model
Description
Sets up the robust GaSP model and estimates its parameters (if the parameters are not given).
Usage
rgasp(design, response,trend=matrix(1,length(response),1),zero.mean="No",nugget=0,
nugget.est=F,range.par=NA,method='post_mode',prior_choice='ref_approx',a=0.2,
b=1/(length(response))^{1/dim(as.matrix(design))[2]}*(a+dim(as.matrix(design))[2]),
kernel_type='matern_5_2',isotropic=F,R0=NA,
optimization='lbfgs', alpha=rep(1.9,dim(as.matrix(design))[2]),
lower_bound=T,max_eval=max(30,20+5*dim(design)[2]),
initial_values=NA,num_initial_values=2)
Arguments
design
a matrix of inputs.
response
a matrix of outputs.
trend
the mean/trend matrix of the inputs. The default value is a vector of ones.
zero.mean
whether the process has a zero mean. The default value is "No".
nugget
numerical value of the nugget variance ratio. If the nugget is equal to 0, it means there is either no nugget or the nugget is estimated. If the nugget is not equal to 0, it means a fixed nugget. The default value is 0.
nugget.est
a boolean value. If FALSE, the nugget is fixed at the given value; if TRUE, the nugget is estimated. The default value is FALSE.
range.par
either NA or a vector of range parameters. If it is NA, the range parameters are estimated; otherwise the given values are used. The default value is NA.
method
method of parameter estimation. The default value is 'post_mode', meaning the marginal posterior mode is used; 'mle' gives the maximum likelihood estimates.
prior_choice
the choice of prior for the range parameters and noise-variance parameters. The default value is 'ref_approx'.
a
prior parameter in the jointly robust prior. The default value is 0.2.
b
prior parameter in the jointly robust prior. The default value is n^{-1/p}(a+p), where n is the number of runs and p is the dimension of the input.
kernel_type
a vector specifying the type of kernel for each coordinate of the input. The default choice is 'matern_5_2'.
isotropic
a boolean value indicating whether an isotropic kernel is used. The default value is FALSE.
R0
the distance between inputs. If the value is NA, it is computed internally. The default value is NA.
optimization
the method for numerical optimization of the kernel parameters. Currently three methods are implemented: 'lbfgs' (the default), 'nelder-mead' and 'brent'.
alpha
roughness parameters in the power exponential kernel functions. The default value is 1.9 for each coordinate of the input.
lower_bound
a boolean value specifying whether a lower bound of the inverse range parameters is used to constrain the optimization. The default value is TRUE.
max_eval
the maximum number of steps for estimating the range and nugget parameters.
initial_values
a matrix of initial values of the kernel parameters to be optimized numerically, where each row contains a set of the log inverse range parameters and the log nugget parameter.
num_initial_values
the number of initial values of the kernel parameters used in the optimization. The default value is 2.
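As a quick illustration of how several of these arguments combine, the sketch below fits a model to a made-up toy simulator (the function toy_fun is hypothetical, not part of the package) with a fixed nugget, a power exponential kernel, and unconstrained optimization:

```r
library(RobustGaSP)

# hypothetical 2-d toy simulator, for illustration only
toy_fun <- function(x) sin(2*pi*x[1]) + x[2]^2

set.seed(0)
design   <- matrix(runif(30*2), 30, 2)
response <- apply(design, 1, toy_fun)

# fix a small nugget (nugget.est=FALSE keeps it at the given value),
# use a power exponential kernel with roughness 1.9 per coordinate,
# and place no lower-bound constraint on the optimization
fit <- rgasp(design = design, response = response,
             nugget = 1e-4, nugget.est = FALSE,
             kernel_type = 'pow_exp', alpha = rep(1.9, 2),
             lower_bound = FALSE)
```

Setting nugget.est=TRUE instead would ignore the fixed value and estimate the nugget jointly with the range parameters.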
Value
rgasp returns an S4 object of class rgasp (see rgasp-class).
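The estimated parameters are exposed as slots of the returned S4 object; a minimal sketch, assuming the RobustGaSP package is installed:

```r
library(RobustGaSP)

set.seed(1)
input  <- matrix(runif(15), 15, 1)
output <- higdon.1.data(input)   # 1-dim test function shipped with the package

model <- rgasp(design = input, response = output)
class(model)      # an S4 object of class "rgasp"
model@beta_hat    # estimated inverse range parameters
model@theta_hat   # estimated trend parameters
model@nugget      # nugget (0 here, since it is not estimated)
```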
Author(s)
Mengyang Gu [aut, cre], Jesus Palomo [aut], James Berger [aut]
Maintainer: Mengyang Gu <mengyang@pstat.ucsb.edu>
References
M. Gu, X. Wang and J.O. Berger (2018), Robust Gaussian stochastic process emulation, Annals of Statistics, 46(6A), 3038-3066.
M. Gu (2018), Jointly robust prior for Gaussian stochastic process in emulation, calibration and variable selection, arXiv:1804.09329.
M. Gu. (2016). Robust uncertainty quantification and scalable computation for computer models with massive output. Ph.D. thesis. Duke University.
M. Gu. and J.O. Berger (2016). Parallel partial Gaussian process emulation for computer models with massive output. Annals of Applied Statistics, 10(3), 1317-1347.
E.T. Spiller, M.J. Bayarri, J.O. Berger and E.S. Calder and A.K. Patra and E.B. Pitman, and R.L. Wolpert (2014), Automating emulator construction for geophysical hazard maps. SIAM/ASA Journal on Uncertainty Quantification, 2(1), 126-152.
J. Nocedal (1980), Updating quasi-Newton matrices with limited storage, Math. Comput., 35, 773-782.
D. C. Liu and J. Nocedal (1989), On the limited memory BFGS method for large scale optimization, Math. Programming, 45, p. 503-528.
Brent, R. (1973), Algorithms for Minimization without Derivatives. Englewood Cliffs N.J.: Prentice-Hall.
Examples
library(RobustGaSP)
#------------------------
# a 3 dimensional example
#------------------------
# dimension of the inputs
dim_inputs <- 3
# number of design points
num_obs <- 50
# uniform samples of design
input <- matrix(runif(num_obs*dim_inputs), num_obs,dim_inputs)
# the following code uses a maximin Latin hypercube design, which is typically better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) ##maximin lhd sample
####
# outputs from the 3-dim dettepepel.3.data function
output <- matrix(0, num_obs, 1)
for(i in 1:num_obs){
  output[i] <- dettepepel.3.data(input[i,])
}
# use constant mean basis, with no constraint on optimization
# and marginal posterior mode estimation
m1<- rgasp(design = input, response = output, lower_bound=FALSE)
# you can also specify maximum likelihood estimation (MLE)
m2<- rgasp(design = input, response = output, method='mle',lower_bound=FALSE)
##let's do some comparison on prediction
n_testing=1000
testing_input=matrix(runif(n_testing*dim_inputs),n_testing,dim_inputs)
m1_pred=predict(m1,testing_input=testing_input)
m2_pred=predict(m2,testing_input=testing_input)
## hold-out test outputs for comparison
test_output <- matrix(0, n_testing, 1)
for(i in 1:n_testing){
  test_output[i] <- dettepepel.3.data(testing_input[i,])
}
## root mean squared error (RMSE)
sqrt(mean( (m1_pred$mean-test_output)^2))
sqrt(mean( (m2_pred$mean-test_output)^2))
sd(test_output)
#---------------------------------------
# a 1 dimensional example with zero mean
#---------------------------------------
input=10*seq(0,1,1/14)
output<-higdon.1.data(input)
# the following code fits a GaSP with a zero mean by setting zero.mean="Yes"
model<- rgasp(design = input, response = output, zero.mean="Yes")
model
testing_input = as.matrix(seq(0,10,1/100))
model.predict<-predict(model,testing_input)
names(model.predict)
#########plot predictive distribution
testing_output=higdon.1.data(testing_input)
plot(testing_input,model.predict$mean,type='l',col='blue',
xlab='input',ylab='output')
polygon( c(testing_input,rev(testing_input)),c(model.predict$lower95,
rev(model.predict$upper95)),col = "grey80", border = FALSE)
lines(testing_input, testing_output)
lines(testing_input,model.predict$mean,type='l',col='blue')
lines(input, output,type='p')
## mean squared error
mean((model.predict$mean-testing_output)^2)
#-----------------------------------
# a 2 dimensional example with trend
#-----------------------------------
# dimension of the inputs
dim_inputs <- 2
# number of design points
num_obs <- 20
# uniform samples of design
input <-matrix(runif(num_obs*dim_inputs), num_obs,dim_inputs)
# the following code uses a maximin Latin hypercube design, which is typically better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) # maximin lhd sample
# outputs from a 2-dim function
output <- matrix(0, num_obs, 1)
for(i in 1:num_obs){
  output[i] <- limetal.2.data(input[i,])
}
#### trend or mean basis
X <- cbind(rep(1,num_obs), input)
# use an intercept plus linear trend, with no constraint on optimization
m2<- rgasp(design = input, response = output,trend =X, lower_bound=FALSE)
show(m2) # show this rgasp object
m2@beta_hat # estimated inverse range parameters
m2@theta_hat # estimated trend parameters
#--------------------------------------------------------------------------------------
# an 8 dimensional example using only a subset of the inputs, with noise of unknown variance
#--------------------------------------------------------------------------------------
set.seed(1)
# dimension of the inputs
dim_inputs <- 8
# number of design points
num_obs <- 50
# uniform samples of design
input <-matrix(runif(num_obs*dim_inputs), num_obs,dim_inputs)
# the following code uses a maximin Latin hypercube design, which is typically better than uniform sampling
# library(lhs)
# input <- maximinLHS(n=num_obs, k=dim_inputs) # maximin lhd sample
# rescale the design to the domain
input[,1]<-0.05+(0.15-0.05)*input[,1];
input[,2]<-100+(50000-100)*input[,2];
input[,3]<-63070+(115600-63070)*input[,3];
input[,4]<-990+(1110-990)*input[,4];
input[,5]<-63.1+(116-63.1)*input[,5];
input[,6]<-700+(820-700)*input[,6];
input[,7]<-1120+(1680-1120)*input[,7];
input[,8]<-9855+(12045-9855)*input[,8];
# outputs from the 8-dim borehole function
output <- matrix(0, num_obs, 1)
for(i in 1:num_obs){
  output[i] <- borehole(input[i,])
}
# use constant mean basis, estimate the nugget, with no constraint on optimization
m3<- rgasp(design = input[,c(1,4,6,7,8)], response = output,
nugget.est=TRUE, lower_bound=FALSE)
m3@beta_hat # estimated inverse range parameters
m3@nugget # estimated nugget parameter