EGI {KrigInv}R Documentation

Efficient Global Inversion: sequential inversion algorithm based on Kriging.

Description

Sequential sampling based on the optimization of a kriging-based criterion, with model update after each iteration. The criterias aim at identifying an excursion set or one/many level sets. At each iteration batchsize = 1 new locations are evaluated. Different criteria are available for selecting experiments. The pointwise criteria are "bichon", "ranjan", "tmse", "tsee" and are fast to compute. In addition, integral criteria require numerical integration and can potentially deliver more than one new location per iteration. Available integral criteria are "imse", "timse", "sur", "jn", "vorob", "vorobCons", "vorobVol". The use of the integral criteria is implemented in the EGIparallel function.

Usage

EGI(T, model, method = NULL, method.param = NULL,
fun, iter, lower, upper, new.noise.var = 0,
optimcontrol = NULL, kmcontrol = NULL, integcontrol = NULL, ...)

Arguments

T

Array containing one or several thresholds. The criteria which can be used with multiple thresholds are "tmse", "timse", "sur", "jn".

model

A Kriging model of km class.

method

Criterion used for choosing observations.

method.param

Optional method parameters. For methods

  • "ranjan", "bichon",
    "tmse" and "timse": the tolerance value (scalar). If not provided, default value is used (1 for ranjan and bichon, 0 for tmse and timse).

  • "vorobev": a list containing penalization (scalar, default=1), type I penalization, and typeEx,(character, default=">") either ">" or "<" denoting the type of excursion.

  • "vorobCons" and "vorobVol": a list containing penalization (scalar, default =1), typeEx (character, default = ">"), consLevel (scalar, default=0.95), n_discrete_design (scalar, default=500*model@d), design (data.frame, default=as.data.frame(sobol (n = method.param$n_discrete_design, dim = model@d)) ), pred (list resulting from predict.km). See also the arguments alpha, pred, design, type from the function conservativeEstimate, package anMC, for more details.

fun

Objective function.

iter

Number of iterations (i.e. number of additional sampling points).

lower

Vector containing the lower bounds of the design space.

upper

Vector containing the upper bounds of the design space.

new.noise.var

Optional scalar value of the noise variance of the new observations.

optimcontrol

Optional list of control parameters for the optimization of the sampling criterion. The field method defines which optimization method is used: it can be either "genoud" (default) for an optimisation using the genoud algorithm, or "discrete" for an optimisation over a specified discrete set. If the field method is set to "genoud", one can set some parameters of this algorithm: pop.size (default : 50*d), max.generations (default : 10*d),
wait.generations (2), BFGSburnin (2) and the mutations P1, P2, up to P9 (see genoud). Numbers into brackets are the default values. If the field method is set to "discrete", one can set the field optim.points: p * d matrix corresponding to the p points where the criterion will be evaluated. If nothing is specified, 100*d points are chosen randomly.

kmcontrol

Optional list representing the control variables for the re-estimation of the kriging model once new points are sampled. The items are the same as in km.

integcontrol

Optional list specifying the procedure to build the integration points and weights, relevant only for the sampling criteria based on numerical integration:
("imse", "timse", "sur" or "jn"). Many options are possible. A) If nothing is specified, 100*d points are chosen using the Sobol sequence. B) One can directly set the field integration.points (a p * d matrix) for prespecified integration points. In this case these integration points and the corresponding vector integration.weights will be used for all the iterations of the algorithm. C) If the field integration.points is not set then the integration points are renewed at each iteration. In that case one can control the number of integration points n.points (default: 100*d) and a specific distribution distrib. Possible values for distrib are: "sobol", "MC", "timse", "imse", "sur" and "jn" (default: "sobol"). C.1) The choice "sobol" corresponds to integration points chosen with the Sobol sequence in dimension d (uniform weight). C.2) The choice "MC" corresponds to points chosen randomly, uniformly on the domain. C.3) The choices "timse", "imse", "sur" and "jn" correspond to importance sampling distributions (unequal weights). It is strongly recommended to use the importance sampling distribution corresponding to the chosen sampling criterion. When important sampling procedures are chosen, n.points points are chosen using importance sampling among a discrete set of n.candidates points (default: n.points*10) which are distributed according to a distribution
init.distrib (default: "sobol"). Possible values for init.distrib are the space filling distributions "sobol" and "MC" or an user defined distribution "spec". The "sobol" and "MC" choices correspond to quasi random and random points in the domain. If the "spec" value is chosen the user must fill in manually the field init.distrib.spec to specify himself a n.candidates * d matrix of points in dimension d.

...

Other arguments of the target function fun.

Details

The function used to build the integration points and weights (based on the options specified in integcontrol) is the function integration_design

Value

A list with components:

par

The added observations (ite * d matrix)

value

The value of the function fun at the added observations (vector of size "ite")

nsteps

The number of added observations (=ite).

lastmodel

The current (last) kriging model of km class.

lastvalue

The value of the criterion at the last added point.

allvalues

If an optimization on a discrete set of points is chosen, the value of the criterion at all these points, for the last iteration.

If method="vorobCons" or method="vorobVol" the list also has components:

current.CE

Conservative estimate computed on lastmodel.

allCE_lvs

The conservative estimate levels computed at each iteration.

Author(s)

Clement Chevalier (University of Neuchatel, Switzerland)

Victor Picheny (INRA, Toulouse, France)

David Ginsbourger (IDIAP Martigny and University of Bern, Switzerland)

Dario Azzimonti (IDSIA, Switzerland)

References

Azzimonti, D., Ginsbourger, D., Chevalier, C., Bect, J., and Richet, Y. (2018). Adaptive design of experiments for conservative estimation of excursion sets. Under revision. Preprint at hal-01379642

Chevalier C., Bect J., Ginsbourger D., Vazquez E., Picheny V., Richet Y. (2014), Fast parallel kriging-based stepwise uncertainty reduction with application to the identification of an excursion set, Technometrics, vol. 56(4), pp 455-465

Picheny V., Ginsbourger D., Roustant O., Haftka R.T., (2010) Adaptive designs of experiments for accurate approximation of a target region, J. Mech. Des. vol. 132(7)

Chevalier C. (2013) Fast uncertainty reduction strategies relying on Gaussian process models Ph.D Thesis, University of Bern

Bichon B.J., Eldred M.S., Swiler L.P., Mahadevan S., McFarland J.M. (2008) Efficient global reliability analysis for nonlinear implicit performance functions, AIAA Journal 46(10), pp 2459-2468

Ranjan P., Bingham D., Michailidis G. (2008) Sequential experiment design for contour estimation from complex computer codes Technometrics 50(4), pp 527-541

See Also

EGIparallel,max_infill_criterion

Examples

#EGI

set.seed(9)
N <- 20 #number of observations
T <- 80 #threshold
testfun <- branin
lower <- c(0,0)
upper <- c(1,1)

#a 20 points initial design
design <- data.frame( matrix(runif(2*N),ncol=2) )
response <- testfun(design)

#km object with matern3_2 covariance
#params estimated by ML from the observations
model <- km(formula=~., design = design,
	response = response,covtype="matern3_2")

optimcontrol <- list(method="genoud",pop.size=50)
integcontrol <- list(distrib="sur",n.points=50)
iter <- 1

## Not run: 
obj1 <- EGI(T=T,model=model,method="sur",fun=testfun,iter=iter,
           lower=lower,upper=upper,optimcontrol=optimcontrol,
           integcontrol=integcontrol)

obj2 <- EGI(T=T,model=model,method="ranjan",fun=testfun,iter=iter,
           lower=lower,upper=upper,optimcontrol=optimcontrol)


par(mfrow=c(1,3))
print_uncertainty_2d(model=model,T=T,main="probability of excursion",
type="pn",new.points=0,cex.points=2)

print_uncertainty_2d(model=obj1$lastmodel,T=T,
main="updated probability of excursion, sur sampling",
type="pn",new.points=iter,col.points.end="red",cex.points=2)

print_uncertainty_2d(model=obj2$lastmodel,T=T,
main="updated probability of excursion, ranjan sampling",
type="pn",new.points=iter,col.points.end="red",cex.points=2)

## End(Not run)
##############
#same example with noisy initial observations and noisy new observations
branin.noise <- function(x) return(branin(x)+rnorm(n=1,sd=30))

set.seed(9)
N <- 20;T <- 80
testfun <- branin.noise
lower <- c(0,0);upper <- c(1,1)

design <- data.frame( matrix(runif(2*N),ncol=2) )
response.noise <- apply(design,1,testfun)
response.noise - response


model.noise <- km(formula=~., design = design, response = response.noise,
covtype="matern3_2",noise.var=rep(30*30,times=N))

optimcontrol <- list(method="genoud",pop.size=50)
integcontrol <- list(distrib="sur",n.points=50)
iter <- 1

## Not run: 
obj1 <- EGI(T=T,model=model.noise,method="sur",fun=testfun,iter=iter,
           lower=lower,upper=upper,optimcontrol=optimcontrol,
           integcontrol=integcontrol,new.noise.var=30*30)

obj2 <- EGI(T=T,model=model.noise,method="ranjan",fun=testfun,iter=iter,
           lower=lower,upper=upper,optimcontrol=optimcontrol,
            new.noise.var=30*30)

par(mfrow=c(1,3))
print_uncertainty_2d(model=model.noise,T=T,
main="probability of excursion, noisy obs.",
type="pn",new.points=0,cex.points=2)

print_uncertainty_2d(model=obj1$lastmodel,T=T,
main="probability of excursion, sur sampling, noisy obs.",
type="pn",new.points=iter,col.points.end="red",cex.points=2)

print_uncertainty_2d(model=obj2$lastmodel,T=T,
main="probability of excursion, ranjan sampling, noisy obs.",
type="pn",new.points=iter,col.points.end="red",cex.points=2)

## End(Not run)


##############
# Conservative estimates with non-noisy initial observations
## Not run: 
  testfun <- branin

  ## Minimize Type II error sampling

  # The list method.param contains all parameters for the
  # conservative estimate and the conservative sequential
  # strategy. Below are parameters for a type II strategy
  # with conservative estimates at 0.95
  method.param = list(penalization=0, # Type II strategy
                      typeEx=">", consLevel = 0.95,
                      n_discrete_design=500*model@d)
  # If the CE for the initial model is already computed
  # it is possible to pass the conservative Vorob'ev quantile
  # level with method.param$consVorbLevel

  obj_T2 <- EGI(T=T,model=model,method="vorobCons",
                fun=testfun,iter=iter,lower=lower,upper=upper,
                optimcontrol=optimcontrol,
                integcontrol=integcontrol,method.param=method.param)

  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model,T=T,main="probability of excursion",
                       type="pn",new.points=0,cex.points=2,consQuantile = obj_T2$allCE_lvs[1])

  print_uncertainty_2d(model=obj_T2$lastmodel,T=T,
                       main="probability of excursion, parallel Type II sampling",
                       type="pn",new.points=iter,col.points.end="red",
                       cex.points=2,consQuantile = obj_T2$allCE_lvs[2])

  ## Maximize conservative estimate's volume
  # Set up method.param
  # Here we pass the conservative level computed
  # in the previous step for the initial model
  method.param = list(typeEx=">", consLevel = 0.95,
                      n_discrete_design=500*model@d,
                      consVorbLevel=obj_T2$allCE_lvs[1]
  )

  obj_consVol <- EGI(T=T,model=model,method="vorobVol",
                     fun=testfun,iter=iter,lower=lower,upper=upper,
                     optimcontrol=optimcontrol,
                     integcontrol=integcontrol,method.param=method.param)

  par(mfrow=c(1,2))
  print_uncertainty_2d(model=model,T=T,main="probability of excursion",
                       type="pn",new.points=0,cex.points=2,consQuantile = obj_consVol$allCE_lvs[1])

  print_uncertainty_2d(model=obj_consVol$lastmodel,T=T,
                       main="probability of excursion, parallel consVol sampling",
                       type="pn",new.points=iter,col.points.end="red",
                       cex.points=2,consQuantile = obj_consVol$allCE_lvs[2])


## End(Not run)


[Package KrigInv version 1.4.2 Index]