
lhs.design {DoE.wrapper}

R Documentation

Functions for accessing latin hypercube sampling designs from package lhs or space-filling designs from package DiceDesign

Description

Functions for comfortably accessing latin hypercube sampling designs from package lhs or space-filling designs from package DiceDesign, which are useful for quantitative factors with many possible levels. In particular, they can be used in computer experiments. Most of the designs are random samples.

Usage

lhs.design(nruns, nfactors, type="optimum", factor.names=NULL, seed=NULL, digits=NULL, 
         nlevels = nruns, default.levels = c(0, 1), randomize = FALSE, ...)
lhs.augment(lhs, m=1, type="optAugment", seed=NULL, ...)

Arguments

`nruns`	number of runs in the latin hypercube sample; for type `fact` (a full factorial with equally spaced levels), if `nlevels` is not separately specified, this number is taken to be the common number of levels of all factors, i.e. the resulting design will have `nruns^nfactors` runs; alternatively, if `nlevels` is separately specified as a vector of different numbers of levels, `nruns` can be missing or can be the correctly specified number of runs.
`nfactors`	number of factors in the latin hypercube sample
`type`	character string indicating the type of design or augmentation method; defaults are “optimum” for `lhs.design` and “optAugment” for `lhs.augment`. Function `lhs.design` calls a function named typeLHS from package lhs (types `genetic`, `improved`, `maximin`, `optimum`, `random`; e.g. `maximinLHS` for type `maximin`), a function named typeDesign from package DiceDesign (types `dmax`, `strauss`, `fact`; e.g. `dmaxDesign` for type `dmax`) or function `runif.faure` from package DiceDesign (type `faure`). Function `lhs.augment` calls function typeLHS from package lhs, where possible choices for `type` are `augment`, `optSeeded`, or `optAugment`. For details, see the respective functions from packages lhs and DiceDesign.
`seed`	seed for random number generation; latin hypercube samples from package lhs are random samples. With early versions of package lhs, specifying a seed made the result reproducible; with recent versions, results are reproducible within a package version, but reproducibility between package versions cannot be guaranteed.
`factor.names`	list of scale end values for each factor; names are used as variable names; the names should not be x1, x2, ..., as this would interfere with usability of standard second order analysis methods on the resulting data (`rsmformula`); if the list is not named, the variable names are X1, X2 and so forth; the original unit cube calculated by package lhs (scale ends 0 and 1 for each variable) is rescaled to the values given in `factor.names`.
`digits`	digits to which the design columns are rounded; one single value (the same for all factors) or a vector of length `nfactors`; note that the rounding is applied after generation of the design on the actual data scale, i.e. the unit cube generated by the functions from packages lhs or DiceDesign is NOT rounded
`nlevels`	used for type `fact` only; integer number or numeric vector of `nfactors` integers; specifies the number of levels for each factor. If all factors have the same number of levels, the number of levels can also be specified through `nruns`, which is interpreted as the number of levels for type `fact` if `nlevels` is not separately specified.
`default.levels`	scale ends for all factors; convenient, if all factors have the same scaling that deviates from the default 0/1 scale ends.
`randomize`	logical; the default `FALSE` prevents randomization. The option has an effect for types `fact` and `faure` only; all other types are based on random design generation anyway. Preventing randomization is the default because these designs are assumed to be used mostly for computer experimentation, where the systematics of the non-randomized design may be beneficial. For hardware experimentation, `randomize` should be set to `TRUE`! If randomization is requested, note that the default behavior of function `sample` changed in R version 3.6.0. If you work in a new (i.e., >= 3.6.0) R version and want to run code interchangeably on R 3.6.0 or later and an earlier R version, you have to change the RNGkind setting in the later R version by `RNGkind(sample.kind="Rounding")` before running function `lhs.design`. It is recommended to change the setting back to the new recommended way afterwards: `RNGkind(sample.kind="default")`. For an example, see the documentation of the example data set `VSGFS`.
`lhs`	design generated by function `lhs.design` (class `design`, of type `lhs`)
`m`	integer number of additional points to add to design `lhs` (note, however, that `optSeeded` does not necessarily preserve all original runs!)
`...`	additional arguments to the functions from packages lhs or DiceDesign; refer to their documentation. Functions for generating designs: `randomLHS`, `geneticLHS`, `improvedLHS`, `maximinLHS`, `optimumLHS`, `dmaxDesign`, `straussDesign`, `runif.faure`, `factDesign`; functions for augmenting lhs designs: `augmentLHS`, `optSeededLHS`, `optAugmentLHS`.
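
For type `fact`, the relation between `nruns`, `nfactors` and `nlevels` described above is plain arithmetic. The base R sketch below only computes the resulting run counts and does not call `lhs.design`; the helper name `fact_runs` is hypothetical and introduced for illustration only.

```r
# Hypothetical helper (not part of DoE.wrapper): number of runs of a
# full factorial, given the number of levels per factor.
fact_runs <- function(nfactors, nlevels) {
  # A single value is taken as the common number of levels of all factors
  if (length(nlevels) == 1) nlevels <- rep(nlevels, nfactors)
  stopifnot(length(nlevels) == nfactors)
  prod(nlevels)
}

runs_common <- fact_runs(4, 7)             # 7^4
runs_mixed  <- fact_runs(4, c(5, 6, 5, 7)) # 5 * 6 * 5 * 7
print(c(runs_common, runs_mixed))
```

Because the run count is a product over all factors, it grows quickly; checking it before requesting a `fact` design is cheap.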

Details

Function lhs.design creates a latin hypercube sample, function lhs.augment augments an existing latin hypercube sample (or in case of type optSeeded takes the existing sample as the starting point but potentially modifies it). In comparison to direct usage of package lhs, the functions add the possibility of recoding lhs samples to a desired range, and they embed the lhs designs into class design.
Range coding is based on the recoding facility from package rsm and the factor.names parameter used analogously to packages DoE.base and FrF2.
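
The recoding step can be pictured as follows. This is a base R sketch that assumes a linear map from the unit interval to the scale ends, followed by rounding on the data scale; it is not the package's internal code (which relies on package rsm), and the matrix `unit` is a hand-made stand-in for a unit-cube design.

```r
# Hand-made stand-in for a unit-cube design (3 runs, 2 factors)
unit <- matrix(c(0.00, 0.50, 1.00,
                 0.25, 0.75, 0.10), ncol = 2)

# Scale ends in the style of the factor.names argument
ends <- list(torque = c(10, 14), temperature = c(-5, 35))

# Assumed linear recoding: low + u * (high - low), then rounding,
# mirroring the order in which digits is described to be applied
recoded <- as.data.frame(mapply(
  function(e, u) round(e[1] + u * (e[2] - e[1]), digits = 2),
  ends, as.data.frame(unit)))
print(recoded)
```

The unit-cube values themselves stay unrounded in this sketch; only the recoded columns are rounded.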

The lhs designs are useful for quantitative factors when design points should be distributed uniformly over a hyperrectangular space. This is of interest, for example, in computer experiments, where replications of the same settings are often useless.
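
What uniform coverage means for a latin hypercube sample can be checked in a few lines of base R. The sketch below uses the textbook random construction (one point per equal-width stratum in every factor); it is an illustration under that assumption and not necessarily the algorithm behind any `type` in package lhs.

```r
set.seed(42)   # arbitrary seed, for a repeatable illustration only
n <- 10        # number of runs
k <- 3         # number of factors

# One random point in each of the n equal-width strata, per factor
u <- sapply(seq_len(k), function(j) (sample(n) - runif(n)) / n)

# Stratum index (1..n) of every point, per factor
strata <- ceiling(u * n)

# Latin hypercube property: each stratum is hit exactly once per factor
one_per_stratum <- all(apply(strata, 2,
                             function(s) all(sort(s) == seq_len(n))))
print(one_per_stratum)
```

Since every stratum of every factor holds exactly one point, no factor setting is replicated, which matches the remark on computer experiments above.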

Supported design types are described in the documentation for packages lhs and DiceDesign.

Value

Both functions return a data frame of S3 class design with attributes attached. The data frame contains the experimental settings as recoded to the scale ends defined in factor.names (if given), rounded to the number of digits given in digits (if given). The matrix attached as attribute desnum contains the experimental factors in the unit cube (all experimental factors ranging from 0 to 1), as returned by packages lhs or DiceDesign.
Function lhs.augment preserves additional variables (e.g. responses) that have been added to the design lhs before augmenting. Note, however, that the response data are NOT used in deciding about which points to augment the design with.

The attribute run.order is not very useful for most of these designs, as there is no standard order. It therefore is present for formal reasons only and contains three identical columns of 1,2,...,nruns. For designs created with type=fact or type=faure, the standard order is the order in which package DiceDesign creates the design, and the actual run order may be different in case of randomization.
In case of lhs.augment, if the design to be augmented had been reordered before, the augmented design preserves this reorder and also the respective numbering of the design.

The attribute design.info is a list of various design properties, with type resolving to “lhs”. In addition to the standard list elements (cf. design), the subtype element indicates the type of latin hypercube design and possibly additional augmentations, the element quantitative is a vector of nfactors logical TRUEs, and the digits element indicates the digits to which the data were rounded.
For designs created with package DiceDesign, special list elements from this package are also added to design.info.
randomize is always TRUE for designs generated by random sampling, but may be FALSE for designs created with type=fact or type=faure.
coding provides formulae for making the designs comfortably usable with standard second order methodology implemented in package rsm. replications is always 1 and repeat.only is always FALSE; these elements are only present to fulfill the formal requirements for class design.
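
The attributes described in this section can be inspected with base R's `attr`. The sketch below is hedged: the helper `design_summary` is hypothetical, and the call to `lhs.design` only runs if package DoE.wrapper is installed.

```r
# Hypothetical helper: collect the attributes described above
design_summary <- function(d) {
  info <- attr(d, "design.info")
  list(type      = info$type,
       subtype   = info$subtype,
       unit_cube = attr(d, "desnum"),
       run_order = attr(d, "run.order"))
}

# Only executed when DoE.wrapper is available
if (require("DoE.wrapper", quietly = TRUE)) {
  plan <- lhs.design(8, 2, "random",
            factor.names = list(torque = c(10, 14), friction = c(25, 35)),
            digits = 2)
  s <- design_summary(plan)
  print(s$type)       # documented above to resolve to "lhs"
  print(s$subtype)    # design type and possible augmentations
  print(head(s$unit_cube))
}
```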

Warning

Since R version 3.6.0, the behavior of function sample has changed (correction of a biased previous behavior that should not be relevant for the randomization of designs). For using code that randomizes a design interchangeably between a new R version (3.6.0 or later) and an older one, please follow the steps described with the argument randomize.
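
A minimal sketch of the steps referred to here, assuming an R version of 3.6.0 or later. The seed value is arbitrary, and the `lhs.design` call only runs if package DoE.wrapper is installed.

```r
# Switch to the pre-3.6.0 sampling behavior; R warns about the
# non-uniform sampler, which is expected here
suppressWarnings(RNGkind(sample.kind = "Rounding"))
kind_during <- RNGkind()[3]

# Randomized design, generated under the old sampling behavior
if (require("DoE.wrapper", quietly = TRUE)) {
  plan <- lhs.design(3, 2, "fact", randomize = TRUE, seed = 2718)
}

# Switch back to the current default afterwards
RNGkind(sample.kind = "default")
kind_after <- RNGkind()[3]
print(c(kind_during, kind_after))
```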

Note also: Package lhs does not promise to keep designs reproducible between package versions. Thus, please make sure to store important designs for the future, if needed (of course, this is always wise anyway!).
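
One possible way to store a design is base R's `saveRDS`, which keeps the attributes of the object. In the sketch below, the data frame `plan` is a hand-made stand-in for a design, and the temporary file path is an assumption to be replaced by a permanent location.

```r
# Stand-in for a design object; with DoE.wrapper this would be the
# result of lhs.design()
plan <- data.frame(torque = c(10.5, 13.2), friction = c(31.0, 26.4))
attr(plan, "note") <- "attributes survive the round trip"

path <- tempfile(fileext = ".rds")   # use a permanent location in practice
saveRDS(plan, path)

# Later, possibly under a different version of package lhs
restored <- readRDS(path)
print(identical(plan, restored))
```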

Note

This package is still under (slow) development. Reports about bugs and inconveniences are welcome.

Author(s)

Ulrike Groemping

References

Beachkofski, B., Grandhi, R. (2002) Improved Distributed Hypercube Sampling. American Institute of Aeronautics and Astronautics Paper 1274.

Currin C., Mitchell T., Morris M. and Ylvisaker D. (1991) Bayesian Prediction of Deterministic Functions With Applications to the Design and Analysis of Computer Experiments, Journal of the American Statistical Association 86, 953–963.

Santner T.J., Williams B.J. and Notz W.I. (2003) The Design and Analysis of Computer Experiments, Springer, 121–161.

Shewry, M. C. and Wynn, H. P. (1987) Maximum entropy sampling. Journal of Applied Statistics 14, 165–170.

Fang K.-T., Li R. and Sudjianto A. (2006) Design and Modeling for Computer Experiments, Chapman & Hall.

Stein, M. (1987) Large Sample Properties of Simulations Using Latin Hypercube Sampling. Technometrics 29, 143–151.

Stocki, R. (2005) A method to improve design reliability using optimal Latin hypercube sampling. Computer Assisted Mechanics and Engineering Sciences 12, 87–105.

Examples

   ## maximin design from package lhs
   plan <- lhs.design(20,7,"maximin",digits=2) 
   plan
   plot(plan)
   cor(plan)
   y <- rnorm(20)
   r.plan <- add.response(plan, y)
   
   ## augmenting the design with 10 additional points, default method
   plan2 <- lhs.augment(plan, m=10)
   plot(plan2)
   cor(plan2)
   
   ## purely random design (usually not ideal)
   plan3 <- lhs.design(20,4,"random",
          factor.names=list(c(15,25), c(10,90), c(0,120), c(12,24)), digits=2)
   plot(plan3)
   cor(plan3)
   
   ## optimum design from package lhs (default)
   plan4 <- lhs.design(20,4,"optimum",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan4)
   cor(plan4)
   
   ## dmax design from package DiceDesign
   ## arguments range and niter_max are required
   ## ?dmaxDesign for more info
   plan5 <- lhs.design(20,4,"dmax",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2,
              range=0.2, niter_max=500)
   plot(plan5)
   cor(plan5)
   
   ## Strauss design from package DiceDesign
   ## argument RND is required
   ## ?straussDesign for more info
   plan6 <- lhs.design(20,4,"strauss",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2,
              RND = 0.2)
   plot(plan6)
   cor(plan6)
   
   ## full factorial design from package DiceDesign
   ## mini try-out version
   plan7 <- lhs.design(3,4,"fact",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan7)
   cor(plan7)
   
   ## Not run: 
   
   ## full factorial design from package DiceDesign
   ## not as many different levels as runs, but only a fixed set of levels
   ##    caution: too many levels can easily bring down the computer
   ##    above design with 7 distinct levels for each factor, 
   ##    implying 2401 runs 
   plan7 <- lhs.design(7,4,"fact",
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan7)
   cor(plan7)
   
   ## equivalent call
   plan7 <- lhs.design(,4,"fact",nlevels=7,
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   
   ## different number of levels for each factor
   plan8 <- lhs.design(,4,"fact",nlevels=c(5,6,5,7),
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   plot(plan8)
   cor(plan8)

   ## equivalent call (specifying nruns, not necessary but a good check)
   plan8 <- lhs.design(1050,4,"fact",nlevels=c(5,6,5,7),
        factor.names=list(torque=c(10,14),friction=c(25,35),
              temperature=c(-5,35),pressure=c(20,50)),digits=2)
   
## End(Not run)

[Package DoE.wrapper version 0.12 Index]