test.csv {stilt} | R Documentation |
To cross-validate a Gaussian Process emulator
Description
To test an emulator using cross-validation. Any reasonable number of runs can be excluded from the emulator for testing. Then, the emulator is used to predict at the excluded parameter settings, and the results can be plotted. Optionally, prediction standard deviation can also be plotted. The program is reproducible if the seed number is specified.
Usage
test.csv(final.emul, num.test, plot.std, theseed = NULL, test.runind =
NULL, make.plot = TRUE)
Arguments
final.emul |
A standard emulator object. It can be generated by, for example, the 'emulator' function. |
num.test |
Number of test runs to withhold at random from the emulator. Can not be more than half of the ensemble. |
plot.std |
If |
theseed |
Seed for random number generator. Default is |
test.runind |
If not |
make.plot |
If |
Details
The function withholds model runs from the emulator, and then uses the
withheld emulator to predict at the excluded parameter settings. Any
reasonable number of runs up to 1/2 of the ensemble can be excluded. If
make.plot=TRUE
the plot of model output and emulator prediction
at the excluded setting is produced. Setting plot.std=TRUE
adds
the prediction standard deviation to the plot. If plot.std=TRUE
but make.plot=FALSE
a warning is given and nothing is plotted.
The default setting is to
exclude num.test
runs at random. If theseed
is specified
the results become reproducible as the seed passed on to the R random
number
generator is fixed. Optionally, the excluded run indices can be
specified explicitly via the vector test.runind
. If both
theseed
and test.runind
arguments are specified, then
theseed
is ignored.
The function prints out the excluded run indices and outputs some basic guidance. The function handles runs which are at parameter space boundaries by passing NA to the relevant rows of output matrices. If prediction at a particular withheld run gives error (most likely due to the fact that extrapolation is not allowed and the excluded run is at the corner of parameter space) that particular run is not plotted. Model output is beige, and emulator output is brown.
By 'withholding' runs from an emulator it is meant that the emulator
statistical parameters are kept the same, whereas the withheld runs are
eliminated from all relevant emulator components. The ensemble size final.emul$p
is reduced
accordingly.
Value
List with components
$model.out.test |
Model output at test parameter settings [row,col] = [test run index, time index] |
$emul.out.test |
Emulator output at test parameter settings [row,col] = [test run index, time index] |
$emul.std.test |
Emulator prediction standard deviation at test parameter settings [row,col] = [test run index, time index] |
$coverage |
Empirical 95% coverage, expressed as a fraction |
The time vector for columns of all matrix components is
final.emul$t.vec
References
R. Olson and W. Chang (2013): Mathematical framework for a separable
Gaussian Process Emulator. Tech. Rep., available from
www.scrimhub.org/resources/stilt/Olson_and_Chang_2013_Stilt_Emulator_Technical_Report.pdf.
See Also
Examples
# Test the 1D emulator by withholding the second parameter
# setting, and then predicting at the withheld run
data(emul.1D)
test.1D <- test.csv(final.emul=emul.1D, num.test=1, plot.std=FALSE,
test.runind=2, make.plot=FALSE)
# Create a custom plot
plot.default(NA, xlim=range(emul.1D$t.vec), ylim=range(test.1D$model.out.test),
xlab="Year", ylab="Sample Output", main="Emulator Test at Theta=1",
cex.lab=1.3, cex.axis=1.3, cex.main=1.3)
lines(emul.1D$t.vec, test.1D$model.out.test, lwd=4, lty=2, col="green")
# The emulator prediction is close to perfect! Yay!
lines(emul.1D$t.vec, test.1D$emul.out.test, lwd=2, col="yellow")
# Produce a legend
model.max <- max(test.1D$model.out.test)
# We already know that time ranges from 0 to 10 so figuring out x placement
# for the legend is not hard
legend(1, model.max, c("Model Output", "Emulator Predictions"),
col=c("green", "yellow"), lty=c(2,1), lwd=c(4,2), cex=1.3)
# Test SICOPOLIS emulator at three pre-selected parameter settings
# specified by theseed=1234321
data(emul.Sicopolis)
# Plot the standard deviation in all three cases
test.csv(final.emul=emul.Sicopolis, num.test=3, plot.std=TRUE, theseed=1234321,
make.plot=TRUE)