validann {validann} | R Documentation |
Validate Artificial Neural Networks.
Description
Compute metrics and statistics for predictive, replicative and/or structural validation of artificial neural networks (ANNs).
Usage
validann(...)
## S3 method for class 'ann'
validann(net, obs = NULL, sim = NULL, x = NULL,
na.rm = TRUE, ...)
## S3 method for class 'nnet'
validann(net, obs = NULL, sim = NULL, x = NULL,
na.rm = TRUE, ...)
## Default S3 method:
validann(obs, sim, wts = NULL, nodes = NULL,
na.rm = TRUE, ...)
Arguments
net |
an object of class ‘ann’ (as returned by function ann) or ‘nnet’ (as
returned by nnet). |
obs, sim |
vectors comprising observed (obs) and simulated (sim) data. |
x |
matrix, data frame or vector of input data used for
fitting net. |
na.rm |
logical; should missing values (including NaN) be removed from calculations? Default = TRUE. |
wts |
vector of ANN weights used to compute input
‘relative importance’ measures if net is not supplied. |
nodes |
vector indicating the number of nodes in each layer
of the ANN model. This vector should have 3 elements: nodes in input
layer, nodes in hidden layer (can be 0), and nodes in output layer.
If |
... |
arguments to be passed to different validann methods, see specific formulations for details. |
Details
To compute all possible validation metrics and statistics, net must be
supplied and must be of class ‘ann’ (as returned by ann) or ‘nnet’ (as
returned by nnet). However, a partial derivative (PaD) sensitivity
analysis (useful for structural validation) will only be carried out if
net is of class ‘ann’.
If obs and sim data are supplied in addition to net, validation metrics
are computed based on these. Otherwise, metrics and statistics are
computed based on obs and sim datasets derived from the net object (i.e.
the data used to fit net and the fitted values). As such, both obs and
sim must be supplied if validation is to be based either on data not used
for training or on unprocessed training data (if training data were
preprocessed). If only one of obs and sim is specified, both will be
derived from net if it is supplied (and a warning will be given). The
same happens if obs and sim are of different lengths.
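The preprocessing case above can be sketched as follows. This is only an illustration under stated assumptions: the data are simulated, `sim_scaled` is a stand-in for fitted values from a model trained on a standardised response (it is not real ANN output), and the names `y_raw`, `y_scaled` and `sim_scaled` are hypothetical. The point is that obs and sim are brought back to the original scale before being passed to validann.

```r
# Hypothetical data: a response that was standardised before training.
set.seed(1)
y_raw  <- rnorm(50, mean = 10, sd = 3)
y_mean <- mean(y_raw)
y_sd   <- sd(y_raw)
y_scaled <- (y_raw - y_mean) / y_sd

# Stand-in for fitted values on the standardised scale (not real ANN output).
sim_scaled <- y_scaled + rnorm(50, sd = 0.1)

# Back-transform so that obs and sim are both on the original scale.
obs <- y_raw
sim <- sim_scaled * y_sd + y_mean

# obs and sim must have the same length to be used together.
stopifnot(length(obs) == length(sim))

# Only attempt the call if the validann package is installed.
if (requireNamespace("validann", quietly = TRUE)) {
  results <- validann::validann(obs = obs, sim = sim)
}
```

If a fitted net object is available, it could be passed as the first argument together with obs and sim, as in the Examples section.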
If net is not supplied, both obs and sim are required. This may be
necessary if validating an ANN model not built using either the nnet or
ann functions. In this case, both wts and nodes are also required if any
structural validation metrics are to be returned. If an ANN model has
K input nodes, J hidden nodes and a single output O, with a bias node for
both the hidden and output layers, the wts vector must be ordered as
follows:
c(Wi1h1,Wi1h2,...,Wi1hJ,Wi2h1,...,Wi2hJ,...,WiKh1,...,WiKhJ,Wi0h1,...,Wi0hJ,
  Wh1O,...,WhJO,Wh0O)
where Wikhj is the weight between the kth input and the jth hidden node
and WhjO is the weight between the jth hidden node and the output. The
bias weight on the jth hidden layer node is labelled Wi0hj while the bias
weight on the output is labelled Wh0O. The wts vector assumes the network
is fully connected; however, missing connections may be substituted by
zero weights. Skip-layer connections are not allowed.
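As a minimal sketch of the ordering described above, the wts vector can be assembled in base R from separate weight objects. The weight values and the names `W_ih`, `b_h`, `w_ho` and `b_o` are hypothetical and chosen only so that the position of each weight is easy to read off.

```r
# Hypothetical network: K = 3 inputs, J = 2 hidden nodes, 1 output.
K <- 3
J <- 2

# Input-to-hidden weights: row k = input k, column j = hidden node j.
W_ih <- matrix(c(0.11, 0.12,
                 0.21, 0.22,
                 0.31, 0.32), nrow = K, ncol = J, byrow = TRUE)
b_h  <- c(0.01, 0.02)   # hidden-layer bias weights (Wi0h1, Wi0h2)
w_ho <- c(0.50, -0.50)  # hidden-to-output weights (Wh1O, Wh2O)
b_o  <- 0.05            # output bias weight (Wh0O)

# Transposing before flattening lists all hidden nodes for input 1 first,
# then input 2, and so on, followed by the biases and output weights.
wts   <- c(as.vector(t(W_ih)), b_h, w_ho, b_o)
nodes <- c(K, J, 1)

# A fully connected network of this shape has K*J + 2*J + 1 weights.
stopifnot(length(wts) == K * J + 2 * J + 1)
```

A connection that is absent from the network would be represented by a 0 in the corresponding position rather than by dropping the element.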
Value
A list object of class ‘validann’, with components dependent on the
arguments passed to the validann function:
metrics |
a data frame consisting of metrics: AME, PDIFF, MAE, ME, RMSE, R4MS4E, AIC, BIC, NSC, RAE, PEP, MARE, MdAPE, MRE, MSRE, RVE, RSqr, IoAd, CE, PI, MSLE, MSDE, IRMSE, VE, KGE, SSE and R. See Dawson et al. (2007) for definitions. |
obs_stats |
a data frame consisting of summary statistics about the
obs dataset. |
sim_stats |
a data frame consisting of summary statistics about the
sim dataset. |
residuals |
a 1-column matrix of model residuals ( |
resid_stats |
a data frame consisting of summary statistics about the
model residuals. |
ri |
a data frame consisting of ‘relative importance’ values for each
input. Only returned if net, or both wts and nodes, are supplied (see ‘Details’). Depending on the arguments supplied, the methods used may include: Garson's (Garson); connection weight (CW); Profile sensitivity analysis (Profile); and partial derivative sensitivity analysis (PaD), the last being available only if net is of class ‘ann’. See Gevrey et al. (2003), Olden et al. (2004) and Kingston et al. (2006) for details of the relative importance methods. |
y_hat |
a matrix of dimension The response values returned in |
as |
a matrix of dimension The values in |
rs |
a matrix of dimension To compute the values in |
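To give a feel for the metrics component, a few of the listed measures can be sketched in base R using their usual textbook definitions. This is an assumption-laden illustration, not validann's own code: the package's implementation may differ in detail, and Dawson et al. (2007) remains the reference for the exact definitions. The obs and sim vectors are the ones used in the Examples section.

```r
obs <- c(0.257, -0.891, -1.710, -0.575, -1.668, 0.851, -0.350, -1.313,
         -2.469, 0.486)
sim <- c(-1.463, 0.027, -2.053, -1.091, -1.602, 2.018, 0.723, -0.776,
         -2.351, 1.054)

err <- obs - sim

mae  <- mean(abs(err))                        # mean absolute error
rmse <- sqrt(mean(err^2))                     # root mean squared error
sse  <- sum(err^2)                            # sum of squared errors
r    <- cor(obs, sim)                         # Pearson correlation
ce   <- 1 - sse / sum((obs - mean(obs))^2)    # coefficient of efficiency

round(c(MAE = mae, RMSE = rmse, SSE = sse, R = r, CE = ce), 3)
```

All five measures here are unaffected by whether residuals are taken as obs - sim or sim - obs, which is why they were chosen for the sketch.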
Methods (by class)
- ann: Compute validation metrics when net is of class ‘ann’.
- nnet: Compute validation metrics when net is of class ‘nnet’.
- default: Useful for predictive validation only or when the ANN model has not been developed using either ann or nnet. Limited structural validation metrics may be computed, and only if wts and nodes are supplied.
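The way R selects among these three methods can be illustrated with a toy S3 generic. Everything below is hypothetical: `describe_validation` is not part of validann, and the empty objects only carry a class attribute so that dispatch can be observed. The returned strings paraphrase what the Details section says each class makes possible.

```r
# Hypothetical generic mirroring the three validann methods.
describe_validation <- function(net, ...) UseMethod("describe_validation")

describe_validation.ann <- function(net, ...) {
  "all metrics, including PaD sensitivity analysis"
}
describe_validation.nnet <- function(net, ...) {
  "all metrics except PaD sensitivity analysis"
}
describe_validation.default <- function(net, ...) {
  "predictive metrics; limited structural metrics if wts and nodes are given"
}

# Empty stand-ins that only carry the class used for dispatch.
fake_ann  <- structure(list(), class = "ann")
fake_nnet <- structure(list(), class = "nnet")

describe_validation(fake_ann)      # dispatches to the 'ann' method
describe_validation(fake_nnet)     # dispatches to the 'nnet' method
describe_validation(c(0.1, 0.2))   # no matching class: default method
```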
References
Dawson, C.W., Abrahart, R.J., See, L.M., 2007. HydroTest: A web-based toolbox of evaluation metrics for the standardised assessment of hydrological forecasts. Environmental Modelling & Software, 22(7), 1034-1052. http://dx.doi.org/10.1016/j.envsoft.2006.06.008.
Olden, J.D., Joy, M.K., Death, R.G., 2004. An accurate comparison of methods for quantifying variable importance in artificial neural networks using simulated data. Ecological Modelling 178, 389-397. http://dx.doi.org/10.1016/j.ecolmodel.2004.03.013.
Gevrey, M., Dimopoulos, I., Lek, S., 2003. Review and comparison of methods to study the contribution of variables in artificial neural network models. Ecological Modelling 160, 249-264. http://dx.doi.org/10.1016/S0304-3800(02)00257-0.
Kingston, G.B., Maier, H.R., Lambert, M.F., 2006. Forecasting cyanobacteria with Bayesian and deterministic artificial neural networks, in: IJCNN '06. International Joint Conference on Neural Networks, 2006., IEEE. pp. 4870-4877. http://dx.doi.org/10.1109/ijcnn.2006.247166.
Mount, N.J., Dawson, C.W., Abrahart, R.J., 2013. Legitimising data-driven models: exemplification of a new data-driven mechanistic modelling framework. Hydrology and Earth System Sciences 17, 2827-2843. http://dx.doi.org/10.5194/hess-17-2827-2013.
See Also
ann, plot.validann, predict.ann
Examples
# get validation results for 1-hidden node `ann' model fitted to ar9 data
# based on training data.
# ---
data("ar9")
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
fit <- ann(x, y, size = 1, act_hid = "tanh", act_out = "linear", rang = 0.1)
results <- validann(fit, x = x)
# get validation results for above model based on a new sample of ar9 data.
# ---
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- x[, c(1,4,9)]
obs <- y
sim <- predict(fit, newdata = x)
results <- validann(fit, obs = obs, sim = sim, x = x)
# get validation results for `obs' and `sim' data without ANN model.
# In this example `sim' is generated using a linear model. No structural
# validation of the model is possible, but `wts' are provided to compute the
# number of model parameters needed for the calculation of certain
# goodness-of-fit metrics.
# ---
samp <- sample(1:1000, 200)
y <- ar9[samp, ncol(ar9)]
x <- ar9[samp, -ncol(ar9)]
x <- as.matrix(x[, c(1,4,9)])
lmfit <- lm.fit(x, y)
sim <- lmfit$fitted.values
obs <- y
results <- validann(obs = obs, sim = sim, wts = lmfit$coefficients)
# validann would be called in the same way if the ANN model used to generate
# `sim' was not available or was not of class `ann' or `nnet'. Ideally in
# this case, however, both `wts' and `nodes' should be supplied such that
# some structural validation metrics may be computed.
# ---
obs <- c(0.257, -0.891, -1.710, -0.575, -1.668, 0.851, -0.350, -1.313,
-2.469, 0.486)
sim <- c(-1.463, 0.027, -2.053, -1.091, -1.602, 2.018, 0.723, -0.776,
-2.351, 1.054)
wts <- c(-0.05217, 0.08363, 0.07840, -0.00753, -7.35675, -0.00066)
nodes <- c(3, 1, 1)
results <- validann(obs = obs, sim = sim, wts = wts, nodes = nodes)