gof {berryFunctions}R Documentation

GOF measures


Goodness of Fit measures (GOF) for two vectors.
gofNA: not exported, checks input for each of the functions:
rsquare: Coefficient of determination (R2)
rmse: Root Mean Square Error (for minimizing in optim)
nse: Nash-Sutcliffe efficiency, based on RHydro::eval.NSeff
kge: Kling-Gupta efficiency (better than NSE), based on hydroGOF::KGE, where there are many more options


gofNA(a, b, quiet = FALSE, fun = "")

rsquare(a, b, quiet = FALSE)

rmse(a, b, quiet = FALSE)

nse(a, b, quiet = FALSE)

kge(a, b, quiet = FALSE)



Numerical vector with observational data


Simulated data (to be compared to a)


Should NA-removal warnings be suppressed? This may be helpful within functions. DEFAULT: FALSE


Character string with function name for error and warning messages


Single numerical value


NAs are omitted with warning.


Berry Boessenkool, berry-b@gmx.de, Sept 2016

See Also

cor, lm. https://en.wikipedia.org/wiki/R-squared, https://en.wikipedia.org/wiki/Mean_squared_error


# R squared and RMSE --------------------------------------------------------
x <- rnorm(20)
y <- 2*x + rnorm(20)
legGOF <- function(a,b)
  text(a,b, paste(c("      R2","RMSE","  NSE","  KGE"), collapse="\n"), adj=1.2)
  text(a,b, paste(round(c(rsquare(x,y), rmse(x,y), nse(x,y), kge(x,y)),5),
                  collapse="\n"), adj=0)
legGOF(-1.5, 2) # R2 good, but does not check for bias (distance from 1:1 line)

abline(a=0,b=1) ; textField(-1.5,-1.5, "1:1")
abline(lm(y~x), col="red")
p <- predict(lm(y~x))
points(x, p, pch=3, col="red")
segments(x, y, x, p, col="red")
stopifnot(all.equal(  nse(y,p) , rsquare(y,x)  ))

# Input checks
is.error(   rmse(1:6, 1:8)    ,  tell=TRUE)
nse(replace(x,3,NA), y)
kge(rep(NA,20), y)
rmse(0,0, quiet=TRUE)
rsquare(1:6, tapply(chickwts$weight, chickwts$feed, mean) )

## Not run:  # time consuming Simulation

# sample size bias
x <- 1:1000
y <- x+rnorm(1000)
rmse(x,y) # 0.983
ssize <- rep(5:1000, 3)
sgofs <- sapply(ssize, function(n){i <- sample(1:1000,n); c(rsquare(x[i],y[i]),rmse(x[i],y[i]))})
plot(ssize, sgofs[2,]) # RMSE: no bias, symmetric convergence
plot(ssize, sgofs[1,]) # R2: small underestimation in small samples

if(require(pbapply)) sapply <- pbsapply
r2 <- sapply(1:10000, function(i){
   x <- rnorm(20);  y <- 2*x + rnorm(20);  c(rsquare(x,y), rmse(x,y)) })
hist(r2[1,], breaks=70, col=5,
main= "10'000 times   x <- rnorm(20);  y <- 2*x + rnorm(20);  rsquare(x,y)")
# For small samples, R^2 can by chance be far off the 'real' value!
hist(r2[2,], breaks=70, col=5, main= "... rsquare(x,y)")
# RMSE is more symmetric and gaussian

## End(Not run)

# NSE and KGE ---------------------------------------------------------------

y <- dbeta(1:40/40, 3, 10) # simulated
x <- y + rnorm(40,0,sd=0.2) # observed
lines(y, col="blue")
legGOF(25, 2)
rmse(x,y) ; rmse(y,x)
nse(x,y) ; nse(y,x)  # x=obs, y=sim  (second command is wrong)
kge(x,y) ; kge(y,x)

[Package berryFunctions version 1.20.1 Index]