gjamPredict {gjam} | R Documentation |
Predict gjam data
Description
Predicts data from a gjam object, including conditional and out-of-sample prediction.
Usage
gjamPredict(output, newdata = NULL, y2plot = NULL, ylim = NULL,
FULL = FALSE)
Arguments
output |
object of |
newdata |
a |
y2plot |
|
ylim |
vector of lower and upper bounds for prediction plot |
FULL |
will return full chains for predictions as |
Details
If newdata
is not specified, the response is predicted from xdata
as an in-sample prediction. If newdata
is specified, prediction is either conditional or out-of-sample.
Conditional prediction on a new set of y
values is done if newdata
includes the matrix ycondData
, which holds columns to condition on. ycondData
must be a matrix
and have column names matching those in y
that it will replace. ycondData
must have at least one column, but fewer than ncol(y)
columns. Columns not included in ycondData
will be predicted conditionally. Note that conditional prediction can be erratic if the observations on which the prediction is conditioned are unlikely given the model.
Alternatively, the list newdata
can include a new version of xdata
for out-of-sample prediction. The version of xdata
passed in newdata
has the columns with the same names and variable types as xdata
passed to gjam
. Note that factor levels must also match those included when fitting the model. All columns in y
will be predicted out-of-sample.
For count composition data the effort (total count) is 1000.
Because there is no out-of-sample effort for 'CC'
data, values are predicted on the [0, 1] scale.
See examples below.
Value
x |
design matrix. |
sdList |
|
piList |
predictive intervals, only generated if |
prPresent |
|
ematrix |
effort |
ychains |
full prediction chains if |
Author(s)
James S Clark, jimclark@duke.edu
References
Clark, J.S., D. Nemergut, B. Seyednasrollah, P. Turner, and S. Zhang. 2016. Generalized joint attribute modeling for biodiversity analysis: Median-zero, multivariate, multifarious data. Ecological Monographs, 87, 34-56.
See Also
gjamSimData
simulates data
A more detailed vignette is can be obtained with:
browseVignettes('gjam')
web site 'http://sites.nicholas.duke.edu/clarklab/code/'.
Examples
## Not run:
S <- 10
f <- gjamSimData(n = 200, S = S, Q = 3, typeNames = 'CC')
ml <- list(ng = 1500, burnin = 50, typeNames = f$typeNames, holdoutN = 10)
out <- gjam(f$formula, f$xdata, f$ydata, modelList = ml)
# predict data
cols <- c('#018571', '#a6611a')
par(mfrow=c(1,3),bty='n')
gjamPredict(out, y2plot = colnames(f$ydata)) # predict the data in-sample
title('full sample')
# out-of-sample prediction
xdata <- f$xdata[1:20,]
xdata[,3] <- mean(f$xdata[,3]) # mean for x[,3]
xdata[,2] <- seq(-2,2,length=20) # gradient x[,2]
newdata <- list(xdata = xdata, nsim = 50 )
p1 <- gjamPredict( output = out, newdata = newdata)
# plus/minus 1 prediction SE, default effort = 1000
x2 <- p1$x[,2]
ylim <- c(0, max(p1$sdList$yMu[,1] + p1$sdList$yPe[,1]))
plot(x2, p1$sdList$yMu[,1],type='l',lwd=2, ylim=ylim, xlab='x2',
ylab = 'Predicted', col = cols[1])
lines(x2, p1$sdList$yMu[,1] + p1$sdList$yPe[,1], lty=2, col = cols[1])
lines(x2, p1$sdList$yMu[,1] - p1$sdList$yPe[,1], lty=2, col = cols[1])
# upper 0.95 prediction error
lines(x2, p1$piList$yLo[,1], lty=3, col = cols[1])
lines(x2, p1$piList$yHi[,1], lty=3, col = cols[1])
title('SE and prediction, Sp 1')
# conditional prediction, assume first species is absent
ydataCond <- out$inputs$y[,1,drop=FALSE]*0 # set first column to zero
newdata <- list(ydataCond = ydataCond, nsim=50)
p0 <- gjamPredict(output = out, newdata = newdata)
ydataCond <- ydataCond + 10 # first column is 10
newdata <- list(ydataCond = ydataCond, nsim=50)
p1 <- gjamPredict(output = out, newdata = newdata)
s <- 4 # species chosen at random to compare
ylim <- range( p0$sdList$yMu[,s], p1$sdList$yMu[,s] )
plot(out$inputs$y[,s],p0$sdList$yMu[,s], cex=.2, col=cols[1],
xlab = 'Observed', ylab = 'Predicted', ylim = ylim)
abline(0,1,lty=2)
points(out$inputs$y[,s],p1$sdList$yMu[,s], cex=.2, col=cols[2])
title('Cond. on 1st Sp')
legend( 'topleft', c('first species absent', 'first species = 10'),
text.col = cols, bty = 'n')
# conditional, out-of-sample
n <- 1000
S <- 10
f <- gjamSimData(n = n, S = S, Q = 3, typeNames = 'CA')
holdOuts <- sort( sample(n, 50) )
xdata <- f$xdata[-holdOuts,] # fitted data
ydata <- f$ydata[-holdOuts,]
xx <- f$xdata[holdOuts,] # use for prediction
yy <- f$ydata[holdOuts,]
ml <- list(ng = 2000, burnin = 500, typeNames = f$typeNames) # fit the non-holdouts
out <- gjam(f$formula, xdata, ydata, modelList = ml)
cdex <- sample(S, 4) # condition on 4 species
ndex <- c(1:S)[-cdex] # conditionally predict others
newdata <- list(xdata = xx, ydataCond = yy[,cdex], nsim = 200) # conditionally predict out-of-sample
p2 <- gjamPredict(output = out, newdata = newdata)
par(bty='n', mfrow=c(1,1))
plot( as.matrix(yy[,ndex]), p2$sdList$yMu[,ndex],
xlab = 'Observed', ylab = 'Predicted', cex=.3, col = cols[1])
abline(0,1,lty=2)
title('RMSPE')
mspeC <- sqrt( mean( (as.matrix(yy[,ndex]) - p2$sdList$yMu[,ndex])^2 ) )
#predict unconditionally, out-of-sample
newdata <- list(xdata = xx, nsim = 200 )
p1 <- gjamPredict(out, newdata = newdata)
points( as.matrix(yy[,ndex]), p1$sdList$yMu[,ndex], col=cols[2], cex = .3)
mspeU <- sqrt( mean( (as.matrix(yy[,ndex]) - p1$sdList$yMu[,ndex])^2 ) )
e1 <- paste( 'cond, out-of-sample =', round(mspeC, 2) )
e2 <- paste( 'uncond, out-of-sample =', round(mspeU, 2) )
legend('topleft', c(e1, e2), text.col = cols, bty = 'n')
## End(Not run)