predict.randomGLM {randomGLM} | R Documentation |
Prediction from a random generalized linear model predictor
Description
Implements a predict method on a previously-constructed random generalized linear model predictor and new data.
Usage
## S3 method for class 'randomGLM'
predict(object, newdata, type=c("response", "class"),
thresholdClassProb = object$details$thresholdClassProb, ...)
Arguments
object |
a |
newdata |
specification of test data for which to calculate the prediction. |
type |
type of prediction required. Type "response" gives the fitted probabilities for classification, the fitted values for regression. Type "class" applies only to classification, and produces the predicted class labels. |
thresholdClassProb |
the threshold of predictive probabilities to arrive at classification. Takes values between 0 and 1. Only used for binary outcomes. |
... |
other arguments that may be passed to and from methods. Currently unused. |
Details
The function calculates prediction on new test data. It only works if object
contains the regression
models that were used to construct the predictor (see argument keepModels
of the function
randomGLM
).
If the predictor was trained on a multi-class response, the prediction is applied to each of the
representing binary variables (see randomGLM
for details).
Value
For continuous prediction, the predicted values. For classification of binary response, predicted class when
type="class"
; or a two-column matrix giving the class probabilities if type="response"
.
If the predictor was trained on a multi-class response, the returned value is a matrix of "cbind"-ed results
for the representing individual binary variables (see randomGLM
for details).
Author(s)
Lin Song, Steve Horvath and Peter Langfelder.
References
Lin Song, Peter Langfelder, Steve Horvath: Random generalized linear model: a highly accurate and interpretable ensemble predictor. BMC Bioinformatics (2013)
Examples
## binary outcome prediction
# data generation
data(iris)
# Restrict data to first 100 observations
iris=iris[1:100,]
# Turn Species into a factor
iris$Species = as.factor(as.character(iris$Species))
# Select a training and a test subset of the 100 observations
set.seed(1)
indx = sample(100, 67, replace=FALSE)
xyTrain = iris[indx,]
xyTest = iris[-indx,]
xTrain = xyTrain[, -5]
yTrain = xyTrain[, 5]
xTest = xyTest[, -5]
yTest = xyTest[, 5]
# predict with a small number of bags
# - normally nBags should be at least 100.
RGLM = randomGLM(
xTrain, yTrain,
nCandidateCovariates=ncol(xTrain),
nBags=30,
keepModels = TRUE, nThreads = 1)
predicted = predict(RGLM, newdata = xTest, type="class")
table(predicted, yTest)
## continuous outcome prediction
x=matrix(rnorm(100*20),100,20)
y=rnorm(100)
xTrain = x[1:50,]
yTrain = y[1:50]
xTest = x[51:100,]
yTest = y[51:100]
RGLM = randomGLM(
xTrain, yTrain,
classify=FALSE,
nCandidateCovariates=ncol(xTrain),
nBags=10,
keepModels = TRUE, nThreads = 1)
predicted = predict(RGLM, newdata = xTest)