validate.rpart {rms} | R Documentation |
Dxy and Mean Squared Error by Cross-validating a Tree Sequence
Description
Uses xval-fold cross-validation of a sequence of trees to derive
estimates of the mean squared error and Somers' Dxy rank correlation
between predicted and observed responses. In the case of a binary response
variable, the mean squared error is the Brier accuracy score. For
survival trees, Dxy is negated so that larger is better.
There are print and plot methods for objects created by validate.rpart.
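To make the two accuracy measures concrete, the following base R sketch computes a Brier score and Somers' Dxy for a binary response by hand. The helper names `brier` and `dxy_binary` and the toy data are hypothetical and are not part of rms; the Dxy calculation assumes the usual pairwise definition (concordant minus discordant pairs, divided by the number of pairs with differing outcomes) and is not taken from the package's own implementation.

```r
# Hypothetical helpers for illustration only; not functions from rms.

# Brier score: mean squared difference between predicted probability and 0/1 outcome
brier <- function(p, y) mean((p - y)^2)

# Somers' Dxy for a binary outcome, assuming the pairwise definition:
# (concordant - discordant) / (number of event vs. non-event pairs)
dxy_binary <- function(p, y) {
  p1 <- p[y == 1]               # predictions for observations with y = 1
  p0 <- p[y == 0]               # predictions for observations with y = 0
  d  <- outer(p1, p0, "-")      # one difference per (event, non-event) pair
  (sum(d > 0) - sum(d < 0)) / length(d)   # tied pairs count as neither
}

# Toy data (made up)
y <- c(0, 0, 1, 1)
p <- c(0.1, 0.4, 0.35, 0.8)

brier(p, y)        # mean squared error of the predicted probabilities
dxy_binary(p, y)   # rank agreement between predictions and outcomes
```

In this toy example three of the four event/non-event pairs are concordant and one is discordant, so the rank correlation is (3 - 1) / 4.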
Usage
# f <- rpart(formula=y ~ x1 + x2 + ...) # or rpart
## S3 method for class 'rpart'
validate(fit, method, B, bw, rule, type, sls, aics,
force, estimates, pr=TRUE,
k, rand, xval=10, FUN, ...)
## S3 method for class 'validate.rpart'
print(x, ...)
## S3 method for class 'validate.rpart'
plot(x, what=c("mse","dxy"), legendloc=locator, ...)
Arguments
fit |
an object created by |
method, B, bw, rule, type, sls, aics, force, estimates |
are there only for consistency with the generic |
x |
the result of |
k |
a sequence of cost/complexity values. By default these are obtained
from calling |
rand |
a random sample (usually omitted) |
xval |
number of splits |
FUN |
the name of a function which produces a sequence of trees, such
|
... |
additional arguments to |
pr |
set to |
what |
a vector of things to plot. By default, 2 plots will be done, one for
|
legendloc |
a function that is evaluated with a single argument equal to |
Value
a list of class "validate.rpart" with components named k, size, dxy.app,
dxy.val, mse.app, mse.val, binary, xval. size is the number of nodes,
dxy refers to Somers' D, mse refers to mean squared error of prediction,
app means apparent accuracy on training samples, val means validated
accuracy on test samples, binary is a logical variable indicating whether
or not the response variable was binary (a logical or 0/1 variable is
binary). size will not be present if the user specifies k.
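The distinction between apparent (app) and validated (val) accuracy can be sketched with rpart alone. The code below is a minimal illustration under stated assumptions, not the implementation used by validate.rpart: the simulated data, the cost-complexity values in `cps`, the fold assignment, and the choice of 5 folds are all arbitrary, and the tree is fitted with `method = "anova"` so that predictions for the 0/1 response are proportions.

```r
library(rpart)

# Simulated binary response (arbitrary illustration data)
set.seed(1)
n  <- 200
x1 <- runif(n)
x2 <- runif(n)
y  <- 1 * (x1 + x2 + rnorm(n) > 1)
d  <- data.frame(y, x1, x2)

xval <- 5                                   # number of folds (assumed)
cps  <- c(0.1, 0.02, 0.005)                 # arbitrary cost-complexity values
fold <- sample(rep(seq_len(xval), length.out = n))

# Apparent MSE: prune one tree grown on all data, predict the same data
full    <- rpart(y ~ x1 + x2, data = d, method = "anova", cp = 0.001)
mse.app <- sapply(cps, function(cp)
  mean((predict(prune(full, cp = cp), d) - d$y)^2))

# Validated MSE: grow and prune on the training folds, predict the held-out fold
mse.val <- sapply(cps, function(cp) {
  pred <- numeric(n)
  for (j in seq_len(xval)) {
    test <- fold == j
    fit  <- rpart(y ~ x1 + x2, data = d[!test, ], method = "anova", cp = 0.001)
    pred[test] <- predict(prune(fit, cp = cp), d[test, ])
  }
  mean((pred - d$y)^2)
})

res <- data.frame(k = cps, mse.app, mse.val)
res
```

Because the pruned trees are nested, the apparent error cannot increase as the cost-complexity value shrinks, whereas the validated error may rise again once the tree starts to overfit.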
Side Effects
prints if pr=TRUE
Author(s)
Frank Harrell
Department of Biostatistics
Vanderbilt University
fh@fharrell.com
See Also
rpart, somers2, dxy.cens, locator, legend
Examples
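## Not run:
# Simulate a binary response that depends on x1 and x2 but not x3
n <- 100
set.seed(1)
x1 <- runif(n)
x2 <- runif(n)
x3 <- runif(n)
y <- 1*(x1+x2+rnorm(n) > 1)
table(y)
require(rms)    # provides validate and its method for rpart fits
require(rpart)
f <- rpart(y ~ x1 + x2 + x3, model=TRUE)
v <- validate(f)
v # note the poor validation
# Plot mse and dxy side by side
par(mfrow=c(1,2))
plot(v, legendloc=c(.2,.5))
par(mfrow=c(1,1))
## End(Not run)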