Newdata {Ecfun} | R Documentation
Create a new data.frame for predict
Description
Generate a new data.frame or matrix from another, with the column(s) selected by x adopting n values in range(data[,x]) and all other columns held constant.
If canbeNumeric(x) is TRUE, the output has x adopting n values in range(x), all other numeric variables at their median, and other variables at their most common values.
If canbeNumeric(x) is FALSE, the output has x adopting all possible values of x, with all other variables at the same constant values as when canbeNumeric(x) is TRUE (and n is ignored). If x has a levels attribute, the possible values are defined by that levels attribute. Otherwise, they are defined by unique(x).
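The rule for non-numeric columns can be sketched in base R as follows. The helper name candidate_values is hypothetical and is not part of Ecfun; the sort() call follows step 3 of the Details section rather than the bare unique(x) mentioned above.

```r
# Hypothetical helper (not part of Ecfun): candidate values for a
# column that cannot be treated as numeric.
candidate_values <- function(v) {
  lv <- levels(v)                 # NULL unless v has a levels attribute
  if (is.null(lv)) sort(unique(v)) else lv
}

# A factor yields its levels, in level order.
f_vals <- candidate_values(factor(c("b", "a", "b"), levels = c("b", "a")))

# A plain character vector yields its sorted unique values.
c_vals <- candidate_values(c("b", "a", "b"))
```

Under these assumptions f_vals is c("b", "a") and c_vals is c("a", "b").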
This is designed to create a new data.frame to be used as newdata for predict.
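To illustrate the intended use, the following sketch builds a comparable grid by hand in base R and passes it to predict. It does not call Newdata itself; the model, the mtcars data set, and the name grid are assumptions chosen only for illustration.

```r
# Fit a simple linear model on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Hand-built analogue of the grid described above: wt takes n = 5 evenly
# spaced values across its observed range, hp is held at its median.
n <- 5
grid <- data.frame(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = n),
  hp = median(mtcars$hp)
)

# predict() evaluates the fitted model at each row of the grid.
pred <- predict(fit, newdata = grid)
```

Because only wt varies across the rows of grid, plotting pred against grid$wt shows the fitted effect of wt with hp held fixed.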
Usage
Newdata(data, x, n, na.rm=TRUE)
Arguments
data |
a data.frame or matrix |
x |
name of a column of data |
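n |
an integer: the number of values for x to adopt in the output. Default is 2 if canbeNumeric(x) is TRUE. If canbeNumeric(x) is FALSE, n is ignored. |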
na.rm |
Details
1. Check data, x.
2. If canbeNumeric(x) is TRUE, let xNew be n values spanning range(x). Else, let xNew <- levels(x).
3. If is.null(xNew), set it to sort(unique(x)).
4. Let newDat <- data[rep(1, n), ], and replace x by xNew.
5. otherVars <- colnames(data)[colnames(data) != x]
6. for(x2 in otherVars) replace newDat[, x2]: If canbeNumeric(x2) is TRUE, use median(x2). Otherwise, use its (first) most common value.
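The steps above can be sketched in base R roughly as follows. This is not the Ecfun source: newdata_sketch, can_be_numeric and most_common are hypothetical names, and can_be_numeric is only a simplified stand-in for canbeNumeric (here it accepts numeric, Date and POSIXct columns). The sketch handles a single column name for x and uses length(xNew) rows so that the non-numeric case also works.

```r
# Simplified stand-in for Ecfun's canbeNumeric(); the real function may differ.
can_be_numeric <- function(v) {
  is.numeric(v) || inherits(v, c("Date", "POSIXct"))
}

# First most common value (in table() order), returned in v's own class.
most_common <- function(v) {
  tab <- table(v)
  top <- names(tab)[which.max(tab)]
  v[match(top, as.character(v))]
}

newdata_sketch <- function(data, x, n = 2) {
  stopifnot(is.data.frame(data), x %in% colnames(data))      # step 1
  xv <- data[[x]]
  if (can_be_numeric(xv)) {                                  # step 2
    r <- range(xv, na.rm = TRUE)
    xNew <- seq(r[1], r[2], length.out = n)
  } else if (is.factor(xv)) {
    # Keep the factor class (and ordering) while enumerating the levels.
    xNew <- factor(levels(xv), levels = levels(xv),
                   ordered = is.ordered(xv))
  } else {
    xNew <- sort(unique(xv))                                 # step 3
  }
  newDat <- data[rep(1, length(xNew)), , drop = FALSE]       # step 4
  rownames(newDat) <- NULL
  newDat[[x]] <- xNew
  for (x2 in setdiff(colnames(data), x)) {                   # steps 5 and 6
    v <- data[[x2]]
    newDat[[x2]] <- if (can_be_numeric(v)) {
      median(v, na.rm = TRUE)
    } else {
      most_common(v)
    }
  }
  newDat
}

# Small illustration on made-up data.
df <- data.frame(x1 = 1:4,
                 sex = factor(c("M", "F", "M", "F")),
                 huh = c("a", "b", "c", "c"),
                 stringsAsFactors = FALSE)
out_num <- newdata_sketch(df, "x1", n = 5)   # x1 varies, others constant
out_fac <- newdata_sketch(df, "sex")         # one row per level of sex
```

Taking the most common value from table() means ties are broken in level (or sorted) order, which is one reading of "(first) most common value"; the package may resolve ties differently.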
Value
A data.frame with n rows and columns matching those of data, as described above.
Author(s)
Spencer Graves
See Also
Examples
##
## 1. A reasonable test with numerics, dates,
## an ordered factor and character variables
##
xDate <- as.Date('2001-02-03')+1:4
tstDF <- data.frame(x1=1:4, xDate=xDate,
    xD2=as.POSIXct(xDate),
    sex=ordered(c('M', 'F', 'M', 'F')),
    huh=letters[c(1:3, 3)], stringsAsFactors=FALSE)
newDat <- Newdata(tstDF, 'xDate', n=5)
# check
newD <- data.frame(x1=2.5,
    xDate=xDate[1]+seq(0, 3, length.out=5),
    xD2=as.POSIXct(xDate[2]+0.5),
    sex=ordered(c('M', 'F', 'M', 'F'))[2],
    huh=letters[3], stringsAsFactors=FALSE)
attr(newD, 'out.attrs') <- attr(newDat, 'out.attrs')
all.equal(newDat, newD)
##
## 2. Test with only one column
##
newDat1 <- Newdata(tstDF[, 2, drop=FALSE], 'xDate', n=5)
# check
newDat1. <- newD[, 2, drop=FALSE]
attr(newDat1., 'out.attrs') <- attr(newDat1, 'out.attrs')
all.equal(newDat1, newDat1.)
##
## 3. Test with a factor
##
newSex <- Newdata(tstDF, 'sex')
# check
newS <- with(tstDF, data.frame(
    x1=2.5, xDate=xDate[1]+1.5,
    xD2=as.POSIXct(xDate[1]+1.5),
    sex=ordered(c('M', 'F'))[2:1],
    huh=letters[3], stringsAsFactors=FALSE) )
attr(newS, 'out.attrs') <- attr(newSex, 'out.attrs')
all.equal(newSex, newS)
##
## 4. Test with an integer column number
##
newDat2 <- Newdata(tstDF, 2, n=5)
# check
all.equal(newDat2, newD)
##
## 5. Test with all
##
NewAll <- Newdata(tstDF)
# check
tstLvls <- as.list(tstDF[c(1, 4), ])
tstLvls$sex <- tstDF$sex[2:1]
tstLvls$huh <- letters[c(3, 1)]
tstLvls$stringsAsFactors <- FALSE
NewA. <- do.call(expand.grid, tstLvls)
attr(NewA., 'out.attrs') <- attr(NewAll, 'out.attrs')
all.equal(NewAll, NewA.)