fRegress {fda} | R Documentation |
Functional Regression Analysis
Description
This function carries out a functional regression analysis, where
either the dependent variable or one or more independent variables are
functional. Non-functional variables may be used on either side
of the equation. In a simple problem where there is a single scalar
independent covariate with values z_i, i=1,\ldots,N
and a single
functional covariate with values x_i(t)
, the two versions of the
model fit by fRegress
are the scalar dependent variable
model
y_i = \beta_1 z_i + \int x_i(t) \beta_2(t) \, dt + e_i
and the concurrent functional dependent variable model
y_i(t) = \beta_1(t) z_i + \beta_2(t) x_i(t) + e_i(t).
In these models, the final term e_i
or e_i(t)
is a
residual, lack of fit or error term.
In the concurrent functional linear model for a functional dependent
variable, all functional variables are all evaluated at a common
time or argument value t
. That is, the fit is defined in terms of
the behavior of all variables at a fixed time, or in terms of "now"
behavior.
All regression coefficient functions \beta_j(t)
are considered
to be functional. In the case of a scalar dependent variable, the
regression coefficient for a scalar covariate is converted to a
functional variable with a constant basis. All regression
coefficient functions can be forced to be smooth through the
use of roughness penalties, and consequently are specified in the
argument list as functional parameter objects.
Usage
fRegress(y, ...)
## S3 method for class 'fd'
fRegress(y, xfdlist, betalist, wt=NULL,
y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE,
method=c('fRegress', 'model'), sep='.', ...)
## S3 method for class 'double'
fRegress(y, xfdlist, betalist, wt=NULL,
y2cMap=NULL, SigmaE=NULL, returnMatrix=FALSE, ...)
## S3 method for class 'formula'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
y2cMap=NULL, SigmaE=NULL,
method='fRegress', sep='.', ...)
## S3 method for class 'character'
fRegress(y, data=NULL, betalist=NULL, wt=NULL,
y2cMap=NULL, SigmaE=NULL,
method='fRegress', sep='.', ...)
Arguments
y |
the dependent variable object. It may be an object of five possible classes or attributes:
|
data |
an optional |
xfdlist |
a list of length equal to the number of independent variables (including any intercept). Members of this list are the independent variables. They can be objects of either of these two classes:
In either case, the object must have the same number of replications
as the dependent variable object. That is, if it is a scalar, it
must be of the same length as the dependent variable, and if it is
functional, it must have the same number of replications as the
dependent variable. (Only univariate independent variables are
currently allowed in |
betalist |
For the For the |
wt |
weights for weighted least squares |
y2cMap |
the matrix mapping from the vector of observed values to the
coefficients for the dependent variable. This is output by function
|
SigmaE |
Estimate of the covariances among the residuals. This can only be
estimated after a preliminary analysis with |
method |
a character string matching either |
sep |
separator for creating names for multiple variables for
|
returnMatrix |
logical: If TRUE, a two-dimensional is returned using a special class from the Matrix package. |
... |
optional arguments |
Details
Alternative forms of functional regression can be categorized with traditional least squares using the following 2 x 2 table:
explanatory | variable | |||
response | | | scalar | | | function |
| | | | |||
scalar | | | lm | | | fRegress.numeric |
| | | | |||
function | | | fRegress.fd or | | | fRegress.fd or |
| | fRegress.fdPar | | | fRegress.fdPar or linmod | |
For fRegress.numeric
, the numeric response is assumed to be the
sum of integrals of xfd * beta for all functional xfd terms.
fRegress.fd or .fdPar
produces a concurrent regression with
each beta
being also a (univariate) function.
linmod
predicts a functional response from a convolution
integral, estimating a bivariate regression function.
In the computation of regression function estimates in
fRegress
, all independent variables are treated as if they are
functional. If argument xfdlist
contains one or more vectors,
these are converted to functional data objects having the constant
basis with coefficients equal to the elements of the vector.
Needless to say, if all the variables in the model are scalar, do NOT
use this function. Instead, use either lm
or lsfit
.
These functions provide a partial implementation of Ramsay and Silverman (2005, chapters 12-20).
Value
These functions return either a standard fRegress
fit object or
or a model specification:
The fRegress fit object case: |
A list of class
If
If
|
The model specification object case: |
The
|
Author(s)
J. O. Ramsay, Giles Hooker, and Spencer Graves
References
Ramsay, James O., Hooker, Giles, and Graves, Spencer (2009), Functional data analysis with R and Matlab, Springer, New York.
Ramsay, James O., and Silverman, Bernard W. (2005), Functional Data Analysis, 2nd ed., Springer, New York.
Ramsay, James O., and Silverman, Bernard W. (2002), Applied Functional Data Analysis, Springer, New York.
See Also
fRegress.stderr
,
fRegress.CV
,
Fperm.fd
,
Fstat.fd
,
linmod
Examples
oldpar <- par(no.readonly=TRUE)
###
###
### vector response with functional explanatory variable
###
###
# data are in Canadian Weather object
# print the names of the data
print(names(CanadianWeather))
# set up log10 of annual precipitation for 35 weather stations
annualprec <-
log10(apply(CanadianWeather$dailyAv[,,"Precipitation.mm"], 2,sum))
# The simplest 'fRegress' call is singular with more bases
# than observations, so we use only 25 basis functions, for this example
smallbasis <- create.fourier.basis(c(0, 365), 25)
# The covariate is the temperature curve for each station.
tempfd <-
smooth.basis(day.5, CanadianWeather$dailyAv[,,"Temperature.C"], smallbasis)$fd
##
## formula interface: specify the model by a formula, the method
## fRegress.formula automatically sets up the regression coefficient functions,
## a constant function for the intercept,
## and a higher dimensional function
## for the inner product with temperature
##
precip.Temp1 <- fRegress(annualprec ~ tempfd, method="fRegress")
# the output is a list with class name fRegress, display names
names(precip.Temp1)
#[c1] "yvec" "xfdlist" "betalist" "betaestlist" "yhatfdobj"
# [6] "Cmat" "Dmat" "Cmatinv" "wt" "df"
#[11] "GCV" "OCV" "y2cMap" "SigmaE" "betastderrlist"
#[16] "bvar" "c2bMap"
# the vector of fits to the data is object precip.Temp1$yfdPar,
# but since the dependent variable is a vector, so is the fit
annualprec.fit1 <- precip.Temp1$yhatfdobj
# plot the data and the fit
plot(annualprec.fit1, annualprec, type="p", pch="o")
lines(annualprec.fit1, annualprec.fit1, lty=2)
# print root mean squared error
RMSE <- round(sqrt(mean((annualprec-annualprec.fit1)^2)),3)
print(paste("RMSE =",RMSE))
# plot the estimated regression function
plot(precip.Temp1$betaestlist[[2]])
# This isn't helpful either, the coefficient function is too
# complicated to interpret.
# display the number of basis functions used:
print(precip.Temp1$betaestlist[[2]]$fd$basis$nbasis)
# 25 basis functions to fit 35 values, no wonder we over-fit the data
##
## Get the default setup and modify it
## the "model" value of the method argument causes the analysis
## to produce a list vector of arguments for calling the
## fRegress function
##
precip.Temp.mdl1 <- fRegress(annualprec ~ tempfd, method="model")
# First confirm we get the same answer as above by calling
# function fRegress() with these arguments:
precip.Temp.m <- do.call('fRegress', precip.Temp.mdl1)
all.equal(precip.Temp.m, precip.Temp1)
# set up a smaller basis for beta2 than for temperature so that we
# get a more parsimonious fit to the data
nbetabasis2 <- 21 # not much less, but we add some roughness penalization
betabasis2 <- create.fourier.basis(c(0, 365), nbetabasis2)
betafd2 <- fd(rep(0, nbetabasis2), betabasis2)
# add smoothing
betafdPar2 <- fdPar(betafd2, lambda=10)
# replace the regress coefficient function with this fdPar object
precip.Temp.mdl2 <- precip.Temp.mdl1
precip.Temp.mdl2[['betalist']][['tempfd']] <- betafdPar2
# Now do re-fit the data
precip.Temp2 <- do.call('fRegress', precip.Temp.mdl2)
# Compare the two fits:
# degrees of freedom
precip.Temp1[['df']] # 26
precip.Temp2[['df']] # 22
# root-mean-squared errors:
RMSE1 <- round(sqrt(mean(with(precip.Temp1, (yhatfdobj-yvec)^2))),3)
RMSE2 <- round(sqrt(mean(with(precip.Temp2, (yhatfdobj-yvec)^2))),3)
print(c(RMSE1, RMSE2))
# display further results for the more parsimonious model
annualprec.fit2 <- precip.Temp2$yhatfdobj
plot(annualprec.fit2, annualprec, type="p", pch="o")
lines(annualprec.fit2, annualprec.fit2, lty=2)
# plot the estimated regression function
plot(precip.Temp2$betaestlist[[2]])
# now we see that it is primarily the temperatures in the
# early winter that provide the fit to log precipitation by temperature
##
## Manual construction of xfdlist and betalist
##
xfdlist <- list(const=rep(1, 35), tempfd=tempfd)
# The intercept must be constant for a scalar response
betabasis1 <- create.constant.basis(c(0, 365))
betafd1 <- fd(0, betabasis1)
betafdPar1 <- fdPar(betafd1)
betafd2 <- fd(matrix(0,7,1), create.bspline.basis(c(0, 365),7))
# convert to an fdPar object
betafdPar2 <- fdPar(betafd2)
betalist <- list(const=betafdPar1, tempfd=betafdPar2)
precip.Temp3 <- fRegress(annualprec, xfdlist, betalist)
annualprec.fit3 <- precip.Temp3$yhatfdobj
# plot the data and the fit
plot(annualprec.fit3, annualprec, type="p", pch="o")
lines(annualprec.fit3, annualprec.fit3)
plot(precip.Temp3$betaestlist[[2]])
###
###
### functional response with vector explanatory variables
###
###
##
## simplest: formula interface
##
daybasis65 <- create.fourier.basis(rangeval=c(0, 365), nbasis=65,
axes=list('axesIntervals'))
Temp.fd <- with(CanadianWeather, smooth.basisPar(day.5,
dailyAv[,,'Temperature.C'], daybasis65)$fd)
TempRgn.f <- fRegress(Temp.fd ~ region, CanadianWeather)
##
## Get the default setup and possibly modify it
##
TempRgn.mdl <- fRegress(Temp.fd ~ region, CanadianWeather, method='model')
# make desired modifications here
# then run
TempRgn.m <- do.call('fRegress', TempRgn.mdl)
# no change, so match the first run
all.equal(TempRgn.m, TempRgn.f)
##
## More detailed set up
##
region.contrasts <- model.matrix(~factor(CanadianWeather$region))
rgnContr3 <- region.contrasts
dim(rgnContr3) <- c(1, 35, 4)
dimnames(rgnContr3) <- list('', CanadianWeather$place, c('const',
paste('region', c('Atlantic', 'Continental', 'Pacific'), sep='.')) )
const365 <- create.constant.basis(c(0, 365))
region.fd.Atlantic <- fd(matrix(rgnContr3[,,2], 1), const365)
# str(region.fd.Atlantic)
region.fd.Continental <- fd(matrix(rgnContr3[,,3], 1), const365)
region.fd.Pacific <- fd(matrix(rgnContr3[,,4], 1), const365)
region.fdlist <- list(const=rep(1, 35),
region.Atlantic=region.fd.Atlantic,
region.Continental=region.fd.Continental,
region.Pacific=region.fd.Pacific)
# str(TempRgn.mdl$betalist)
###
###
### functional response with functional explanatory variable
###
###
##
## predict knee angle from hip angle;
## from demo('gait', package='fda')
##
## formula interface
##
gaittime <- as.matrix((1:20)/21)
gaitrange <- c(0,20)
gaitbasis <- create.fourier.basis(gaitrange, nbasis=21)
gaitnbasis <- gaitbasis$nbasis
gaitcoef <- matrix(0,gaitnbasis,dim(gait)[2])
harmaccelLfd <- vec2Lfd(c(0, (2*pi/20)^2, 0), rangeval=gaitrange)
gaitfd <- smooth.basisPar(gaittime, gait, gaitbasis,
Lfdobj=harmaccelLfd, lambda=1e-2)$fd
hipfd <- gaitfd[,1]
kneefd <- gaitfd[,2]
knee.hip.f <- fRegress(kneefd ~ hipfd)
##
## manual set-up
##
# set up the list of covariate objects
const <- rep(1, dim(kneefd$coef)[2])
xfdlist <- list(const=const, hipfd=hipfd)
beta0 <- with(kneefd, fd(gaitcoef, gaitbasis, fdnames))
beta1 <- with(hipfd, fd(gaitcoef, gaitbasis, fdnames))
betalist <- list(const=fdPar(beta0), hipfd=fdPar(beta1))
fRegressout <- fRegress(kneefd, xfdlist, betalist)
par(oldpar)