sm.regression {sm}    R Documentation

Nonparametric regression with one or two covariates.

Description

This function creates a nonparametric regression estimate from data consisting of a single response variable and one or two covariates. With two covariates, a perspective ("persp"), image ("image"), contour ("slice") or rgl ("rgl") plot of the estimated regression surface is produced, according to the setting of display. A number of other features of the construction of the estimate, and of its display, can be controlled.

If the rpanel package is available, an interactive panel can be activated to control various features of the plot.

Usage

 sm.regression(x, y, h, design.mat = NA, model = "none", weights = NA,
                 group = NA, ...)

Arguments

x

a vector, or two-column matrix, of covariate values.

y

a vector of responses.

h

a vector of length 1 or 2 giving the smoothing parameter. A normal kernel function is used and h is its standard deviation.

design.mat

the design matrix used to produce y when these are assumed to be the residuals from a linear model.

model

a character variable which defines a reference model. The settings "none", "no effect" and "linear" are all valid. Note that when a group argument is used, model should be set to "equal" or "parallel", as befits an analysis of covariance model.

weights

a vector which allows the kernel functions associated with the observations to take different weights. This is useful, in particular, when different observations have different precisions. The normal usage of this parameter is to associate observations with frequencies; if the weights are not integers, they are converted to integers, but in this case the standard errors and tests which are computed cannot be considered valid. This argument applies only to the case of one covariate. Use of this parameter is incompatible with binning; hence nbins must then be set to 0 or left at its default value NA.

group

a vector of group indicators (numeric or character values) or a factor. If this argument is used then the data are passed to the sm.ancova function. See the details of the model argument above in that case.

...

other optional parameters are passed to the sm.options function, through a mechanism which limits their effect only to this call of the function; those relevant for this function are the following: display, hmult, h.weights, poly.index, band, add, ngrid, eval.points, se, se.breaks, period, xlab, ylab, zlab, hull, panel, panel.plot, lty, col, col.band, col.mesh, col.points, col.palette; see the documentation of sm.options for their description.
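As an illustration of the role of h, the following base R sketch computes a local linear estimate using normal kernel weights whose standard deviation is h. The helper local_linear is hypothetical and written only for this note; it is not the code used inside sm.regression, and it ignores weights, binning and the other options listed above. A local linear fit is assumed here to correspond to poly.index = 1 in sm.options.

```r
# Hypothetical helper: local linear regression with a normal kernel.
# h is the standard deviation of the kernel, as described for the h argument.
local_linear <- function(x, y, h, eval.points) {
  sapply(eval.points, function(x0) {
    w   <- dnorm(x - x0, mean = 0, sd = h)     # kernel weight of each observation
    fit <- lm.wfit(cbind(1, x - x0), y, w)     # weighted least-squares line centred at x0
    unname(fit$coefficients[1])                # intercept = fitted value at x0
  })
}

# Noise-free linear data: a local linear fit reproduces the line exactly.
x   <- seq(0, 1, length.out = 50)
y   <- 2 + 3 * x
est <- local_linear(x, y, h = 0.1, eval.points = c(0.25, 0.5, 0.75))
print(est)
```

Larger values of h spread the weights over more observations and give a smoother estimate; smaller values follow the data more closely.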

Details

When display is set to "persp" or "rgl", a number of graphical options are available. By setting the col parameter to "height" or "se", the surface will be painted with colours to reinforce the perception of height or to indicate the relative sizes of the standard errors, respectively. When model is not "none", the colour coding refers to the number of standard errors which separate the smooth regression surface and the nominated model at each position. The parameter se.breaks, whose default value is c(-3, -2, 2, 3), can then be used to set the colour ranges. In this case, col.palette must be set to a list of colours whose length is one greater than the number of cut-points in se.breaks. If this is not the case, the default colour palette rev(rainbow(length(opt$se.breaks) + 1, start = 0/6, end = 4/6)) is used.
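The relationship between the cut-points and the palette can be sketched in base R. The values in z are made-up standardised differences, and the use of cut to assign bands is an assumption made for illustration; the interval conventions used internally by sm may differ.

```r
se.breaks   <- c(-3, -2, 2, 3)                  # cut-points on the standard error scale
col.palette <- rev(rainbow(length(se.breaks) + 1, start = 0/6, end = 4/6))

# Four cut-points define five bands, so five colours are needed.
stopifnot(length(col.palette) == length(se.breaks) + 1)

# Hypothetical differences between estimate and reference model, in standard errors.
z    <- c(-3.5, -2.4, 0.1, 2.6, 4.0)
band <- cut(z, breaks = c(-Inf, se.breaks, Inf), labels = FALSE)
print(data.frame(z = z, band = band, colour = col.palette[band]))
```

A user-supplied col.palette of any other length would not match the bands, which is why the default palette is substituted in that case.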

If the argument col is not set then surface painting will be determined by the setting of se. If neither is set then colour painting will be activated by default if model != "none". (In this latter case, the argument band, retained from earlier versions for compatibility, will also be examined.)

When display is set to "rgl", some additional parameters can be used to control details of the plot. Transparency can be set by alpha, which lies between 0 and 1. When alpha is set to a vector of length two, the first component refers to the surface panels and the second to the surface mesh. Setting a component of alpha to 0 will remove the corresponding feature from the plot. col.mesh, whose valid values match those of col, controls the colour of the surface mesh. The logical parameter lit has the same meaning as in the rgl package; see material3d.

When panel is set to TRUE, an interactive control panel is created if the rpanel package is available.

If a covariate is on a cyclical scale, this can be incorporated by setting the period argument to a vector (of length 1 or 2) whose components give the values of the periods, or NA if the covariate is not periodic.

See Chapters 3, 4 and 5 of the first reference below for the details of the construction of the estimate and its standard error. The second reference gives further details and examples of surface painting.

Value

a list containing the values of the estimate at the evaluation points, the smoothing parameter and the smoothing parameter weights. If a reference model has been specified and test set to TRUE, then the p-value of the test is also returned. When there is only one covariate, the weights associated with different observations, an estimate of the error standard deviation and the standard error of the estimate are also returned. If a reference model has been specified, this standard error refers to the comparison between the estimate and the reference model, and the values defining the reference model are also returned. If an rgl display is used, then the indices of the surface and lines used to create the display are returned.

Side Effects

a plot on the current graphical device is produced, unless the option display="none" is set.
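A minimal sketch of using the returned list without producing a plot is given below. The data are simulated, and the component names estimate, eval.points and h are those the sm package is believed to use; it is worth checking names(fit) in the installed version. The call is guarded so that the code does nothing if sm is not installed.

```r
if (requireNamespace("sm", quietly = TRUE)) {
  set.seed(1)
  x <- runif(100)
  y <- sin(2 * pi * x) + rnorm(100, sd = 0.3)   # simulated data, for illustration only

  # display = "none" suppresses the plot; only the estimate is computed.
  fit <- sm::sm.regression(x, y, h = 0.1, display = "none")

  print(names(fit))                             # components actually returned
  print(head(cbind(fit$eval.points, fit$estimate)))
}
```

The same call with display left at its default would additionally draw the estimate on the current graphical device.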

References

Bowman, A.W. and Azzalini, A. (1997). Applied Smoothing Techniques for Data Analysis: the Kernel Approach with S-Plus Illustrations. Oxford University Press, Oxford.

Bowman, A.W. (2006). Comparing nonparametric surfaces. Statistical Modelling, 6, 279-299.

See Also

hcv, sm, sm.ancova, sm.binomial, sm.poisson, sm.regression.autocor, sm.survival, sm.options, sm.surface3d

Examples

with(trawl, {
   Zone92   <- (Year == 0 & Zone == 1)
   Position <- cbind(Longitude - 143, Latitude)
   dimnames(Position)[[2]][1] <- "Longitude - 143"

   par(mfrow = c(2, 2))
   sm.regression(Longitude, Score1, method = "aicc", col = "red",
       model = "linear")
   sm.regression(Position[Zone92, ], Score1[Zone92], display = "image", 
       theta = 120)
   sm.regression(Position[Zone92, ], Score1[Zone92], df = 12, col = "se",
       theta = 120)
   sm.regression(Position[Zone92, ], Score1[Zone92], df = 12, col = "se", 
       model = "linear", theta = 120)
   par(mfrow = c(1, 1))
})

# sm.regression(Position[Zone92, 2:1], Score1[Zone92], display = "rgl", df = 12)
# sm.regression(Position[Zone92, 2:1], Score1[Zone92], display = "rgl", df = 12,
#       alpha = c(0.9, 1), col = "se", model = "linear")

# sm.regression(Position[Zone92, 1], Score1[Zone92], panel = TRUE)
# sm.regression(Position[Zone92,  ], Score1[Zone92], panel = TRUE)
# sm.regression(Position[Zone92,  ], Score1[Zone92], panel = TRUE, display = "rgl")


[Package sm version 2.2-6.0 Index]