R: Local polynomial fit.

locpoly {interp}

R Documentation

Local polynomial fit.

Description

This function performs a local polynomial fit of up to order 3 to bivariate data. It returns estimated values of the regression function as well as estimated partial derivatives up to order 3. This access to the partial derivatives was the main intent for writing this code as there already many other local polynomial regression implementations in R.

Usage

locpoly(x, y, z, xo = seq(min(x), max(x), length = nx), yo = seq(min(y),
 max(y), length = ny), nx = 40, ny = 40, input = "points", output = "grid",
 h = 0, kernel = "gaussian", solver = "QR", degree = 3, pd = "")

Arguments

`x`	vector of `x`-coordinates of data points. Missing values are not accepted.
`y`	vector of `y`-coordinates of data points. Missing values are not accepted.
`z`	vector of `z`-values at data points. Missing values are not accepted. `x`, `y`, and `z` must be the same length
`xo`	If `output="grid"` (default): sequence of `x` locations for rectangular output grid, defaults to `nx` points between `min(x)` and `max(x)`. If `output="points"`: vector of `x` locations for output points.
`yo`	If `output="grid"` (default): sequence of `y` locations for rectangular output grid, defaults to `ny` points between `min(y)` and `max(y)`. If `output="points"`: vector of `y` locations for output points. In this case it has to be same length as `xo`.
`input`	text, possible values are `"grid"` (not yet implemented) and `"points"` (default). This is used to distinguish between regular and irregular gridded data.
`output`	text, possible values are `"grid"` (=default) and `"points"`. If `"grid"` is choosen then `xo` and `yo` are interpreted as vectors spanning a rectangular grid of points `(xo[i],yo[j])`, `i=1,...,nx`, `j=1,...,ny`. This default behaviour matches how `akima::interp` works. In the case of `"points"` `xo` and `yo` have to be of same lenght and are taken as possibly irregular spaced output points `(xo[i],yo[i])`, `i=1,...,no` with `no=length(xo)`. `nx` and `ny` are ignored in this case.
`nx`	dimension of output grid in x direction
`ny`	dimension of output grid in y direction
`h`	bandwidth parameter, between 0 and 1. If a scalar is given it is interpreted as ratio applied to the dataset size to determine a local search neighbourhood, if set to 0 a minimum useful search neighbourhood is choosen (e.g. 10 points for a cubic trend function to determine all 10 parameters). If a vector of length 2 is given both components are interpreted as ratio of the `x`- and `y`-range and taken as global bandwidth.
`kernel`	Text value, implemented kernels are `uniform`, `triangle`, `epanechnikov`, `biweight`, `tricube`, `triweight`, `cosine` and `gaussian` (default).
`solver`	Text value, determines used solver in fastLM algorithm used by this code Possible values are `LLt`, `QR` (default), `SVD`, `Eigen` and `CPivQR` (compare `fastLm`).
`degree`	Integer value, degree of polynomial trend, maximum allowed value is 3.
`pd`	Text value, determines which partial derivative should be returned, possible values are `""` (default, the polynomial itself), `"x"`, `"y"`, `"xx"`, `"xy"`, `"yy"`, `"xxx"`, `"xxy"`, `"xyy"`, `"yyy"` or `"all"`.

Value

If pd="all":

`x`	`x` coordinates
`y`	`y` coordinates
`z`	estimates of `z`
`zx`	estimates of `dz/dx`
`zy`	estimates of `dz/dy`
`zxx`	estimates of `d^2z/dx^2`
`zxy`	estimates of `d^2z/dxdy`
`zyy`	estimates of `d^2z/dy^2`
`zxxx`	estimates of `d^3z/dx^3`
`zxxy`	estimates of `d^3z/dx^2dy`
`zxyy`	estimates of `d^3z/dxdy^2`
`zyyy`	estimates of `d^3z/dy^3`

If pd!="all" only the elements x, y and the desired derivative will be returned, e.g. zxy for pd="xy".

Note

Function locpoly of package KernSmooth performs a similar task for univariate data.

Author(s)

Albrecht Gebhardt <albrecht.gebhardt@aau.at>, Roger Bivand <roger.bivand@nhh.no>

References

Douglas Bates, Dirk Eddelbuettel (2013). Fast and Elegant Numerical Linear Algebra Using the RcppEigen Package. Journal of Statistical Software, 52(5), 1-24. URL http://www.jstatsoft.org/v52/i05/.

Examples


## choose a kernel
knl <- "gaussian"

## choose global and local bandwidth 
bwg <- 0.25 # *100% means: percentage of x- y-range used
bwl <- 0.1  # *100% means: percentage of data set (nearest neighbours) used

## a bivariate polynomial of degree 5:
f <- function(x,y) 0.1+ 0.2*x-0.3*y+0.1*x*y+0.3*x^2*y-0.5*y^2*x+y^3*x^2+0.1*y^5

## degree of model
dg=3 

## part 1:
## regular gridded data:
ng<- 11 # x/y size of a square data grid

## build and fill the grid with the theoretical values:

xg<-seq(0,1,length=ng)
yg<-seq(0,1,length=ng)

# xg and yg as matrix matching fg
nx <- length(xg)
ny <- length(yg)
xx <- t(matrix(rep(xg,ny),nx,ny))
yy <- matrix(rep(yg,nx),ny,nx)

fg   <- outer(xg,yg,f)

## local polynomial estimate
## global bw:
ttg <- system.time(pdg <- locpoly(xg,yg,fg,
  input="grid", pd="all", h=c(bwg,bwg), solver="QR", degree=dg, kernel=knl))
## time used:
ttg

## local bw:
ttl <- system.time(pdl <- locpoly(xg,yg,fg,
  input="grid", pd="all", h=bwl, solver="QR", degree=dg, kernel=knl))
## time used:
ttl

image(pdl$x,pdl$y,pdl$z,main="f and its estimated first partial derivatives",
      sub="colors: f, dotted: df/dx, dashed: df/dy")
contour(pdl$x,pdl$y,pdl$zx,add=TRUE,lty="dotted")
contour(pdl$x,pdl$y,pdl$zy,add=TRUE,lty="dashed")
points(xx,yy,pch=".")


## part 2:
## irregular data,
## results will not be as good as with the regular 21*21=231 points.

nd<- 121 # size of data set

## random irregular data
oldseed <- set.seed(42)
x<-runif(ng)
y<-runif(ng)
set.seed(oldseed)

z <- f(x,y)

## global bw:
ttg <- system.time(pdg <- interp::locpoly(x,y,z, xg,yg, pd="all",
  h=c(bwg,bwg), solver="QR", degree=dg,kernel=knl))

ttg

## local bw:
ttl <- system.time(pdl <- interp::locpoly(x,y,z, xg,yg, pd="all",
  h=bwl, solver="QR", degree=dg,kernel=knl))

ttl

image(pdl$x,pdl$y,pdl$z,main="f and its estimated first partial derivatives",
      sub="colors: f, dotted: df/dx, dashed: df/dy")
contour(pdl$x,pdl$y,pdl$zx,add=TRUE,lty="dotted")
contour(pdl$x,pdl$y,pdl$zy,add=TRUE,lty="dashed")
points(x,y,pch=".")

[Package interp version 1.1-6 Index]