rhohat.lpp {spatstat.linnet}R Documentation

Nonparametric Estimate of Intensity as Function of a Covariate

Description

Computes a nonparametric estimate of the intensity of a point process on a linear network, as a function of a (continuous) spatial covariate.

Usage

## S3 method for class 'lpp'
rhohat(object, covariate, ...,
       weights=NULL,
       method=c("ratio", "reweight", "transform"),
       horvitz=FALSE,
       smoother=c("kernel", "local", "decreasing", "increasing",
                   "mountain", "valley", "piecewise"),
       subset=NULL,
       do.CI=TRUE, 
       jitter=TRUE, jitterfactor=1, interpolate=TRUE,    
       nd=1000, eps=NULL, random=TRUE, 
       n = 512, bw = "nrd0", adjust=1, from = NULL, to = NULL,
       bwref=bw,
       covname, confidence=0.95, positiveCI, breaks=NULL)

## S3 method for class 'lppm'
rhohat(object, covariate, ...,
       weights=NULL,
       method=c("ratio", "reweight", "transform"),
       horvitz=FALSE,
       smoother=c("kernel", "local", "decreasing", "increasing",
                  "mountain", "valley", "piecewise"),
       subset=NULL, 
       do.CI=TRUE, 
       jitter=TRUE, jitterfactor=1, interpolate=TRUE,    
       nd=1000, eps=NULL, random=TRUE, 
       n = 512, bw = "nrd0", adjust=1, from = NULL, to = NULL,
       bwref=bw,
       covname, confidence=0.95, positiveCI, breaks=NULL)

Arguments

object

A point pattern on a linear network (object of class "lpp"), or a fitted point process model on a linear network (object of class "lppm").

covariate

Either a function(x,y) or a pixel image (object of class "im" or "linim") providing the values of the covariate at any location. Alternatively one of the strings "x" or "y" signifying the Cartesian coordinates.

weights

Optional weights attached to the data points. Either a numeric vector of weights for each data point, or a pixel image (object of class "im") or a function(x,y) providing the weights.

method

Character string determining the estimation method. See Details.

horvitz

Logical value indicating whether to use Horvitz-Thompson weights. See Details.

smoother

Character string determining the smoothing algorithm and the type of curve that will be estimated. See Details.

subset

Optional. A spatial window (object of class "owin") specifying a subset of the data, from which the estimate should be calculated.

do.CI

Logical value specifying whether to calculate standard errors and confidence bands.

jitter

Logical value. If jitter=TRUE (the default), the values of the covariate at the data points will be jittered (randomly perturbed by adding a small amount of noise) using the function jitter. If jitter=FALSE, the covariate values at the data points will not be altered. See the section on Randomisation and discretisation.

jitterfactor

Numeric value controlling the scale of jittering. Passed to jitter as the argument factor.

interpolate

Logical value specifying whether to use spatial interpolation to obtain the values of the covariate at the data points, when the covariate is a pixel image (object of class "im" or "linim"). If interpolate=FALSE, the covariate value for each data point is simply the value of the covariate image at the pixel centre that is nearest to the data point. If interpolate=TRUE, the covariate value for each data point is obtained by interpolating the nearest pixel values using interp.im.

eps, nd, random

Arguments controlling the pixel resolution at which the covariate will be evaluated. See Details.

bw

Smoothing bandwidth or bandwidth rule (passed to density.default).

adjust

Smoothing bandwidth adjustment factor (passed to density.default).

n, from, to

Arguments passed to density.default to control the number and range of values at which the function will be estimated.

bwref

Optional. An alternative value of bw to use when smoothing the reference density (the density of the covariate values observed at all locations in the window).

...

Additional arguments passed to density.default or locfit::locfit.

covname

Optional. Character string to use as the name of the covariate.

confidence

Confidence level for confidence intervals. A number between 0 and 1.

positiveCI

Logical value. If TRUE, confidence limits are always positive numbers; if FALSE, the lower limit of the confidence interval may sometimes be negative. Default is FALSE if smoother="kernel" and TRUE if smoother="local". See Details.

breaks

Breakpoints for the piecewise-constant function computed when smoother='piecewise'. Either a vector of numeric values specifying the breakpoints, or a single integer specifying the number of equally-spaced breakpoints. There is a sensible default.

Details

This command estimates the relationship between point process intensity and a given spatial covariate. Such a relationship is sometimes called a resource selection function (if the points are organisms and the covariate is a descriptor of habitat) or a prospectivity index (if the points are mineral deposits and the covariate is a geological variable). This command uses nonparametric methods which do not assume a particular form for the relationship.

If object is a point pattern, and baseline is missing or null, this command assumes that object is a realisation of a point process with intensity function \lambda(u) of the form

\lambda(u) = \rho(Z(u))

where Z is the spatial covariate function given by covariate, and \rho(z) is the resource selection function or prospectivity index. A nonparametric estimator of the function \rho(z) is computed.

If object is a point pattern, and baseline is given, then the intensity function is assumed to be

\lambda(u) = \rho(Z(u)) B(u)

where B(u) is the baseline intensity at location u. A nonparametric estimator of the relative intensity \rho(z) is computed.

If object is a fitted point process model, suppose X is the original data point pattern to which the model was fitted. Then this command assumes X is a realisation of a Poisson point process with intensity function of the form

\lambda(u) = \rho(Z(u)) \kappa(u)

where \kappa(u) is the intensity of the fitted model object. A nonparametric estimator of the relative intensity \rho(z) is computed.

The nonparametric estimation procedure is controlled by the arguments smoother, method and horvitz.

The argument smoother selects the type of estimation technique.

See Baddeley (2018) for a comparison of these estimation techniques for two-dimensional point patterns.

If the argument weights is present, then the contribution from each data point X[i] to the estimate of \rho is multiplied by weights[i].

If the argument subset is present, then the calculations are performed using only the data inside this spatial region.

This technique assumes that covariate has continuous values. It is not applicable to covariates with categorical (factor) values or discrete values such as small integers.

The argument covariate should be a pixel image, or a function, or one of the strings "x" or "y" signifying the cartesian coordinates. It will be evaluated on a fine grid of locations, with spatial resolution controlled by the arguments eps,nd,random. The argument nd specifies the total number of test locations on the linear network, eps specifies the linear separation between test locations, and random specifies whether the test locations have a randomised starting position.

Value

A function value table (object of class "fv") containing the estimated values of \rho (and confidence limits) for a sequence of values of Z. Also belongs to the class "rhohat" which has special methods for print, plot and predict.

Smooth estimates

Smooth estimators of \rho(z) were proposed by Baddeley and Turner (2005) and Baddeley et al (2012). Similar estimators were proposed by Guan (2008) and in the literature on relative distributions (Handcock and Morris, 1999).

The estimated function \rho(z) will be a smooth function of z.

The smooth estimation procedure involves computing several density estimates and combining them. The algorithm used to compute density estimates is determined by smoother:

The argument method determines how the density estimates will be combined to obtain an estimate of \rho(z):

If horvitz=TRUE, then the calculations described above are modified by using Horvitz-Thompson weighting. The contribution to the numerator from each data point is weighted by the reciprocal of the baseline value or fitted intensity value at that data point; and a corresponding adjustment is made to the denominator.

If do.CI=TRUE (the default), pointwise confidence intervals for the true value of \rho(z) are also calculated for each z, and will be plotted as grey shading. The confidence intervals are derived using the central limit theorem, based on variance calculations which assume a Poisson point process. If positiveCI=FALSE, the lower limit of the confidence interval may sometimes be negative, because the confidence intervals are based on a normal approximation to the estimate of \rho(z). If positiveCI=TRUE, the confidence limits are always positive, because the confidence interval is based on a normal approximation to the estimate of \log(\rho(z)). For consistency with earlier versions, the default is positiveCI=FALSE for smoother="kernel" and positiveCI=TRUE for smoother="local".

Monotone estimates

The nonparametric maximum likelihood estimator of a monotone function \rho(z) was described by Sager (1982). This method assumes that \rho(z) is either an increasing function of z, or a decreasing function of z. The estimated function will be a step function, increasing or decreasing as a function of z.

This estimator is chosen by specifying smoother="increasing" or smoother="decreasing". The argument method is ignored this case.

To compute the estimate of \rho(z), the algorithm first computes several primitive step-function estimates, and then takes the maximum of these primitive functions.

If smoother="decreasing", each primitive step function takes the form \rho(z) = \lambda when z \le t, and \rho(z) = 0 when z > t, where and \lambda is a primitive estimate of intensity based on the data for Z \le t. The jump location t will be the value of the covariate Z at one of the data points. The primitive estimate \lambda is the average intensity (number of points divided by area) for the region of space where the covariate value is less than or equal to t.

If horvitz=TRUE, then the calculations described above are modified by using Horvitz-Thompson weighting. The contribution to the numerator from each data point is weighted by the reciprocal of the baseline value or fitted intensity value at that data point; and a corresponding adjustment is made to the denominator.

Confidence intervals are not available for the monotone estimators.

Unimodal estimators

If smoother="valley" then we estimate a U-shaped function. A function \rho(z) is U-shaped if it is decreasing when z < z_0 and increasing when z > z_0, where z_0 is called the critical value. The nonparametric maximum likelihood estimate of such a function can be computed by profiling over z_0. The algorithm considers all possible candidate values of the critical value z_0, and estimates the function \rho(z) separately on the left and right of z_0 using the monotone estimators described above. These function estimates are combined into a single function, and the Poisson point process likelihood is computed. The optimal value of z_0 is the one which maximises the Poisson point process likelihood.

If smoother="mountain" then we estimate a function which has an inverted U shape. A function \rho(z) is inverted-U-shaped if it is increasing when z < z_0 and decreasing when z > z_0. The nonparametric maximum likelihood estimate of such a function can be computed by profiling over z_0 using the same technique mutatis mutandis.

Confidence intervals are not available for the unimodal estimators.

Randomisation

By default, rhohat adds a small amount of random noise to the data. This is designed to suppress the effects of discretisation in pixel images.

This strategy means that rhohat does not produce exactly the same result when the computation is repeated. If you need the results to be exactly reproducible, set jitter=FALSE and random=FALSE.

The values of the covariate at the data points are randomly perturbed by adding a small amount of noise using the function jitter. To reduce this effect, set jitterfactor to a number smaller than 1. To suppress this effect entirely, set jitter=FALSE.

The values of the covariate along the network are sampled at a regularly-spaced grid on the network. The grid starts from a random position on each segment of the network. To suppress this behaviour, set random=FALSE.

Author(s)

Smoothing algorithm by Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Ya-Mei Chang, Yong Song, and Rolf Turner rolfturner@posteo.net.

Nonparametric maximum likelihood algorithm by Adrian Baddeley Adrian.Baddeley@curtin.edu.au.

References

Baddeley, A., Chang, Y.-M., Song, Y. and Turner, R. (2012) Nonparametric estimation of the dependence of a point process on spatial covariates. Statistics and Its Interface 5 (2), 221–236.

Baddeley, A. and Turner, R. (2005) Modelling spatial point patterns in R. In: A. Baddeley, P. Gregori, J. Mateu, R. Stoica, and D. Stoyan, editors, Case Studies in Spatial Point Pattern Modelling, Lecture Notes in Statistics number 185. Pages 23–74. Springer-Verlag, New York, 2006. ISBN: 0-387-28311-0.

Baddeley, A. (2018) A statistical commentary on mineral prospectivity analysis. Chapter 2, pages 25–65 in Handbook of Mathematical Geosciences: Fifty Years of IAMG, edited by B.S. Daya Sagar, Q. Cheng and F.P. Agterberg. Springer, Berlin.

Guan, Y. (2008) On consistent nonparametric intensity estimation for inhomogeneous spatial point processes. Journal of the American Statistical Association 103, 1238–1247.

Handcock, M.S. and Morris, M. (1999) Relative Distribution Methods in the Social Sciences. Springer, New York.

Sager, T.W. (1982) Nonparametric maximum likelihood estimation of spatial patterns. Annals of Statistics 10, 1125–1136.

See Also

rho2hat, methods.rhohat, parres.

See lppm for a parametric method for the same problem.

Examples

  Y <- runiflpp(30, simplenet)
  rhoY <- rhohat(Y, "y")

  ## do spiders prefer to be in the middle of a segment?
  teepee <- linfun(function(x,y,seg,tp){ tp }, domain(spiders))
  rhotee <- rhohat(spiders, teepee)
  rhoteeM <- rhohat(spiders, teepee, smoother="mountain")
  if(interactive()) {
    plot(rhotee, main="Spider preference for mid-segment")
    plot(rhoteeM, add=TRUE, .y ~ .x, lwd=3)
  }

[Package spatstat.linnet version 3.2-1 Index]