| ppm {spatstat.model} | R Documentation | 
Fit Point Process Model to Data
Description
Fits a point process model to an observed point pattern.
Usage
   ppm(Q, ...)
   ## S3 method for class 'formula'
ppm(Q, interaction=NULL, ..., data=NULL, subset)
Arguments
| Q | A  | 
| interaction | An object of class  | 
| ... | Arguments passed to  | 
| data | Optional. The values of spatial covariates (other than the Cartesian coordinates) required by the model. Either a data frame, or a list whose entries are images, functions, windows, tessellations or single numbers. See Details. | 
| subset | Optional.
An expression (which may involve the names of the
Cartesian coordinates  | 
Details
This function fits a point process model to an observed point pattern. The model may include spatial trend, interpoint interaction, and dependence on covariates.
The model fitted by ppm
is either a Poisson point process (in which different points
do not interact with each other) or a Gibbs point process (in which
different points typically inhibit each other).
For clustered point process models, use kppm.
The function ppm is generic, with methods for
the classes formula, ppp and quad.
This page describes the method for a formula.
The first argument is a formula in the R language
describing the spatial trend model to be fitted. It has the general form
pattern ~ trend where the left hand side pattern is usually
the name of a spatial point pattern (object of class "ppp")
to which the model should be fitted, or an expression which evaluates
to a point pattern;
and the right hand side trend is an expression specifying the
spatial trend of the model.
Systematic effects (spatial trend and/or dependence on 
spatial covariates) are specified by the 
trend expression on the right hand side of the formula.
The trend may involve
the Cartesian coordinates x, y,
the marks marks,
the names of entries in the argument data (if supplied),
or the names of objects that exist in the R session.
The trend formula specifies the logarithm of the
intensity of a Poisson process, or in general, the logarithm of
the first order potential of the Gibbs process.
The formula should not use any names beginning with .mpl
as these are reserved for internal use.
If the formula is pattern~1, then
the model to be fitted is stationary (or at least, its first order 
potential is constant).
The symbol . in the trend expression stands for
all the covariates supplied in the argument data.
For example the formula pattern ~ . indicates an additive
model with a main effect for each covariate in data.
Stochastic interactions between random points of the point process
are defined by the argument interaction. This is an object of
class "interact" which is initialised in a very similar way to the
usage of family objects in glm and gam.
The interaction models currently available are:
AreaInter, BadGey, Concom, DiggleGatesStibbard, DiggleGratton, Fiksel, Geyer, Hardcore, HierHard, HierStrauss, HierStraussHard, Hybrid, LennardJones, MultiHard, MultiStrauss, MultiStraussHard, OrdThresh, Ord, Pairwise, PairPiece, Penttinen, Poisson, Saturated, SatPiece, Softcore, Strauss, StraussHard and Triplets.
See the examples below.
Note that it is possible to combine several interactions
using Hybrid.
If interaction is missing or NULL,
then the model to be fitted
has no interpoint interactions, that is, it is a Poisson process
(stationary or nonstationary according to trend). In this case
the methods of maximum pseudolikelihood and maximum logistic likelihood
coincide with maximum likelihood. 
The fitted point process model returned by this function can be printed 
(by the print method print.ppm)
to inspect the fitted parameter values.
If a nonparametric spatial trend was fitted, this can be extracted using
the predict method predict.ppm.
To fit a model involving spatial covariates
other than the Cartesian coordinates x and y,
the values of the covariates should either be supplied in the
argument data, or should be stored in objects that exist
in the R session.
Note that it is not sufficient to have observed
the covariate only at the points of the data point pattern; 
the covariate must also have been observed at other 
locations in the window.
If it is given, the argument data is typically
a list, with names corresponding to variables in the trend formula.
Each entry in the list is either
- a pixel image,
- 
giving the values of a spatial covariate at a fine grid of locations. It should be an object of class "im", seeim.object.
- a function,
- 
which can be evaluated at any location (x,y)to obtain the value of the spatial covariate. It should be afunction(x, y)orfunction(x, y, ...)in the R language. For marked point pattern data, the covariate can be afunction(x, y, marks)orfunction(x, y, marks, ...). The first two arguments of the function should be the Cartesian coordinatesxandy. The function may have additional arguments; if the function does not have default values for these additional arguments, then the user must supply values for them, incovfunargs. See the Examples.
- a window,
- 
interpreted as a logical variable which is TRUEinside the window andFALSEoutside it. This should be an object of class"owin".
- a tessellation,
- 
interpreted as a factor covariate. For each spatial location, the factor value indicates which tile of the tessellation it belongs to. This should be an object of class "tess". (To make a covariate in which each tile of the tessellation has a numerical value, convert the tessellation to afunction(x,y)usingas.function.tess.)
- a single number,
- indicating a covariate that is constant in this dataset. 
The software will look up the values of each covariate at the required locations (quadrature points).
Note that, for covariate functions, only the name of the function appears in the trend formula. A covariate function is treated as if it were a single variable. The function arguments do not appear in the trend formula. See the Examples.
If data is a list,
the list entries should have names corresponding to
(some of) the names of covariates in the model formula trend.
The variable names x, y and marks
are reserved for the Cartesian 
coordinates and the mark values,
and these should not be used for variables in data.
Alternatively, data may be a data frame
giving the values of the covariates at specified locations.
Then pattern should be a quadrature scheme (object of class
"quad") giving the corresponding locations.
See ppm.quad for details.
Value
An object of class "ppm" describing a fitted point process
model.
See ppm.object for details of the format of this object
and methods available for manipulating it.
Interaction parameters
Apart from the Poisson model, every point process model fitted by
ppm has parameters that determine the strength and
range of ‘interaction’ or dependence between points.
These parameters are of two types:
- regular parameters:
- 
A parameter \phiis called regular if the log likelihood is a linear function of\thetawhere\theta = \theta(\psi)is some transformation of\psi. [Then\thetais called the canonical parameter.]
- irregular parameters
- 
Other parameters are called irregular. 
Typically, regular parameters determine the ‘strength’
of the interaction, while irregular parameters determine the
‘range’ of the interaction. For example, the Strauss process
has a regular parameter \gamma controlling the strength
of interpoint inhibition, and an irregular parameter r
determining the range of interaction.
The ppm command is only designed to estimate regular
parameters of the interaction.
It requires the values of any irregular parameters of the interaction
to be fixed. For example, to fit a Strauss process model to the cells
dataset, you could type ppm(cells ~ 1, Strauss(r=0.07)).
Note that the value of the irregular parameter r must be given.
The result of this command will be a fitted model in which the
regular parameter \gamma has been estimated.
To determine the irregular parameters, there are several
practical techniques, but no general statistical theory available.
Useful techniques include maximum profile pseudolikelihood, which
is implemented in the command profilepl,
and Newton-Raphson maximisation, implemented in the
experimental command ippm. 
Some irregular parameters can be estimated directly from data:
the hard-core radius in the model Hardcore
and the matrix of hard-core radii in MultiHard can be
estimated easily from data. In these cases, ppm allows the user
to specify the interaction without giving
the value of the irregular parameter. The user can give the
hard core interaction as interaction=Hardcore()
or even interaction=Hardcore, and 
the hard core radius will then be estimated from the data.
Technical Warnings and Error Messages
See ppm.ppp for some technical warnings about the
weaknesses of the algorithm, and explanation of some common error messages.
Author(s)
Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Rolf Turner rolfturner@posteo.net and Ege Rubak rubak@math.aau.dk
References
Baddeley, A., Coeurjolly, J.-F., Rubak, E. and Waagepetersen, R. (2014) Logistic regression for spatial Gibbs point processes. Biometrika 101 (2) 377–392.
Baddeley, A. and Turner, R. (2000) Practical maximum pseudolikelihood for spatial point patterns. Australian and New Zealand Journal of Statistics 42 283–322.
Berman, M. and Turner, T.R. (1992) Approximating point process likelihoods with GLIM. Applied Statistics 41, 31–38.
Besag, J. (1975) Statistical analysis of non-lattice data. The Statistician 24, 179-195.
Diggle, P.J., Fiksel, T., Grabarnik, P., Ogata, Y., Stoyan, D. and Tanemura, M. (1994) On parameter estimation for pairwise interaction processes. International Statistical Review 62, 99-117.
Huang, F. and Ogata, Y. (1999) Improvements of the maximum pseudo-likelihood estimators in various spatial statistical models. Journal of Computational and Graphical Statistics 8, 510–530.
Jensen, J.L. and Moeller, M. (1991) Pseudolikelihood for exponential family models of spatial point processes. Annals of Applied Probability 1, 445–461.
Jensen, J.L. and Kuensch, H.R. (1994) On asymptotic normality of pseudo likelihood estimates for pairwise interaction processes, Annals of the Institute of Statistical Mathematics 46, 475–486.
See Also
ppm.ppp and ppm.quad for
more details on the fitting technique and edge correction.
ppm.object for details of how to
print, plot and manipulate a fitted model.
ppp and quadscheme
for constructing data.
Interactions: AreaInter, BadGey, Concom, DiggleGatesStibbard, DiggleGratton, Fiksel, Geyer, Hardcore, HierHard, HierStrauss, HierStraussHard, Hybrid, LennardJones, MultiHard, MultiStrauss, MultiStraussHard, OrdThresh, Ord, Pairwise, PairPiece, Penttinen, Poisson, Saturated, SatPiece, Softcore, Strauss, StraussHard and Triplets.
See profilepl for advice on
fitting nuisance parameters in the interaction,
and ippm for irregular parameters in the trend.
See valid.ppm and project.ppm for
ensuring the fitted model is a valid point process.
See kppm for fitting Cox point process models
and cluster point process models, and dppm for fitting
determinantal point process models.
Examples
 online <- interactive()
 if(!online) {
    # reduce grid sizes for efficiency in tests
    spatstat.options(npixel=32, ndummy.min=16)
 }
 # fit the stationary Poisson process
 # to point pattern 'nztrees'
 ppm(nztrees ~ 1)
 if(online) {
   Q <- quadscheme(nztrees) 
   ppm(Q ~ 1) 
   # equivalent.
 }
 fit1 <- ppm(nztrees ~ x)
 # fit the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(a + bx)
 # where x,y are the Cartesian coordinates
 # and a,b are parameters to be estimated
 fit1
 coef(fit1)
 coef(summary(fit1))
 ppm(nztrees ~ polynom(x,2))
 # fit the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(a + bx + cx^2)
 if(online) {
   library(splines)
   ppm(nztrees ~ bs(x,df=3))
 }
 # Fits the nonstationary Poisson process 
 # with intensity function lambda(x,y) = exp(B(x))
 # where B is a B-spline with df = 3
 ppm(nztrees ~ 1, Strauss(r=10), rbord=10)
 # Fit the stationary Strauss process with interaction range r=10
 # using the border method with margin rbord=10
 ppm(nztrees ~ x, Strauss(13), correction="periodic")
 # Fit the nonstationary Strauss process with interaction range r=13
 # and exp(first order potential) =  activity = beta(x,y) = exp(a+bx)
 # using the periodic correction.
  # Compare Maximum Pseudolikelihood, Huang-Ogata and Variational Bayes fits:
  if(online) ppm(swedishpines ~ 1, Strauss(9))
  ppm(swedishpines ~ 1, Strauss(9), method="VBlogi")
  ppm(swedishpines ~ 1, Strauss(9), improve.type="ho",
      nsim=if(!online) 8 else 99)
  # Elastic net fit:
  if(require(glmnet)) {
    ppm(swedishpines ~ x+y, Strauss(9), improve.type="enet")
  }
 # COVARIATES
 #
 X <- rpoispp(20)
 weirdfunction <- function(x,y){ 10 * x^2 + 5 * sin(10 * y) }
 #
 # (a) covariate values as function
 ppm(X ~ y + weirdfunction)
 #
 # (b) covariate values in pixel image
 Zimage <- as.im(weirdfunction, unit.square())
 ppm(X ~ y + Z, covariates=list(Z=Zimage))
 #
 # (c) covariate values in data frame
 Q <- quadscheme(X)
 xQ <- x.quad(Q)
 yQ <- y.quad(Q)
 Zvalues <- weirdfunction(xQ,yQ)
 ppm(Q ~  y + Z, data=data.frame(Z=Zvalues))
 # Note Q not X
 # COVARIATE FUNCTION WITH EXTRA ARGUMENTS
 #
f <- function(x,y,a){ y - a }
ppm(X ~ x + f, covfunargs=list(a=1/2))
 # COVARIATE: logical value TRUE inside window, FALSE outside
 b <- owin(c(0.1, 0.6), c(0.1, 0.9))
 ppm(X ~ b)
 ## MULTITYPE POINT PROCESSES ### 
 # fit stationary marked Poisson process
 # with different intensity for each species
 if(online) {
   ppm(lansing ~  marks, Poisson())
 } else {
  ama <- amacrine[square(0.7)]
  a <- ppm(ama ~  marks, Poisson(), nd=16)
 }
 # fit nonstationary marked Poisson process
 # with different log-cubic trend for each species
 if(online) {
  ppm(lansing ~  marks * polynom(x,y,3), Poisson())
 } else {
  b <- ppm(ama ~  marks * polynom(x,y,2), Poisson(), nd=16)
 }