kppm {spatstat.model} | R Documentation |
Fit Cluster or Cox Point Process Model
Description
Fit a homogeneous or inhomogeneous cluster process or Cox point process model to a point pattern.
Usage
kppm(X, ...)
## S3 method for class 'formula'
kppm(X,
clusters = c("Thomas","MatClust","Cauchy","VarGamma","LGCP"),
...,
data=NULL)
## S3 method for class 'ppp'
kppm(X,
trend = ~1,
clusters = c("Thomas","MatClust","Cauchy","VarGamma","LGCP"),
data = NULL,
...,
covariates=data,
subset,
method = c("mincon", "clik2", "palm", "adapcl"),
penalised = FALSE,
improve.type = c("none", "clik1", "wclik1", "quasi"),
improve.args = list(),
weightfun=NULL,
control=list(),
stabilize=TRUE,
algorithm,
trajectory=FALSE,
statistic="K",
statargs=list(),
rmax = NULL,
epsilon=0.01,
covfunargs=NULL,
use.gam=FALSE,
nd=NULL, eps=NULL,
ppm.improve.type=c("none", "ho", "enet"),
ppm.improve.args=list())
## S3 method for class 'quad'
kppm(X,
trend = ~1,
clusters = c("Thomas","MatClust","Cauchy","VarGamma","LGCP"),
data = NULL,
...,
covariates=data,
subset,
method = c("mincon", "clik2", "palm", "adapcl"),
penalised = FALSE,
improve.type = c("none", "clik1", "wclik1", "quasi"),
improve.args = list(),
weightfun=NULL,
control=list(),
stabilize=TRUE,
algorithm,
trajectory=FALSE,
statistic="K",
statargs=list(),
rmax = NULL,
epsilon=0.01,
covfunargs=NULL,
use.gam=FALSE,
nd=NULL, eps=NULL,
ppm.improve.type=c("none", "ho", "enet"),
ppm.improve.args=list())
Arguments
X |
A point pattern dataset (object of class |
trend |
An R formula, with no left hand side, specifying the form of the log intensity. |
clusters |
Character string determining the cluster model.
Partially matched.
Options are |
data , covariates |
The values of spatial covariates (other than the Cartesian coordinates) required by the model. A named list of pixel images, functions, windows, tessellations or numeric constants. |
... |
Additional arguments. See Details. |
subset |
Optional.
A subset of the spatial domain,
to which the model-fitting should be restricted.
A window (object of class |
method |
The fitting method. Either
|
penalised |
Logical value specifying whether the objective function (the composite likelihood or contrast) should be modified by adding a penalty against extreme values of cluster scale. |
improve.type |
Method for updating the initial estimate of the trend.
Initially the trend is estimated as if the process
is an inhomogeneous Poisson process.
The default, |
improve.args |
Additional arguments passed to |
weightfun |
Optional weighting function |
control |
List of control parameters passed to the optimization function
|
stabilize |
Logical value specifying whether to numerically stabilize
the optimization algorithm, by specifying suitable default values of
|
algorithm |
Character string determining the mathematical algorithm
to be used to solve the fitting problem.
If |
trajectory |
Logical value specifying whether to save the history of all function evaluations performed by the optimization algorithm. |
statistic |
Name of the summary statistic to be used
for minimum contrast estimation: either |
statargs |
Optional list of arguments to be used when calculating
the |
rmax |
Maximum value of interpoint distance to use in the composite likelihood. |
epsilon |
Tuning parameter for the adaptive composite likelihood method. |
covfunargs , use.gam , nd , eps |
Arguments passed to |
ppm.improve.type , ppm.improve.args |
Arguments controlling the initial fit of the trend.
Passed to |
Details
This function fits a clustered point process model to the
point pattern dataset X
.
The model may be either a Neyman-Scott cluster process
or another Cox process.
The type of model is determined by the argument clusters
.
Currently the options
are clusters="Thomas"
for the Thomas process,
clusters="MatClust"
for the Matern cluster process,
clusters="Cauchy"
for the Neyman-Scott cluster process
with Cauchy kernel,
clusters="VarGamma"
for the Neyman-Scott cluster process
with Variance Gamma kernel (requires an additional argument nu
to be passed through the dots; see rVarGamma
for details),
and clusters="LGCP"
for the log-Gaussian Cox process (may
require additional arguments passed through ...
; see
rLGCP
for details on argument names).
The first four models are Neyman-Scott cluster processes.
The algorithm first estimates the intensity function
of the point process using ppm
.
The argument X
may be a point pattern
(object of class "ppp"
) or a quadrature scheme
(object of class "quad"
). The intensity is specified by
the trend
argument.
If the trend formula is ~1
(the default)
then the model is homogeneous. The algorithm begins by
estimating the intensity as the number of points divided by
the area of the window.
Otherwise, the model is inhomogeneous.
The algorithm begins by fitting a Poisson process with log intensity
of the form specified by the formula trend
.
(See ppm
for further explanation).
The argument X
may also be a formula
in the
R language. The right hand side of the formula gives the
trend
as described above. The left hand side of the formula
gives the point pattern dataset to which the model should be fitted.
If improve.type="none"
this is the final estimate of the
intensity. Otherwise, the intensity estimate is updated, as explained in
improve.kppm
. Additional arguments to
improve.kppm
are passed as a named list in
improve.args
.
The cluster parameters of the model are then fitted either by minimum contrast estimation, or by a composite likelihood method (maximum composite likelihood, maximum Palm likelihood, or by solving the adaptive composite likelihood estimating equation).
- Minimum contrast:
-
If
method = "mincon"
(the default) clustering parameters of the model will be fitted by minimum contrast estimation, that is, by matching the theoreticalK
-function of the model to the empiricalK
-function of the data, as explained inmincontrast
.For a homogeneous model (
trend = ~1
) the empiricalK
-function of the data is computed usingKest
, and the parameters of the cluster model are estimated by the method of minimum contrast.For an inhomogeneous model, the inhomogeneous
K
function is estimated byKinhom
using the fitted intensity. Then the parameters of the cluster model are estimated by the method of minimum contrast using the inhomogeneousK
function. This two-step estimation procedure is due to Waagepetersen (2007).If
statistic="pcf"
then instead of using theK
-function, the algorithm will use the pair correlation functionpcf
for homogeneous models and the inhomogeneous pair correlation functionpcfinhom
for inhomogeneous models. In this case, the smoothing parameters of the pair correlation can be controlled using the argumentstatargs
, as shown in the Examples.Additional arguments
...
will be passed toclusterfit
to control the minimum contrast fitting algorithm.The optimisation is performed by the generic optimisation algorithm
optim
. - Second order composite likelihood:
-
If
method = "clik2"
the clustering parameters of the model will be fitted by maximising the second-order composite likelihood (Guan, 2006). The log composite likelihood is\sum_{i,j} w(d_{ij}) \log\rho(d_{ij}; \theta) - \left( \sum_{i,j} w(d_{ij}) \right) \log \int_D \int_D w(\|u-v\|) \rho(\|u-v\|; \theta)\, du\, dv
where the sums are taken over all pairs of data points
x_i, x_j
separated by a distanced_{ij} = \| x_i - x_j\|
less thanrmax
, and the double integral is taken over all pairs of locationsu,v
in the spatial window of the data. Here\rho(d;\theta)
is the pair correlation function of the model with cluster parameters\theta
.The function
w
in the composite likelihood is a weighting function and may be chosen arbitrarily. It is specified by the argumentweightfun
. If this is missing orNULL
then the default is a threshold weight function,w(d) = 1(d \le R)
, whereR
isrmax/2
. If it is specified, the argumentweightfun
should be afunction
in the R language with one argument. Alternativelyweightfun
may be one of the strings"threshold"
or"taper"
representing the functionsw(d) = 1(d \le R)
andw(d) = min(1, R/d)
respectively.The optimisation is performed by the generic optimisation algorithm
optim
. - Palm likelihood:
-
If
method = "palm"
the clustering parameters of the model will be fitted by maximising the Palm loglikelihood (Tanaka et al, 2008)\sum_{i,j} w(x_i, x_j) \log \lambda_P(x_j \mid x_i; \theta) - \int_D w(x_i, u) \lambda_P(u \mid x_i; \theta) {\rm d} u
with the same notation as above. Here
\lambda_P(u|v;\theta)
is the Palm intensity of the model at locationu
given there is a point atv
.The optimisation is performed by the generic optimisation algorithm
optim
. - Adaptive Composite likelihood:
-
If
method = "cladap"
the clustering parameters of the model will be fitted by solving the adaptive second order composite likelihood estimating equation (Lavancier et al, 2021). The estimating function is\sum_{u, v} w(\epsilon \frac{| g(0; \theta) - 1 |}{g(\|u-v\|; \theta)-1}) \frac{\nabla_\theta g(\|u-v\|;\theta)}{g(\|u-v\|;\theta)} - \int_D \int_D w(\epsilon \frac{ | g(u,v; \theta) - 1|}{g(\|u-v\|; \theta)-1}) \nabla_\theta g(\|u-v\|; \theta) \rho(u) \rho(v)\, du\, dv
where the sum is taken over all distinct pairs of points. Here
g(d;\theta)
is the pair correlation function with parameters\theta
. The partial derivative with respect to\theta
isg'(d; \theta)
, and\rho(u)
denotes the fitted intensity function of the model.The tuning parameter
\epsilon
is independent of the data. It can be specified by the argumentepsilon
and has default value0.01
.The function
w
in the estimating function is a weighting function of bounded support[-1,1]
. It is specified by the argumentweightfun
. If this is missing orNULL
then the default isw(d) = 1(\|d\| \le 1) \exp(1/(r^2-1))
The estimating equation is solved using the nonlinear equation solvernleqslv
from the package nleqslv. The package nleqslv must be installed in order to use this option.
If penalised=TRUE
, the fitting procedure is modified by
adding a penalty against extreme values of the cluster scale,
as proposed by Baddeley et al (2022).
If trajectory=TRUE
, the resulting object contains the history
of all points in the cluster parameter space which were evaluated by
the optimization algorithm. The trajectory can be extracted by
traj(fit)
or traj(obsurf(fit))
where fit
is the
fitted model object.
Value
An object of class "kppm"
representing the fitted model.
There are methods for printing, plotting, predicting, simulating
and updating objects of this class.
Cluster parameters for Neyman-Scott models
For Neyman-Scott models, the fitting procedure searches for the best-fitting values of the parameters that control the intensity of parents and the physical scale of the clusters. (Any parameters that control the shape of the clusters must be specified separately and are assumed to be fixed.)
The fitted object fit
contains the fitted cluster parameters as
the element fit$par
in the format described below.
Initial estimates for these cluster
parameters can be specified using the argument startpar
in the
same format.
The cluster parameters will be stored in a named numeric vector
par
of length 2. The first value is always kappa
,
the intensity of parents (cluster centres).
The format is as follows:
-
for
clusters="Thomas"
, a vectorc(kappa, sigma2)
wheresigma2
is the square of the cluster standard deviation; -
for
clusters="MatClust"
, a vectorc(kappa, R)
whereR
is the radius of the cluster; -
for
clusters="Cauchy"
, a vectorc(kappa, eta2)
whereeta2 = code{4 * scale^2}
wherescale
is the scale parameter for the model as used inrCauchy
; -
for
clusters="VarGamma"
, a vectorc(kappa, eta)
whereeta
is equivalent to the scale parameteromega
used inrVarGamma
.
For clusters="VarGamma"
it will be necessary to specify
the shape parameter nu
as described in the help for
rVarGamma
. This is specified separately as an argument
nu
in the call to kppm
.
Optimization algorithm
The following details allow greater control over the fitting procedure.
For the first three fitting methods
(method="mincon", "clik2"
and "palm"
),
the optimisation is performed by the generic
optimisation algorithm optim
.
The behaviour of this algorithm can be controlled
by the following arguments to kppm
:
-
startpar
determines the initial estimates of the cluster parameters. -
algorithm
determines the particular optimization method. This argument is passed tooptim
as the argumentmethod
. Options are listed in the help foroptim
. The default is the Nelder-Mead simplex method. -
control
is a named list of control parameters, documented in the help foroptim
. Useful control arguments includetrace
,maxit
andabstol
. -
lower
andupper
specify bounds for the cluster parameters, whenalgorithm="L-BFGS-B"
oralgorithm="Brent"
, as described in the help foroptim
.
For method="adapcl"
, the estimating equation is solved
using the nonlinear equation solver nleqslv
from the package nleqslv.
The package nleqslv must be installed in order to use this
option.
The behaviour of this algorithm can be controlled
by the following arguments to kppm
:
-
startpar
determines the initial estimates of the cluster parameters. -
algorithm
determines the method for solving the equation. This argument is passed tonleqslv
as the argumentmethod
. Options are listed in the help fornleqslv
. -
globStrat
determines the global strategy to be applied. This argument is is passed tonleqslv
as the argumentglobal
. Options are listed in the help fornleqslv
. -
control
is a named list of control parameters, documented in the help fornleqslv
.
Log-Gaussian Cox Models
To fit a log-Gaussian Cox model,
specify clusters="LGCP"
and use additional arguments
to specify the covariance structure. These additional arguments can
be given individually in the call to kppm
, or they can be
collected together in a list called covmodel
.
For example a Matern model with parameter \nu=0.5
could be specified
either by kppm(X, clusters="LGCP", model="matern", nu=0.5)
or by
kppm(X, clusters="LGCP", covmodel=list(model="matern", nu=0.5))
.
The argument model
specifies the type of covariance
model: the default is model="exp"
for an exponential covariance.
Additional arguments specify the shape parameters of the covariance
model. For example if model="matern"
then the additional argument
nu
is required.
The available models are as follows:
model="exponential"
:-
the exponential covariance function
C(r) = \sigma^2 \exp(-r/h)
where
\sigma^2
is the (fitted) variance parameter, andh
is the (fitted) scale parameter. No shape parameters are required. model="gauss"
:-
the Gaussian covariance function
C(r) = \sigma^2 \exp(-(r/h)^2)
where
\sigma^2
is the (fitted) variance parameter, andh
is the (fitted) scale parameter. No shape parameters are required. model="stable"
:-
the stable covariance function
C(r) = \sigma^2 \exp(-(r/h)^\alpha)
where
\sigma^2
is the (fitted) variance parameter,h
is the (fitted) scale parameter, and\alpha
is the shape parameteralpha
. The parameteralpha
must be given, either as a stand-alone argument, or as an entry in the listcovmodel
. model="gencauchy"
:-
the generalised Cauchy covariance function
C(r) = \sigma^2 (1 + (x/h)^\alpha)^{-\beta/\alpha}
where
\sigma^2
is the (fitted) variance parameter,h
is the (fitted) scale parameter, and\alpha
and\beta
are the shape parametersalpha
andbeta
. The parametersalpha
andbeta
must be given, either as stand-alone arguments, or as entries in the listcovmodel
. model="matern"
:-
the Whittle-Matern covariance function
C(r) = \sigma^2 \frac{1}{2^{\nu-1} \Gamma(\nu)} (\sqrt{2 \nu} \, r/h)^\nu K_\nu(\sqrt{2\nu}\, r/h)
where
\sigma^2
is the (fitted) variance parameter,h
is the (fitted) scale parameter, and\nu
is the shape parameternu
. The parameternu
must be given, either as a stand-alone argument, or as an entry in the listcovmodel
.
Note that it is not possible to use anisotropic covariance models
because the kppm
technique assumes the pair correlation function
is isotropic.
Error and warning messages
See ppm.ppp
for a list of common error messages
and warnings originating from the first stage of model-fitting.
Author(s)
Adrian Baddeley Adrian.Baddeley@curtin.edu.au, Rolf Turner rolfturner@posteo.net and Ege Rubak rubak@math.aau.dk, with contributions from Abdollah Jalilian jalilian@razi.ac.ir and Rasmus Plenge Waagepetersen rw@math.auc.dk. Adaptive composite likelihood method contributed by Chiara Fend and modified by Adrian Baddeley. Penalised optimization developed by Adrian Baddeley, Tilman Davies Tilman.Davies@otago.ac.nz and Martin Hazelton Martin.Hazelton@otago.ac.nz.
References
Baddeley, A., Davies, T.M., Hazelton, M.L., Rakshit, S. and Turner,
R. (2022) Fundamental problems in fitting spatial cluster
process models. Spatial Statistics 52, 100709.
DOI: 10.1016/j.spasta.2022.100709
Guan, Y. (2006) A composite likelihood approach in fitting spatial point process models. Journal of the American Statistical Association 101, 1502–1512.
Guan, Y., Jalilian, A. and Waagepetersen, R. (2015) Quasi-likelihood for spatial point processes. Journal of the Royal Statistical Society, Series B 77, 677-697.
Jalilian, A., Guan, Y. and Waagepetersen, R. (2012) Decomposition of variance for spatial Cox processes. Scandinavian Journal of Statistics 40, 119–137.
Lavancier, F., Poinas, A., and Waagepetersen, R. (2021) Adaptive estimating function inference for nonstationary determinantal point processes. Scandinavian Journal of Statistics, 48 (1), 87–107.
Tanaka, U. and Ogata, Y. and Stoyan, D. (2008) Parameter estimation and model selection for Neyman-Scott point processes. Biometrical Journal 50, 43–57.
Waagepetersen, R. (2007) An estimating function approach to inference for inhomogeneous Neyman-Scott processes. Biometrics 63, 252–258.
See Also
Methods for kppm
objects:
plot.kppm
,
fitted.kppm
,
predict.kppm
,
simulate.kppm
,
update.kppm
,
vcov.kppm
,
methods.kppm
,
as.ppm.kppm
,
as.fv.kppm
,
Kmodel.kppm
,
pcfmodel.kppm
.
See also improve.kppm
for improving the fit of a
kppm
object.
Minimum contrast fitting algorithm:
higher level interface clusterfit
;
low-level algorithm mincontrast
.
Alternative fitting algorithms:
thomas.estK
,
matclust.estK
,
lgcp.estK
,
cauchy.estK
,
vargamma.estK
,
thomas.estpcf
,
matclust.estpcf
,
lgcp.estpcf
,
cauchy.estpcf
,
vargamma.estpcf
.
Summary statistics:
Kest
,
Kinhom
,
pcf
,
pcfinhom
.
For fitting Poisson or Gibbs point process models, see ppm
.
Examples
online <- interactive()
if(!online) op <- spatstat.options(npixel=32, ndummy.min=16)
# method for point patterns
kppm(redwood, ~1, "Thomas")
# method for formulas
kppm(redwood ~ 1, "Thomas")
# different models for clustering
if(online) kppm(redwood ~ x, "MatClust")
kppm(redwood ~ x, "MatClust", statistic="pcf", statargs=list(stoyan=0.2))
kppm(redwood ~ x, cluster="Cauchy", statistic="K")
kppm(redwood, cluster="VarGamma", nu = 0.5, statistic="pcf")
# log-Gaussian Cox process (LGCP) models
kppm(redwood ~ 1, "LGCP", statistic="pcf")
kppm(redwood ~ x, "LGCP", statistic="pcf",
model="matern", nu=0.3,
control=list(maxit=10))
# Different fitting techniques
fitc <- kppm(redwood ~ 1, "Thomas", method="c")
fitp <- kppm(redwood ~ 1, "Thomas", method="p")
# penalised fit
fitmp <- kppm(redwood ~ 1, "Thomas", penalised=TRUE)
# quasi-likelihood improvement
fitq <- kppm(redwood ~ x, "Thomas", improve.type = "quasi")
if(!online) spatstat.options(op)