LSCV.risk {sparr} | R Documentation |
Jointly optimal bandwidth selection for the spatial relative risk function
Description
Methods to find a jointly optimal, common case-control isotropic bandwidth for use in estimation of the fixed or adaptive kernel-smoothed relative risk function.
Usage
LSCV.risk(
f,
g = NULL,
hlim = NULL,
hseq = NULL,
type = c("fixed", "adaptive"),
method = c("kelsall-diggle", "hazelton", "davies"),
resolution = 64,
edge = TRUE,
hp = NULL,
pilot.symmetry = c("none", "f", "g", "pooled"),
auto.optim = TRUE,
seqres = 30,
parallelise = NA,
verbose = TRUE,
...
)
Arguments
f |
Either a pre-calculated object of class |
g |
As for |
hlim |
An optional vector of length 2 giving the limits of the optimisation routine with respect to the bandwidth. If unspecified, the function attempts to choose this automatically. |
hseq |
An optional increasing sequence of bandwidth values at which to
manually evaluate the optimisation criterion. Used only in the case
|
type |
A character string; |
method |
A character string controlling the selector to use. There are
three types, based on either the mean integrated squared error (MISE)
(Kelsall and Diggle, 1995; default – |
resolution |
Spatial grid size; the optimisation will be based on a
[ |
edge |
Logical value indicating whether to edge-correct the density estimates used. |
hp |
A single numeric value or a vector of length 2 giving the pilot
bandwidth(s) to be used for estimation of the pilot
densities for adaptive risk surfaces. Ignored if |
pilot.symmetry |
A character string used to control the type of
symmetry, if any, to use for the bandwidth factors when computing an
adaptive relative risk surface. See ‘Details’. Ignored if |
auto.optim |
Logical value indicating whether to automate the numerical
optimisation using |
seqres |
Optional resolution of an increasing sequence of bandwidth
values. Only used if |
parallelise |
Numeric argument to invoke parallel processing, giving
the number of CPU cores to use when |
verbose |
Logical value indicating whether to provide function progress commentary. |
... |
Additional arguments such as |
Details
Given the established preference for using a common bandwidth for both case
and control density estimates when constructing a relative
risk surface, this function calculates a ‘jointly optimal’, common isotropic
LSCV bandwidth for the (Gaussian) kernel-smoothed relative risk function
(case-control density-ratio). It can be shown that choosing a bandwidth that
is equal for both case and control density estimates is preferable to
computing ‘separately optimal’ bandwidths (Kelsall and Diggle, 1995). The user
can choose to calculate either a common smoothing parameter for a fixed-bandwidth
relative risk surface (type = "fixed"; default), or a common global bandwidth for
an adaptive risk surface (type = "adaptive"). See further comments below.
method = "kelsall-diggle": the function computes the common bandwidth which minimises the approximate mean integrated squared error (MISE) of the log-transformed risk surface (Kelsall and Diggle, 1995).
method = "hazelton": the function minimises a weighted-by-control MISE of the (raw) relative risk function (Hazelton, 2008).
method = "davies": the optimal bandwidth is one that minimises a crude plug-in approximation to the asymptotic MISE (Davies, 2013). Only possible for type = "fixed".
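As a rough illustration of how the three selectors can be compared on the same data, the sketch below loops over the documented method values for a fixed-bandwidth surface. It assumes the sparr package is installed and uses the pbc data set from the Examples; the object names selectors and h_common are purely illustrative.

```r
# Minimal sketch: compare the three common-bandwidth selectors (fixed type).
# Assumes the 'sparr' package is installed; 'pbc' is the data set used
# in the Examples section of this page.
library(sparr)

data(pbc)
pbccas <- split(pbc)$case     # case point pattern
pbccon <- split(pbc)$control  # control point pattern

# The three documented selectors; "davies" is only available for type = "fixed"
selectors <- c("kelsall-diggle", "hazelton", "davies")

# One jointly optimal bandwidth per selector (auto.optim = TRUE by default,
# so each call returns a single numeric value)
h_common <- sapply(selectors, function(m) {
  LSCV.risk(pbccas, pbccon, type = "fixed", method = m, verbose = FALSE)
})

print(h_common)
```

The selectors target different criteria, so the returned bandwidths need not agree; inspecting them side by side is one way to gauge how sensitive the resulting risk surface might be to the choice of method.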
For jointly optimal, common global bandwidth selection when type = "adaptive", the
optimisation routine utilises multiscale.density. Like LSCV.density,
the leave-one-out procedure does not affect the pilot density, for which additional
control is offered via the hp and pilot.symmetry arguments. The user has the option of
obtaining a so-called symmetric estimate (Davies et al., 2016) via
pilot.symmetry. This amounts to choosing the same pilot density for
both case and control densities. By choosing "none" (default), the
result uses the case and control data separately for the fixed-bandwidth
pilots, providing the original asymmetric density-ratio of Davies and
Hazelton (2010). By selecting either of "f", "g", or "pooled", the pilot
density is calculated based on the case, control,
or pooled case/control data respectively (using hp[1] as the fixed
bandwidth). Davies et al. (2016) noted some beneficial practical behaviour
of the symmetric adaptive surface over the asymmetric. (The pilot bandwidth(s),
if not supplied in hp, are calculated
internally via default use of LSCV.density, using the requested symmetric-based
data set, or separately with respect to the case and control datasets f and g if
pilot.symmetry = "none".)
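The following sketch shows one way the selected global bandwidth might be carried through to an adaptive, pooled-symmetric risk estimate. It assumes sparr is installed, and it assumes that sparr's risk function accepts h0, hp, adapt and pilot.symmetry arguments in the way used here (that function is not described on this page, so check its own help file). The names hp_pooled, h0_sym and rr_sym are illustrative only.

```r
# Minimal sketch: select a common global bandwidth for a symmetric
# (pooled-pilot) adaptive surface, then reuse the same settings when
# estimating the surface itself. Assumes 'sparr' is installed.
library(sparr)

data(pbc)
pbccas <- split(pbc)$case
pbccon <- split(pbc)$control

# Pilot bandwidth from the pooled data via the oversmoothing rule OS();
# supplying hp avoids the internal LSCV.density pilot search
hp_pooled <- OS(pbc)

# Jointly optimal global bandwidth with a pooled pilot density
h0_sym <- LSCV.risk(pbccas, pbccon, type = "adaptive",
                    hp = hp_pooled, pilot.symmetry = "pooled",
                    verbose = FALSE)

# ASSUMPTION: risk() takes these arguments as shown; see ?risk
rr_sym <- risk(pbccas, pbccon, h0 = h0_sym, hp = hp_pooled,
               adapt = TRUE, pilot.symmetry = "pooled", verbose = FALSE)

print(h0_sym)
```

Keeping hp and pilot.symmetry identical between the selection step and the estimation step means the bandwidth is used under the same pilot configuration for which it was chosen.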
Value
A single numeric value of the estimated bandwidth (if
auto.optim = TRUE). Otherwise, a list of two numeric vectors of equal
length giving the bandwidth sequence (as hs) and corresponding CV
function value (as CV).
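To make the auto.optim = FALSE return value concrete, the sketch below evaluates the criterion over the default bandwidth sequence and picks out the grid minimiser by hand. It assumes sparr is installed. Because the returned object is described above as two equal-length vectors but is indexed like a two-column matrix in the Examples, the code first coerces it with as.data.frame and then accesses the columns by position; that coercion and the names cv_tab, hs, cv and h_grid are assumptions made for illustration.

```r
# Minimal sketch: inspect the criterion values returned when
# auto.optim = FALSE. Assumes 'sparr' is installed.
library(sparr)

data(pbc)
pbccas <- split(pbc)$case
pbccon <- split(pbc)$control

cv_out <- LSCV.risk(pbccas, pbccon, auto.optim = FALSE, verbose = FALSE)

# Coerce so the code works whether the result is a list of two vectors
# or a two-column matrix (illustrative safeguard, not documented behaviour)
cv_tab <- as.data.frame(cv_out)
hs <- cv_tab[[1]]  # bandwidth sequence
cv <- cv_tab[[2]]  # corresponding criterion values

# Bandwidth on the grid with the smallest criterion value
h_grid <- hs[which.min(cv)]

plot(hs, cv, type = "b", xlab = "h", ylab = "CV")
abline(v = h_grid, col = 2, lty = 2)
```

Plotting the whole curve can reveal a flat or boundary minimum that a single optimised value would hide.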
Warning
The jointly optimal bandwidth selector can be
computationally expensive for large data sets and fine evaluation grid
resolutions. The user may need to experiment with adjusting hlim to
find a suitable minimum.
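One way to keep a first exploratory run cheap is sketched below: it coarsens the evaluation grid through resolution and supplies explicit search limits through hlim. It assumes sparr is installed. Centring the limits on the oversmoothing bandwidth OS(pbc) and the factor of 4 are arbitrary illustrative choices, not recommendations from the package.

```r
# Minimal sketch: a cheaper search with explicit limits and a coarser grid.
# Assumes 'sparr' is installed.
library(sparr)

data(pbc)
pbccas <- split(pbc)$case
pbccon <- split(pbc)$control

# Illustrative search window around the pooled oversmoothing bandwidth
# (ASSUMPTION: the factor of 4 is arbitrary and may need adjusting)
h_ref <- OS(pbc)
h_limits <- c(h_ref / 4, 4 * h_ref)

h_fast <- LSCV.risk(pbccas, pbccon, hlim = h_limits,
                    resolution = 32, verbose = FALSE)

print(c(lower = h_limits[1], selected = h_fast, upper = h_limits[2]))
```

If the selected value sits at or very near one of the limits, the search window probably needs widening in that direction before the result is trusted.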
Author(s)
T. M. Davies
References
Davies, T.M. (2013), Jointly optimal bandwidth selection for the planar kernel-smoothed density-ratio, Spatial and Spatio-temporal Epidemiology, 5, 51-65.
Davies, T.M. and Baddeley, A. (2018), Fast computation of spatially adaptive kernel estimates, Statistics and Computing, 28(4), 937-956.
Davies, T.M. and Hazelton, M.L. (2010), Adaptive kernel estimation of spatial relative risk, Statistics in Medicine, 29(23), 2423-2437.
Davies, T.M., Jones, K. and Hazelton, M.L. (2016), Symmetric adaptive smoothing regimens for estimation of the spatial relative risk function, Computational Statistics & Data Analysis, 101, 12-28.
Hazelton, M.L. (2008), Letter to the editor: Kernel estimation of risk surfaces without the need for edge correction, Statistics in Medicine, 27, 2269-2272.
Kelsall, J.E. and Diggle, P.J. (1995), Kernel estimation of relative risk, Bernoulli, 1, 3-16.
Silverman, B.W. (1986), Density Estimation for Statistics and Data Analysis, Chapman & Hall, New York.
Wand, M.P. and Jones, M.C. (1995), Kernel Smoothing, Chapman & Hall, London.
See Also
Examples
library(sparr)
data(pbc)
# separate the marked pattern into case and control point patterns
pbccas <- split(pbc)$case
pbccon <- split(pbc)$control
# FIXED (for common h)
LSCV.risk(pbccas,pbccon)
LSCV.risk(pbccas,pbccon,method="hazelton")
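# evaluate the criterion over a bandwidth sequence instead of optimising
hcv <- LSCV.risk(pbccas,pbccon,method="davies",auto.optim=FALSE)
plot(hcv[,1],log(hcv[,2]))
# mark the bandwidth with the smallest criterion value
abline(v=hcv[which.min(hcv[,2]),1],col=2,lty=2)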
# ADAPTIVE (for common h0)
LSCV.risk(pbccas,pbccon,type="adaptive")
# change pilot bandwidths used
LSCV.risk(pbccas,pbccon,type="adaptive",hp=c(OS(pbccas)/2,OS(pbccon)/2))
# specify pooled-data symmetric relative risk estimator
LSCV.risk(pbccas,pbccon,type="adaptive",hp=OS(pbc),pilot.symmetry="pooled")
# as above, for Hazelton selector
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton")
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton",hp=c(OS(pbccas)/2,OS(pbccon)/2))
LSCV.risk(pbccas,pbccon,type="adaptive",method="hazelton",hp=OS(pbc),pilot.symmetry="pooled")