Kdhat {dbmss} | R Documentation |

Estimates the *Kd* function

```
Kdhat(X, r = NULL, ReferenceType, NeighborType = ReferenceType, Weighted = FALSE,
Original = TRUE, Approximate = ifelse(X$n < 10000, 0, 1), Adjust = 1,
MaxRange = "ThirdW", StartFromMinR = FALSE, CheckArguments = TRUE)
```

`X` |
A weighted, marked planar point pattern ( |

`r` |
A vector of distances. If |

`ReferenceType` |
One of the point types. If "", all points are considered (this is not the default value; |

`NeighborType` |
One of the point types. By default, the same as reference type. |

`Weighted` |
Logical; if |

`Original` |
Logical; if |

`Approximate` |
if not 0 (1 is a good choice), exact distances between pairs of points are rounded to 1024 times |

`Adjust` |
Force the automatically selected bandwidth (following |

`MaxRange` |
The maximum value of |

`StartFromMinR` |
Logical; if |

`CheckArguments` |
Logical; if |

*Kd* is a density, absolute measure of a point pattern structure. *Kd* is computed efficiently by building a matrix of distances between point pairs and calculating the density of their distribution (the default values of `r` are those of the `density` function). The kernel estimator is Gaussian.
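The idea of estimating the density of pairwise distances can be sketched in base R. This is only an illustration on a hypothetical toy pattern (`xy` is made up for the example), not the code used by `Kdhat`, which also handles point types, weights and edge effects.

```r
# Hypothetical toy pattern: 50 points on a regular 10 x 5 lattice
xy <- expand.grid(x = 1:10, y = 1:5)

# Distances between all pairs of points, each pair counted once
pair_dist <- as.numeric(dist(xy))

# Gaussian kernel density of the distances, evaluated at the
# 512 values chosen by density() by default, starting from 0
kd_sketch <- density(pair_dist, kernel = "gaussian", from = 0)

# First distances and estimated density values
head(cbind(r = kd_sketch$x, density = kd_sketch$y))
```

The `x` component of the result plays the role of `r` and the `y` component that of the estimated density.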

The weighted *Kd* function has been named *Kemp* (*emp* is for employees) by Duranton and Overman (2005).
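A weighted version can be sketched in the same spirit. The example below assumes that each pair of points is weighted by the product of the weights of its two points. That weighting scheme, the toy pattern `xy` and the weights `w` are assumptions made for illustration and are not taken from the `Kdhat` source code.

```r
# Hypothetical toy pattern with made-up weights (e.g. numbers of employees)
xy <- expand.grid(x = 1:6, y = 1:6)
w <- rep(c(1, 5, 10), length.out = nrow(xy))

# Distances between all pairs of points, each pair counted once
pair_dist <- as.numeric(dist(xy))

# Assumed pair weight: product of the two point weights,
# extracted in the same (lower-triangle, column-wise) order as dist()
w_outer <- outer(w, w)
pair_weight <- w_outer[lower.tri(w_outer)]
# density() expects weights that sum to 1
pair_weight <- pair_weight / sum(pair_weight)

# Bandwidth chosen beforehand (rule of thumb), then weighted Gaussian density
bw <- bw.nrd0(pair_dist)
kemp_sketch <- density(pair_dist, bw = bw, weights = pair_weight,
                       kernel = "gaussian", from = 0)
```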

If `X` is not a `Dtable` object, the maximum value of `r` is obtained from the geometry of the window. This saves a lot of computation time compared with calculating the median distance between points, as suggested by Duranton and Overman (2005).

An object of class `fv`, see `fv.object`, which can be plotted directly using `plot.fv`.

Estimating *Kd* relies on calculating distances, exactly or approximately (if `Approximate` is not 0).
Then distances are smoothed by estimating their probability density.
Reflection is used to estimate density close to the lowest distance, that is the minimum observed distance (if `StartFromMinR` is `TRUE`) or 0: all distances less than 4 times the estimation kernel bandwidth away from the lowest distance are duplicated (symmetrically with respect to the lowest distance) to avoid edge effects (underestimation of the density close to the lowest distance).
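The reflection step described above can be sketched in base R as follows. The function `reflect_density` and the toy pattern are hypothetical, and the rule-of-thumb bandwidth `bw.nrd0` is only a stand-in for the bandwidth actually selected by `Kdhat`.

```r
# Sketch of density estimation with reflection at the lowest distance
reflect_density <- function(d, lowest = 0, bw = bw.nrd0(d)) {
  # Distances less than 4 bandwidths away from the lowest distance
  near <- d[d < lowest + 4 * bw]
  # Duplicate them symmetrically with respect to the lowest distance
  augmented <- c(d, 2 * lowest - near)
  # Gaussian density of the augmented sample, bandwidth fixed beforehand
  est <- density(augmented, bw = bw, kernel = "gaussian",
                 from = lowest, to = max(d))
  # Rescale so that the original distances carry a total mass of 1
  est$y <- est$y * length(augmented) / length(d)
  est
}

# Hypothetical toy pattern: 100 points on a regular 10 x 10 lattice
xy <- expand.grid(x = 1:10, y = 1:10)
pair_dist <- as.numeric(dist(xy))
bw <- bw.nrd0(pair_dist)

# Lowest distance set to the minimum observed distance
lowest <- min(pair_dist)
reflected <- reflect_density(pair_dist, lowest = lowest, bw = bw)
# Same estimate without reflection, for comparison
plain <- density(pair_dist, bw = bw, kernel = "gaussian",
                 from = lowest, to = max(pair_dist))
```

Comparing `reflected$y` and `plain$y` at the first values of `r` shows the underestimation that reflection is meant to correct.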

Density estimation relies heavily on the bandwidth.
Starting from version 2.7, the optimal bandwidth is computed from the distribution of distances between pairs of points up to twice the maximum distance considered.
The consequence is that choosing a smaller range of distances in argument `r` results in less smoothed *Kd* values.
The default values (`r = NULL`, `MaxRange = "ThirdW"`) are such that almost all the pairs of points (except those more than 2/3 of the window diameter apart) are taken into account to determine the bandwidth.
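The effect of the range of distances on the bandwidth can be illustrated with a small base R sketch. The helper `bandwidth_for_range` is hypothetical, and the rule-of-thumb selector `bw.nrd0` is used as a stand-in, so the values are not those `Kdhat` would select.

```r
# Hypothetical toy pattern: 100 points on a regular 10 x 10 lattice
xy <- expand.grid(x = 1:10, y = 1:10)
pair_dist <- as.numeric(dist(xy))

# Bandwidth computed from the distances up to twice the
# maximum distance considered (r_max)
bandwidth_for_range <- function(d, r_max) {
  bw.nrd0(d[d <= 2 * r_max])
}

# A narrow range of distances gives a smaller bandwidth,
# hence less smoothing, than a wide one
bw_narrow <- bandwidth_for_range(pair_dist, r_max = 2)
bw_wide <- bandwidth_for_range(pair_dist, r_max = 6)
c(narrow = bw_narrow, wide = bw_wide)
```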

Duranton, G. and Overman, H. G. (2005). Testing for Localisation Using Micro-Geographic Data. *Review of Economic Studies* 72(4): 1077-1106.

Marcon, E. and Puech, F. (2017). A typology of distance-based measures of spatial concentration. *Regional Science and Urban Economics* 62: 56-67.

Scholl, T. and Brenner, T. (2015). Optimizing distance-based methods for large data sets. *Journal of Geographical Systems* 17(4): 333-351.

Sheather, S. J. and Jones, M. C. (1991). A reliable data-based bandwidth selection method for kernel density estimation. *Journal of the Royal Statistical Society, Series B* 53: 683-690.

Silverman, B. W. (1986). *Density estimation for statistics and data analysis*. Chapman and Hall, London.

```
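library("dbmss")
# Load and map the example point pattern
data(paracou16)
autoplot(paracou16)
# Calculate Kd of "V. Americana" neighbors around "Q. Rosea" reference points
# (r is not specified, so the default distances are used)
(Paracou <- Kdhat(paracou16, ReferenceType = "Q. Rosea", NeighborType = "V. Americana"))
# Plot the estimated function
autoplot(Paracou)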
```

[Package *dbmss* version 2.7-8 Index]