R: Estimating the cross J-function and testing independence

NHJ {IndTestPP}

R Documentation

Estimating the cross J-function and testing independence

Description

This function estimates the cross J-function between two sets, C and D, of (homogeneous or nonhomogeneous) point processes in time. It is evaluated on a grid of distances r and can optionally be plotted. A test to assess the independence between the sets of processes, based on the cross J-function, is also implemented.

It calls the auxiliary functions NHJaux and Jenv, which are not intended for users.

Usage

NHJ(lambdaC, lambdaD,T=NULL, Ptype="inhom", posC, typeC=1, posD, typeD=1, r=NULL,
L=NULL,test=FALSE,nTrans=100, rTest=NULL, conf=0.95, dplot=NULL, 
tit=c("J-function","D-function","F-function"),mfrow=NULL,cores=1,fixed.seed=NULL,...)

Arguments

`lambdaC`	A matrix of positive values. Each column is the intensity vector of one of the point processes in `C`. If there is only one process in `C`, it can be a vector or even a numeric value if the process is homogeneous.
`lambdaD`	A matrix of positive values. Each column is the intensity vector of one of the point processes in `D`. If there is only one process in `D`, it can be a vector or even a numeric value if the process is homogeneous.
`T`	Numeric value. Length of the observed period. It needs to be specified only if the number of rows in `lambdaC` and `lambdaD` is 1.
`Ptype`	Optional. Label: "hom" or "inhom". The first one indicates that all the point processes in sets `C` and `D` are homogeneous.
`posC`	Numeric vector. Occurrence times of the points in all the point processes in `C`.
`typeC`	Numeric vector with the same length as `posC`. Code of the point process in `C` where the points in `posC` have occurred. See Details.
`posD`	Numeric vector. Occurrence times of the points in all the point processes in `D`.
`typeD`	Numeric vector with the same length as `posD`. Code of the point process in `D` where the points in `posD` have occurred.
`r`	Optional. Numeric vector. Values at which the J-function is evaluated. If it is NULL, a default vector is used, see Details.
`L`	Optional. Numeric vector. Values in the observed period used to calculate the J-function. If it is NULL, a default vector is used, see Details.
`test`	Optional. Logical flag. If it is TRUE, a test of independence and an envelope for the J-function (at confidence level `conf`, 95% by default) are calculated.
`nTrans`	Optional. Numeric value. Only used if `test=TRUE`. Number of translations to be performed in the test and envelope calculation.
`rTest`	Optional. Numeric value. Maximum value of `r` used to calculate the independence test statistic, see Details.
`conf`	Optional. Numeric value in (0,1). Confidence level of the envelope for the J-function.
`dplot`	Optional. Label "JDF" or "J". If it is "JDF", plots of J, D and F-functions are displayed. If it is "J", only J-function is plotted.
`tit`	Optional. A vector with one or three titles to be used in the plots of J, D and F-functions.
`mfrow`	Optional. Argument to be passed to `par` for the plot of the J-function.
`cores`	Optional. Number of cores of the computer to be used in the calculations.
`fixed.seed`	An integer or NULL. If it is an integer, that is the value used to set the seed in random generation processes. If it is NULL, a random seed is used.
`...`	Further arguments to be passed to the function `plot`.

Details

The information about the processes is provided by arguments posC, the vector of all the occurrence times in the processes in C, and typeC, the vector of the code of the point process in set C where each point in posC has occurred; the second set D is characterized analogously by typeD and posD.
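As an illustration of this encoding, the following minimal sketch pools two processes of set C into the pair of vectors described above. The occurrence times and the names posC1 and posC2 are hypothetical and used only for the example.

```r
# Hypothetical occurrence times for two processes in set C
posC1 <- c(3, 17, 42)
posC2 <- c(8, 29)

# Pool the occurrence times and record the process each point belongs to
posC  <- c(posC1, posC2)
typeC <- c(rep(1, length(posC1)), rep(2, length(posC2)))

# Each process can be recovered from the pooled vectors
byProcess <- split(posC, typeC)
```

The same construction applies to posD and typeD for the processes in set D.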

This function estimates the cross J-function between two sets, C and D, of (homogeneous or nonhomogeneous) point processes in time; see Cebrian et al. (2020) for details of the estimation. The J-function measures the interpoint dependence between points in any of the processes in D and points in any of the processes in C, adjusted for time-varying intensity in the case of nonhomogeneous processes. The cross J-function is defined as J_{CD}(r)=(1-D_{CD}(r))/(1-F_D(r)) when F_D(r)<1; it is not calculated otherwise. It compares D_{CD}(r), the distribution function of the distances from a point in any of the processes in set C to the nearest point in any of the processes in set D, to F_{D}(r), the distribution function of the distances from a fixed point in the space to the nearest point in any of the processes in set D.
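The ratio in this definition can be illustrated with a minimal sketch that takes estimates of D_{CD}(r) and F_D(r) as given. The helper cross_J and the input values are hypothetical, and returning NA where F_D(r) is not below 1 is an assumption about how "not calculated" could be represented, not a description of NHJ internals.

```r
# Hypothetical helper: J = (1 - D) / (1 - F) wherever F < 1, NA elsewhere
cross_J <- function(Dr, Fr) {
  J <- rep(NA_real_, length(Fr))
  ok <- Fr < 1
  J[ok] <- (1 - Dr[ok]) / (1 - Fr[ok])
  J
}

# Made-up estimates of D_{CD}(r) and F_D(r) on a common grid of r values
Dr <- c(0.10, 0.35, 0.60, 1.00)
Fr <- c(0.10, 0.30, 0.50, 1.00)

J <- cross_J(Dr, Fr)
# J equals 1 where Dr and Fr coincide, falls below 1 where Dr > Fr,
# and is NA at the last grid point, where Fr reaches 1
```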

If argument r is NULL, the following grid is used to evaluate the function

r1<-max(20, floor(T/20))

r<-seq(1,r1,by=2)

if (length(r)>200) r<-seq(1,r1,length.out=200)

If argument L is NULL, the following grid is used

L <- seq(1, T, by = 2)

if (length(L) > 5000) L <- seq(1, T, by = round((T - 1)/199))
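To see what these defaults produce for a concrete period length, the rules above can be wrapped in a small helper. The function name default_grids is hypothetical; it only restates the expressions given in this section.

```r
# Hypothetical helper reproducing the default grids for a period of length T
default_grids <- function(T) {
  r1 <- max(20, floor(T / 20))
  r <- seq(1, r1, by = 2)
  if (length(r) > 200) r <- seq(1, r1, length.out = 200)

  L <- seq(1, T, by = 2)
  if (length(L) > 5000) L <- seq(1, T, by = round((T - 1) / 199))

  list(r = r, L = L)
}

g <- default_grids(100)
g$r  # odd values from 1 to 19, since r1 = max(20, 5) = 20
g$L  # odd values from 1 to 99
```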

Testing independence:

If the processes in C are independent of the processes in D given the marginal structure of the processes, the J-function is equal to 1, since D(r)=F(r). Hence, deviations of the estimated J(r) from 1 suggest dependence between the two sets of processes. The test statistic is based on the mean of the values |J(r)-1| evaluated on a given grid of r values.

A test based on a Lotwick-Silverman approach, see Lotwick and Silverman (1982), is implemented. This test provides a nonparametric way to test independence given the marginal intensities of the processes. Using the Lotwick-Silverman approach, not only the p-value of the test but also an envelope for the J(r) values is calculated.
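The general idea of a translation-based test can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure implemented in NHJ or Jenv: the observed period is treated as a circle, the points of D are shifted by a random amount, and a statistic is recomputed after each shift. The helpers translate_times, close_pairs and translation_test, the window h, and the p-value convention (1 + count)/(nTrans + 1) are all hypothetical, and the placeholder statistic ignores time-varying intensities.

```r
# Hypothetical helper: shift times by u, wrapping around the period [0, T_len)
translate_times <- function(pos, u, T_len) sort((pos + u) %% T_len)

# Placeholder statistic (NOT the J-based one): number of points in posC
# that have a point of posD within distance h
close_pairs <- function(posC, posD, h = 2) {
  as.numeric(sum(vapply(posC, function(x) any(abs(x - posD) <= h), logical(1))))
}

translation_test <- function(posC, posD, T_len, nTrans = 100, seed = 1) {
  set.seed(seed)
  obs <- close_pairs(posC, posD)
  shifts <- runif(nTrans, 0, T_len)
  sim <- vapply(shifts, function(u) {
    close_pairs(posC, translate_times(posD, u, T_len))
  }, numeric(1))
  # Monte Carlo p-value: large statistics are taken as evidence of dependence
  pv <- (1 + sum(sim >= obs)) / (nTrans + 1)
  list(obs = obs, sim = sim, pv = pv)
}

res <- translation_test(posC = c(5, 20, 41, 77), posD = c(6, 22, 40, 80),
                        T_len = 100, nTrans = 50)
```

The quantiles of the statistics obtained under translation play the role of the envelope: an observed value outside them is unusual under independence given the marginal structure.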

In point processes, dependence often appears between close observations, and for high values of r it is harder for the J-function to discriminate between dependent and independent processes. For this reason, the argument rTest allows the user to fix a maximum value of r, so that only J(r) estimates for r<rTest are used to calculate the test statistic. The value rTest is drawn in the plot of the J-function as a vertical grey line.
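The effect of rTest on the statistic can be illustrated with a minimal sketch. The helper J_stat and the J values are hypothetical; dropping NA values of J(r) before averaging is an assumption made for the example.

```r
# Hypothetical helper: mean of |J(r) - 1| over the grid, optionally
# restricted to r < rTest
J_stat <- function(r, J, rTest = NULL) {
  keep <- !is.na(J)
  if (!is.null(rTest)) keep <- keep & (r < rTest)
  mean(abs(J[keep] - 1))
}

# Made-up J estimates that deviate from 1 mainly at short distances
r <- seq(1, 19, by = 2)
J <- c(0.6, 0.7, 0.8, 1.0, 1.1, 1.0, 0.9, 1.0, 1.0, 1.0)

statAll  <- J_stat(r, J)             # averages over the whole grid
statTest <- J_stat(r, J, rTest = 7)  # averages over r = 1, 3, 5 only
```

In this example the restricted statistic is larger than the unrestricted one, because the values of J(r) close to 1 at high r no longer dilute the deviations observed at short distances.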

Value

A list with elements:

`r`	Vector of values `r` where the J-function is estimated.
`NHJr`	Estimated values of `J_{CD}(r)`.
`NHDr`	Estimated values of `D_{CD}(r)`.
`NHFr`	Estimated values of `F_{D}(r)`.
`JenvL`	Lower bounds of the envelope of `J_{CD}(r)`.
`JenvU`	Upper bounds of the envelope for `J_{CD}(r)`.
`JStatOb`	Observed value of the statistic.
`JStatTr`	Sample of the values of the test statistic obtained by random translations.
`pv`	P-value of the independence test.
`T`	Length of the observed period of the process.
`L`	Grid of L values to calculate the F-function.

References

Cebrian, A.C., Abaurrea, J. and Asin, J. (2020). Testing independence between two point processes in time. Journal of Simulation and Computational Statistics.

Cronie, O. and van Lieshout, M.N.M. (2015). Summary statistics for inhomogeneous marked point processes. Ann Inst Stat Math.

Lotwick, H.W. and Silverman, B.W. (1982). Methods for analysing spatial processes of several types of points. J. R. Statist. Soc. B, 44(3), pp. 406-413.

Examples

set.seed(120)
lambda1<-runif(100, 0.05, 0.1)
set.seed(121)
lambda2<-runif(100, 0.01, 0.2)
pos1<-simNHPc(lambda=lambda1,fixed.seed=123)$posNH  
pos2<-simNHPc(lambda=lambda2,fixed.seed=123)$posNH

aux<-NHJ(lambdaC=lambda1, lambdaD=lambda2, posC=pos1,nTrans=50, 
	 posD=pos2, rTest=7, dplot='J', cores=1,test=TRUE)
aux$pv

#Sets with two processes
#pos3<-simNHPc(lambda=lambda1,fixed.seed=300)$posNH  
#pos4<-simNHPc(lambda=lambda2,fixed.seed=30)$posNH 
#aux<-NHJ(lambdaC=cbind(lambda1,lambda2), lambdaD=cbind(lambda1,lambda2), 
#	posC=c(pos1,pos2), typeC=c(rep(1, length(pos1)), rep(2, length(pos2))), 
#	posD=c(pos3, pos4), typeD=c(rep(1, length(pos3)), rep(2, length(pos4))), 
#	dplot='J', test=TRUE)
#aux$pv

[Package IndTestPP version 3.0 Index]