nsinc.z {colocalization} R Documentation

Colocalization index of z-type

Description

nsinc.z calculates a colocalization index for a whole image: the Pearson correlation coefficient of the signal proportions of two channels, normalized as z-scores under complete spatial randomness (CSR), within a specified proximity of all signals or of all signals of the type of interest. If a range of proximity sizes is considered, nsinc.z averages the index values over that range. For multiple-species data, the index values of all pairs of channels are averaged at each proximity size to give the index for the image at that neighborhood size.

Usage

nsinc.z(data, membership, dim = 2, r.min = NULL,
        r.max = NULL, r.count = NULL, r.adjust = NULL,
        box = NULL, edge.effect = TRUE, strata = FALSE,
        base.member = NULL, r.model = "full", ...)

Arguments

data

a data frame (or object coercible by as.data.frame to a data frame) containing at least the columns membership and x (xc, X or Xc), y (yc, Y or Yc) if dim = 2 and x (xc, X or Xc), y (yc, Y or Yc), z (zc, Z or Zc) if dim = 3.

membership

a string giving the name of the column in data that holds the membership of the data points. The membership must have at least 2 levels.

dim

an integer, either 2 or 3. If dim = 2, the data are treated as two-dimensional; if dim = 3, the data are treated as three-dimensional.

r.min

the minimum proximity size that the user identifies as colocalization of signals. It should be numeric. If r.model = "full", the function will automatically choose the smallest inter-point distance as the r.min; if r.model = "r.med", the function will use the median inter-point distance for both r.min and r.max; if r.model = "other", the user must specify r.min, which should be no larger than r.max.

r.max

the maximum proximity size that the user identifies as colocalization of signals. It should be numeric. If r.model = "full", the function will automatically choose half of the largest inter-point distance as the r.max; if r.model = "r.med", the function will use the median inter-point distance for both r.min and r.max; if r.model = "other", the user must specify r.max, which should be between the smallest and the largest inter-point distances and no smaller than r.min.

r.count

the number of proximity sizes in the series between r.min and r.max. If r.max = r.min or r.adjust = (r.max - r.min)/2, then r.count = 1; otherwise r.count = 30 by default, unless the user specifies another value.

r.adjust

a very small adjustment to r.min and r.max, so that the series of proximity sizes runs from r.min + r.adjust to r.max - r.adjust. This avoids a zero standard deviation of the normalized proportions of signals at extremely small and extremely large r. The value of r.adjust depends on the choice of r.model and on the values of r.min and r.max. For most scenarios, it is suggested to use r.adjust = NULL and let the function choose the default value. In general, the default is either r.adjust = 0 or r.adjust = (r.max - r.min)/(r.count + 1); otherwise it is a positive number specified by the user satisfying r.adjust <= (r.max - r.min)/2.

box

a one-row data frame describing the study region which must contain columns xmin, xmax, ymin, ymax if dim = 2 and additionally zmin, zmax if dim = 3. If box = NULL, the function will detect the smallest box containing all data points and add a buffer edge in each dimension which is equal to the median of nearest neighbor distances in that dimension. If box is specified by the user, only the data enclosed in the specified box will be considered in the analysis and signals outside the box will be ignored.

edge.effect

a logical value indicating whether the edge effect should be corrected. The default is TRUE; without the correction the results are not accurate.

strata

a logical value indicating whether colocalization is treated as single-direction or bi-direction. The default, strata = FALSE, gives bi-direction colocalization, in which the proximity regions around all signals are considered. If strata = TRUE, colocalization is single-direction: only the proximity regions around signals in the base membership are considered. The base membership is given by base.member; if it is not specified, the first membership that R detects in the membership column is used.

base.member

one level of the membership that is designated as the base. It is used only when strata = TRUE. If strata = TRUE and no base.member is specified by the user, the first membership that R detects in the membership column is used as base.member.

r.model

one of "full", "r.med" or "other". r.model determines the range of proximity sizes that defines colocalization. "full" or "r.med" can be used if the user has no specific proximity size in mind. In the "full" model, the proximity sizes range from the smallest inter-point distance to half of the largest inter-point distance; in the "r.med" model, the single proximity size is the median of the inter-point distances; in the "other" model, the user defines research-driven proximity sizes by specifying r.min and r.max.

...

Parameters passed to cor. The user could choose methods other than Pearson for calculating correlation.
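
The r-related arguments above can be illustrated with a minimal base R sketch. It only mirrors the documented rules (smallest, median and half of the largest inter-point distance; a series between r.min + r.adjust and r.max - r.adjust); it is not the package's code. The even spacing of the series and the object names pts, d and r.series are assumptions made for illustration.

```r
# Sketch only: mirrors the documented rules, not the internals of nsinc.z.
set.seed(1)
pts <- data.frame(x = runif(100), y = runif(100))

# All pairwise Euclidean inter-point distances (base R).
d <- as.vector(dist(pts[, c("x", "y")]))

r.full.min <- min(d)       # r.model = "full": smallest inter-point distance
r.full.max <- max(d) / 2   # r.model = "full": half of the largest distance
r.med      <- median(d)    # r.model = "r.med": used for both r.min and r.max

# r.model = "other": user-chosen range, shrunk inward by r.adjust at both ends.
r.min <- 0.01; r.max <- 0.5; r.count <- 5; r.adjust <- 0
stopifnot(r.adjust <= (r.max - r.min) / 2)

# Assumption: the series is evenly spaced between the adjusted end points.
r.series <- seq(r.min + r.adjust, r.max - r.adjust, length.out = r.count)
print(r.series)
```

With r.adjust = 0 the series starts at r.min and ends at r.max; a positive r.adjust moves both end points inward by that amount.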

Details

For every signal (or every base signal if strata = TRUE is specified), the function calculates the proportion of each type of signal in a specified r neighborhood, corrected for the edge effect and normalized by a z-score under CSR. It then obtains the Pearson correlation coefficient of each pair of channels and averages the coefficients over all pairs at each r in the r series between r.min and r.max. For multiple-species data, this average over all pairs is the index for the image at that neighborhood size. The index for the whole image is named NSInCz, or NSInC of type z. The index will be close to 1 if signals are colocalized, 0 if they are randomly distributed and -1 if they are dispersed. The function can handle 2D or 3D data.
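
To make the idea concrete, here is a rough base R sketch for a single r and two channels. It is an illustration under loud assumptions, not the formula used by nsinc.z: it works with neighbor counts rather than proportions, ignores the edge effect, and uses a Poisson-style z-score with the CSR expectation n * pi * r^2 / area. The helper z.score and all object names are hypothetical.

```r
# Sketch only: a simplified stand-in for the z-type normalization.
set.seed(1234)
red   <- data.frame(x = runif(300, -1, 1), y = runif(300, -1, 1), color = "red")
green <- data.frame(x = runif(50, -1, 1),  y = runif(50, -1, 1),  color = "green")
pts   <- rbind(red, green)

r    <- 0.2
area <- 4                          # area of the [-1, 1] x [-1, 1] region
D    <- as.matrix(dist(pts[, c("x", "y")]))
near <- D <= r
diag(near) <- FALSE                # a signal is not its own neighbor

# Hypothetical helper: z-score of the neighbor count of one membership.
z.score <- function(member) {
  is.m   <- pts$color == member
  counts <- rowSums(near[, is.m, drop = FALSE])
  # Assumption: expected count under CSR, with no edge correction.
  lambda <- sum(is.m) * pi * r^2 / area
  (counts - lambda) / sqrt(lambda)
}

z.red   <- z.score("red")
z.green <- z.score("green")

# Pearson correlation across all signals (the bi-direction case).
index.r <- cor(z.red, z.green, method = "pearson")
print(index.r)
```

Because the expected count in this simplified version is the same for every signal, the correlation equals that of the raw counts; a per-signal expectation, as an edge correction would require, is not attempted here.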

If users have a specific proximity size in mind, they are encouraged to specify r.model = "other" together with values of r.min and r.max.
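
As a hedged illustration, the three r.model settings could be compared on the same simulated data as follows. The sketch uses only arguments documented on this page; the values r.min = 0.05, r.max = 0.3 and r.count = 10 are arbitrary choices for illustration, and the code is skipped if the colocalization package is not installed.

```r
# Sketch only: compares the index under the three documented r.model choices.
if (requireNamespace("colocalization", quietly = TRUE)) {
  library(colocalization)

  set.seed(1234)
  mydata <- rbind(
    data.frame(x = runif(300, -1, 1), y = runif(300, -1, 1), color = "red"),
    data.frame(x = runif(50, -1, 1),  y = runif(50, -1, 1),  color = "green")
  )

  # Data-driven ranges: no r.min or r.max needed.
  fit.full <- nsinc.z(data = mydata, membership = "color", dim = 2,
                      r.model = "full")
  fit.med  <- nsinc.z(data = mydata, membership = "color", dim = 2,
                      r.model = "r.med")

  # Research-driven range: r.min and r.max chosen by the user (arbitrary here).
  fit.user <- nsinc.z(data = mydata, membership = "color", dim = 2,
                      r.model = "other", r.min = 0.05, r.max = 0.3,
                      r.count = 10)

  print(fit.full$index)
  print(fit.med$index)
  print(fit.user$index)
}
```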

nsinc.z differs from nsinc.d in how the signal proportions are normalized. Under CSR, the z-type normalization has no heterogeneity caused by edge effects related to the locations of the signals. In many cases, nsinc.d and nsinc.z give similar results. However, if the user's proximity of interest is larger than half of the largest inter-point distance, nsinc.d is suggested.

Value

nsinc.z returns the colocalization index values at each proximity size r, the average colocalization index across all r's, the data from which the index is calculated, the study region (i.e., the carrying box), the original and normalized proportions of each type of signal in an r neighborhood of all (base) signals, the r series, and some summary information:

method

"nsinc.z"

input.data.summary

a list containing the number of membership levels and the signal counts in each channel or membership of the input data.

post.data.summary

a list showing the number of membership levels and the signal counts in each channel of the data after removal of signals located outside the box specified by the user. If no signals are excluded, post.data.summary presents the same results as input.data.summary.

r.summary

a data frame listing the r.min, r.max, r.count, r.adjust used in the calculation and the r.model specified by the user or the default. r.summary also gives the r range for the default full model, i.e., the minimum and half of the maximum of the inter-point distance of all signals, and the median value in addition.

strata

a list showing the default setting of strata or the specified strata by the user. It also presents the base membership used in the function if strata is TRUE.

edge.effect

a data frame containing a logical value indicating whether edge effect is corrected or not.

index.all

a data frame showing the colocalization index of z-type at each r.

index

the averaged colocalization index of z-type across all r's.

post.data

a data frame representing the data after removal of signals located outside the box specified by the user. If no signals are excluded, post.data contains the same observations as data.

study.region

the carrying box with the size of buffer width in each dimension.

P.all

the data frame showing all original and normalized proportions of each type of signals in an r-neighborhood around every (base) signal. Rows are (base) signals and columns are all memberships and r's.

r

the r series for which the colocalization indices are calculated.
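
The components listed above can be inspected as in the following sketch, which runs a single-direction analysis (strata = TRUE with "red" as the base membership) on simulated data. It relies only on the arguments and component names documented on this page, makes no claim about the values printed, and is skipped if the colocalization package is not installed.

```r
# Sketch only: inspect the documented components of the returned object.
if (requireNamespace("colocalization", quietly = TRUE)) {
  library(colocalization)

  set.seed(1234)
  mydata <- rbind(
    data.frame(x = runif(300, -1, 1), y = runif(300, -1, 1), color = "red"),
    data.frame(x = runif(50, -1, 1),  y = runif(50, -1, 1),  color = "green")
  )

  # Single-direction colocalization around the "red" signals only.
  res <- nsinc.z(data = mydata, membership = "color", dim = 2,
                 r.model = "other", r.min = 0.01, r.max = 0.5,
                 r.count = 5, r.adjust = 0,
                 strata = TRUE, base.member = "red")

  print(names(res))      # all returned components
  print(res$r.summary)   # r.min, r.max, r.count, r.adjust and r.model used
  print(res$strata)      # strata setting and base membership
  print(res$r)           # the r series
  print(res$index.all)   # index at each r
  print(res$index)       # index averaged across all r's
}
```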

Author(s)

Xueyan Liu, Jiahui Xu, Cheng Cheng, Hui Zhang.

References

Liu, X., Xu, J., Guy C., Romero E., Green D., Cheng, C., Zhang, H. (2019). Unbiased and Robust Analysis of Co-localization in Super-resolution Images. Manuscript submitted for publication.

Examples

## A simulated 2D example.
library(colocalization)
set.seed(1234)
x <- runif(300, min = -1, max = 1)
y <- runif(300, min = -1, max = 1)
red <- data.frame(x,y, color = "red")
x <- runif(50, min = -1, max = 1)
y <- runif(50, min = -1, max = 1)
green <- data.frame(x,y, color = "green")

mydata <- rbind(red,green)
## as.character() keeps the color names even if color is stored as a factor
plot(mydata$x, mydata$y, col = as.character(mydata$color))

mydata.results <- nsinc.z(data = mydata, membership = "color", dim = 2,
                  r.model = "other", r.min = 0.01, r.max = 0.5, r.count = 5, r.adjust = 0)

mydata.results$index.all
mydata.results$index


## A simulated 3D example.
data("twolines")


library("rgl")
plot3d(twolines[,c("x","y","z")], type='s', size=0.7, col = twolines$membership)
aspect3d("iso")

twolines.results <- nsinc.z(data = twolines, membership = "membership",
                            dim = 3, r.model = "full")

twolines.results$index


[Package colocalization version 1.0.2 Index]