R: Group Points

group_pts {spatsoc}

R Documentation

Group Points

Description

group_pts groups rows into spatial groups. The function accepts a data.table with relocation data, individual identifiers and a threshold argument. The threshold argument is used to specify the criteria for distance between points which defines a group. Relocation data should be in two columns representing the X and Y coordinates.

Usage

group_pts(
  DT = NULL,
  threshold = NULL,
  id = NULL,
  coords = NULL,
  timegroup,
  splitBy = NULL
)

Arguments

`DT`	input data.table
`threshold`	distance for grouping points, in the units of the coordinates
`id`	Character string of ID column name
`coords`	Character vector of X coordinate and Y coordinate column names
`timegroup`	timegroup field in the DT within which the grouping will be calculated
`splitBy`	(optional) character string or vector of grouping column name(s) upon which the grouping will be calculated

Details

The DT must be a data.table. If your data is a data.frame, you can convert it by reference using data.table::setDT.

The id, coords, timegroup (and optional splitBy) arguments expect the names of a column in DT which correspond to the individual identifier, X and Y coordinates, timegroup (typically generated by group_times) and additional grouping columns.

The threshold must be provided in the units of the coordinates. The threshold must be larger than 0. The coordinates must be planar coordinates (e.g.: UTM). In the case of UTM, a threshold = 50 would indicate a 50m distance threshold.

The timegroup argument is required to define the temporal groups within which spatial groups are calculated. The intended framework is to group rows temporally with group_times then spatially with group_pts (or group_lines, group_polys). If you have already calculated temporal groups without group_times, you can pass this column to the timegroup argument. Note that the expectation is that each individual will be observed only once per timegroup. Caution that accidentally including huge numbers of rows within timegroups can overload your machine since all pairwise distances are calculated within each timegroup.

The splitBy argument offers further control over grouping. If within your DT, you have multiple populations, subgroups or other distinct parts, you can provide the name of the column which identifies them to splitBy. The grouping performed by group_pts will only consider rows within each splitBy subgroup.

Value

group_pts returns the input DT appended with a group column.

This column represents the spatialtemporal group. As with the other grouping functions, the actual value of group is arbitrary and represents the identity of a given group where 1 or more individuals are assigned to a group. If the data was reordered, the group may change, but the contents of each group would not.

A message is returned when a column named group already exists in the input DT, because it will be overwritten.

Examples

# Load data.table
library(data.table)


# Read example data
DT <- fread(system.file("extdata", "DT.csv", package = "spatsoc"))

# Select only individuals A, B, C for this example
DT <- DT[ID %in% c('A', 'B', 'C')]

# Cast the character column to POSIXct
DT[, datetime := as.POSIXct(datetime, tz = 'UTC')]

# Temporal grouping
group_times(DT, datetime = 'datetime', threshold = '20 minutes')

# Spatial grouping with timegroup
group_pts(DT, threshold = 5, id = 'ID',
          coords = c('X', 'Y'), timegroup = 'timegroup')

# Spatial grouping with timegroup and splitBy on population
group_pts(DT, threshold = 5, id = 'ID', coords = c('X', 'Y'),
         timegroup = 'timegroup', splitBy = 'population')

[Package spatsoc version 0.2.2 Index]