| distancePairedSamples {distantia} | R Documentation | 
Computes distance among pairs of aligned samples in two or more multivariate time-series.
Description
Computes the distance (one of: "manhattan", "euclidean", "chi", or "hellinger") between pairs of aligned samples (same order/depth/age) in two or more multivariate time-series.
Usage
distancePairedSamples(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  same.time = FALSE,
  method = "manhattan",
  sum.distances = FALSE,
  parallel.execution = TRUE
  )
Arguments
| sequences | dataframe with multiple sequences identified by a grouping column. Generally the output of  | 
| grouping.column | character string, name of the column in  | 
| time.column | character string, name of the column with time/depth/rank data. The data in this column is not modified. | 
| exclude.columns | character string or character vector with column names in  | 
| same.time | boolean. If  | 
| method | character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error. | 
| sum.distances | boolean, if  | 
| parallel.execution | boolean, if  | 
Details
Distances are computed as:
-  manhattan: d <- sum(abs(x - y))
-  euclidean: d <- sqrt(sum((x - y)^2))
-  chi: xy <- x + y; y. <- y / sum(y); x. <- x / sum(x); d <- sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
-  hellinger: d <- sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
Note that zeroes are replaced by 0.00001 when method equals "chi" or "hellinger".
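The expressions above can be tried on two small vectors with a minimal base R sketch. The helper name sample_distance and the toy vectors are hypothetical and are not part of distantia. The sketch only transcribes the formulas as printed in this section, including the zero replacement for "chi" and "hellinger". In the hellinger line as printed, the square applies to the summed differences rather than to each difference.

```r
# Illustrative helper (hypothetical name, not a distantia function):
# distance between two numeric vectors x and y of equal length.
sample_distance <- function(x, y, method = "manhattan") {
  # invalid method names throw an error
  method <- match.arg(method, c("manhattan", "euclidean", "chi", "hellinger"))

  # zeroes are replaced by 0.00001 for "chi" and "hellinger", as noted above
  if (method %in% c("chi", "hellinger")) {
    x[x == 0] <- 0.00001
    y[y == 0] <- 0.00001
  }

  switch(
    method,
    manhattan = sum(abs(x - y)),
    euclidean = sqrt(sum((x - y)^2)),
    chi = {
      xy <- x + y
      y. <- y / sum(y)
      x. <- x / sum(x)
      sqrt(sum(((x. - y.)^2) / (xy / sum(xy))))
    },
    # transcribed exactly as written in Details
    hellinger = sqrt(1/2 * sum(sqrt(x) - sqrt(y))^2)
  )
}

# toy vectors (assumed values, for illustration only)
x <- c(1, 2, 3)
y <- c(2, 4, 6)

sample_distance(x, y, "manhattan")  # 6
sample_distance(x, y, "euclidean")  # sqrt(14)
sample_distance(x, y, "chi")        # 0, since x and y have identical proportions
```

Because "chi" compares proportions, two vectors that differ only by a constant factor are at distance zero, while "manhattan" and "euclidean" are sensitive to absolute magnitudes.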
Value
A list with named slots (names of the sequences separated by a vertical line, as in "A|B") containing numeric vectors with the distance between paired samples of every possible combination of sequences according to grouping.column.
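The shape of this output can be mimicked in base R. The sketch below is not the package's implementation: the data frame, its column names (id, time, v1, v2) and its values are made up, and only the manhattan distance is shown. It builds a list with one slot per pair of sequences, named with a vertical line, holding one distance per pair of aligned samples.

```r
# Toy data (hypothetical values): two aligned sequences, "A" and "B",
# with three samples each and two numeric variables.
sequences <- data.frame(
  id   = rep(c("A", "B"), each = 3),
  time = rep(1:3, times = 2),
  v1   = c(1, 2, 3, 2, 2, 5),
  v2   = c(4, 5, 6, 4, 7, 6)
)

# one numeric matrix per sequence, without the grouping and time columns
by_sequence <- lapply(
  split(sequences[, c("v1", "v2")], sequences$id),
  as.matrix
)

# every possible pair of sequences
pairs <- combn(names(by_sequence), 2, simplify = FALSE)

# manhattan distance between samples in the same position of each pair
paired_distances <- lapply(pairs, function(p) {
  unname(rowSums(abs(by_sequence[[p[1]]] - by_sequence[[p[2]]])))
})

# slot names follow the "A|B" convention
names(paired_distances) <- vapply(pairs, paste, character(1), collapse = "|")

names(paired_distances)    # "A|B"
paired_distances[["A|B"]]  # 1 2 2
```

A single total per pair of sequences can then be obtained with sapply(paired_distances, sum).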
Author(s)
Blas Benito <blasbenito@gmail.com>
See Also
Examples
#loading the package and the example data
library(distantia)
data(climate)
#preparing sequences
#notice the argument paired.samples
climate.prepared <- prepareSequences(
  sequences = climate,
  grouping.column = "sequenceId",
  time.column = "time",
  paired.samples = TRUE
  )
#compute pairwise distances between paired samples
climate.prepared.distances <- distancePairedSamples(
  sequences = climate.prepared,
  grouping.column = "sequenceId",
  time.column = "time",
  exclude.columns = NULL,
  method = "manhattan",
  sum.distances = FALSE,
  parallel.execution = FALSE
  )