workflowSlotting {distantia}

R Documentation

Slots two sequences into a single composite sequence.

Description

Generates a composite sequence from two input sequences, constrained by sample order, by minimizing the dissimilarity between adjacent samples of each input sequence. The algorithm computes the distance matrix, least-cost matrix, and least-cost path of the two sequences, and uses the least-cost path to find the slotting that best minimizes the dissimilarity between adjacent samples. The algorithm assumes that the samples are not aligned or paired.
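
The steps named above (distance matrix, least-cost matrix, least-cost path, slotting) can be sketched in base R. This is an illustrative sketch only, not the code used by the package: the function name `slot_sketch`, the restriction to orthogonal moves, the fixed Manhattan distance, and the choice to place the first sample of `a` before the first sample of `b` are all assumptions made for the example.

```r
# Hypothetical helper, not part of distantia.
# a, b: numeric matrices with samples in rows and the same columns.
slot_sketch <- function(a, b) {
  n <- nrow(a)
  m <- nrow(b)

  # distance matrix: Manhattan distance between every pair of samples
  d <- matrix(0, n, m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      d[i, j] <- sum(abs(a[i, ] - b[j, ]))
    }
  }

  # least-cost matrix: cheapest accumulated distance to reach each cell
  # from cell (1, 1), moving one row or one column at a time
  cost <- matrix(Inf, n, m)
  for (i in seq_len(n)) {
    for (j in seq_len(m)) {
      if (i == 1 && j == 1) {
        cost[i, j] <- d[i, j]
      } else {
        up <- if (i > 1) cost[i - 1, j] else Inf
        left <- if (j > 1) cost[i, j - 1] else Inf
        cost[i, j] <- d[i, j] + min(up, left)
      }
    }
  }

  # least-cost path: backtrack from the last cell to the first,
  # recording which sequence advanced at each step
  i <- n
  j <- m
  steps <- character(0)
  while (i > 1 || j > 1) {
    up <- if (i > 1) cost[i - 1, j] else Inf
    left <- if (j > 1) cost[i, j - 1] else Inf
    if (up <= left) {
      steps <- c(paste0("a", i), steps)
      i <- i - 1
    } else {
      steps <- c(paste0("b", j), steps)
      j <- j - 1
    }
  }

  # slotting: first samples (arbitrary order), then the recorded steps
  c("a1", "b1", steps)
}

a <- matrix(c(1, 4), ncol = 1)
b <- matrix(c(2, 5), ncol = 1)
slot_sketch(a, b)
# "a1" "b1" "a2" "b2"
```

Because every step of the path advances exactly one of the two sequences, each sample appears once in the result and the original order within each sequence is preserved.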

Usage

workflowSlotting(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  plot = TRUE
  )

Arguments

`sequences`	dataframe generated by `prepareSequences`, containing two sequences identified by a grouping column.
`grouping.column`	character string, name of the column in `sequences` used to identify the separate sequences within the dataframe.
`time.column`	character string, name of the column with time/depth/rank data.
`exclude.columns`	character string or character vector with column names in `sequences` to be excluded from the analysis.
`method`	character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error.
`plot`	boolean, if `TRUE`, plots the distance matrix and the least-cost path.
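
As a rough illustration of what the `method` argument selects, the two metrics that base R's `dist` also provides, "manhattan" and "euclidean", can be written out by hand for a pair of samples. The vectors `x` and `y` are made-up values, and this sketch makes no claim about how the package computes these distances internally; "chi" and "hellinger" are left out because their exact form is not given on this page.

```r
# Two made-up samples with two variables each (assumption for illustration)
x <- c(1, 2)
y <- c(4, 6)

# Manhattan: sum of absolute differences
manhattan <- sum(abs(x - y))

# Euclidean: square root of the sum of squared differences
euclidean <- sqrt(sum((x - y)^2))

# The same two distances from base R's dist()
d_manhattan <- as.numeric(dist(rbind(x, y), method = "manhattan"))
d_euclidean <- as.numeric(dist(rbind(x, y), method = "euclidean"))

manhattan  # 7
euclidean  # 5
```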

Value

A dataframe with the same number of rows as `sequences`, ordered according to the best solution found by the least-cost algorithm.

Author(s)

Blas Benito <blasbenito@gmail.com>

Examples



#loading the data
data(pollenGP)

#getting first 20 samples
pollenGP <- pollenGP[1:20, ]

#sampling indices
set.seed(10) #to get same result every time
sampling.indices <- sort(sample(1:20, 10))

#subsetting the sequence
A <- pollenGP[sampling.indices, ]
B <- pollenGP[-sampling.indices, ]

#preparing the sequences
AB <- prepareSequences(
  sequence.A = A,
  sequence.A.name = "A",
  sequence.B = B,
  sequence.B.name = "B",
  grouping.column = "id",
  exclude.columns = c("depth", "age"),
  transformation = "hellinger"
  )

#slotting the samples of A and B into a single composite sequence
AB.combined <- workflowSlotting(
  sequences = AB,
  grouping.column = "id",
  time.column = "age",
  exclude.columns = "depth",
  method = "manhattan",
  plot = TRUE
  )

#printing the composite sequence
AB.combined

[Package distantia version 1.0.2 Index]