workflowSlotting {distantia}R Documentation

Slots two sequences into a single composite sequence.

Description

Generates a composite sequence, constrained by sample order, from two sequences, by minimizing the dissimilarity between adjacent samples of each input sequence. The algorithm computes the distance matrix, least cost matrix, and least cost path of two sequences, and uses the least cost path file to find the slotting that better minimizes the dissimilarity between adjacent samples. The algorithm assumes that the samples are not aligned or paired.

Usage

workflowSlotting(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  method = "manhattan",
  plot = TRUE
  )

Arguments

sequences

dataframe with two sequences identified by a grouping column generated by prepareSequences.

grouping.column

character string, name of the column in sequences to be used to identify separates sequences within the file.

time.column

character string, name of the column with time/depth/rank data.

exclude.columns

character string or character vector with column names in sequences to be excluded from the analysis.

method

character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error.

plot

boolean, if TRUE, plots the distance matrix and the least-cost path.

Value

A dataframe with the same number of rows as sequences, ordered according to the best solution found by the least-cost algorithm.

Author(s)

Blas Benito <blasbenito@gmail.com>

Examples



#loading the data
data(pollenGP)

#getting first 20 samples
pollenGP <- pollenGP[1:20, ]

#sampling indices
set.seed(10) #to get same result every time
sampling.indices <- sort(sample(1:20, 10))

#subsetting the sequence
A <- pollenGP[sampling.indices, ]
B <- pollenGP[-sampling.indices, ]

#preparing the sequences
AB <- prepareSequences(
  sequence.A = A,
  sequence.A.name = "A",
  sequence.B = B,
  sequence.B.name = "B",
  grouping.column = "id",
  exclude.columns = c("depth", "age"),
  transformation = "hellinger"
  )

AB.combined <- workflowSlotting(
  sequences = AB,
  grouping.column = "id",
  time.column = "age",
  exclude.columns = "depth",
  method = "manhattan",
  plot = TRUE
  )

AB.combined




[Package distantia version 1.0.2 Index]