workflowPsiHP {distantia} | R Documentation |
A refactored version of workflowPsi with higher performance (hence the suffix HP).
Description
Ideal for large analyses with hundreds to thousands of sequences. Several options available in workflowPsi have been removed from this function in order to simplify the code as much as possible. Psi is computed with the options diagonal = TRUE, ignore.blocks = TRUE, and method = "euclidean".
Usage
workflowPsiHP(
sequences = NULL,
grouping.column = NULL,
time.column = NULL,
exclude.columns = NULL,
parallel.execution = TRUE
)
Arguments
sequences |
dataframe with multiple sequences identified by a grouping column generated by |
grouping.column |
character string, name of the column in |
time.column |
character string, name of the column with time/depth/rank data. |
exclude.columns |
character string or character vector with column names in |
parallel.execution |
boolean, if |
Details
Due to limitations of the function permutations, the maximum number of groups (according to grouping.column) is around 30000. In addition, a combinations table of this size takes roughly 7 GB of memory.
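The memory figure above can be sanity-checked with a back-of-the-envelope calculation. The sketch below is only an estimate, not the function's internal code. It assumes that the combinations table holds one row per unordered pair of groups and two columns stored as 8-byte doubles. The variable names are illustrative.

# number of groups at the stated limit
n.groups <- 30000
# one row per unordered pair of groups (assumption)
n.pairs <- choose(n.groups, 2)
# two columns of 8-byte doubles per row (assumption)
table.bytes <- n.pairs * 2 * 8
# size in decimal gigabytes
table.gb <- table.bytes / 1e9
n.pairs
table.gb

Under these assumptions the table has about 450 million rows, which is consistent with the rough figure given above. Because the number of pairs grows quadratically with the number of groups, halving the number of groups cuts the estimate to about a quarter.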
Value
A dataframe with sequence names and psi values.
Author(s)
Blas Benito <blasbenito@gmail.com>
Examples
data("sequencesMIS")

#prepare sequences
MIS.sequences <- prepareSequences(
  sequences = sequencesMIS[sequencesMIS$MIS %in% c("MIS-1", "MIS-2"), ],
  grouping.column = "MIS",
  if.empty.cases = "zero",
  transformation = "hellinger"
)

#execute workflow to compute psi
MIS.psi <- workflowPsiHP(
  sequences = MIS.sequences,
  grouping.column = "MIS",
  parallel.execution = FALSE
)

MIS.psi