workflowPsiHP {distantia} | R Documentation |
Version of workflowPsi with higher performance (hence the suffix HP). Ideal for large analyses with hundreds to thousands of sequences. Several options available in workflowPsi have been removed from this function to keep the code as simple as possible. Psi is computed with the options diagonal = TRUE, ignore.blocks = TRUE, and method = "euclidean".
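To illustrate what method = "euclidean" refers to, here is a minimal base R sketch of a Euclidean distance matrix between the samples of two toy sequences. The data and the helper euclidean_matrix are made up for illustration and are not part of distantia; the package's internal implementation may differ.

```r
# Toy sequences (made-up values): rows are samples, columns are variables
a <- matrix(c(0, 0,
              3, 4), ncol = 2, byrow = TRUE)
b <- matrix(c(0, 0,
              6, 8,
              3, 4), ncol = 2, byrow = TRUE)

# Hypothetical helper: Euclidean distance between every sample of a
# (rows of the result) and every sample of b (columns of the result)
euclidean_matrix <- function(a, b) {
  d <- matrix(NA_real_, nrow = nrow(a), ncol = nrow(b))
  for (i in seq_len(nrow(a))) {
    for (j in seq_len(nrow(b))) {
      d[i, j] <- sqrt(sum((a[i, ] - b[j, ])^2))
    }
  }
  d
}

d.ab <- euclidean_matrix(a, b)
d.ab
```

Each cell of d.ab holds the distance between one sample of the first sequence and one sample of the second.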
workflowPsiHP(
  sequences = NULL,
  grouping.column = NULL,
  time.column = NULL,
  exclude.columns = NULL,
  parallel.execution = TRUE
)
sequences |
dataframe with multiple sequences identified by a grouping column generated by |
grouping.column |
character string, name of the column in |
time.column |
character string, name of the column with time/depth/rank data. |
exclude.columns |
character string or character vector with column names in |
parallel.execution |
boolean, if |
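As a rough illustration of the data layout these arguments describe, the base R sketch below builds a toy data frame with several sequences stacked in long format. All column names and values are hypothetical, and the assumption that only the remaining columns enter the dissimilarity computation is mine, not something this page states.

```r
# Made-up data: two sequences ("A" and "B") stacked in one data frame
toy.sequences <- data.frame(
  id = c("A", "A", "A", "B", "B"),     # hypothetical grouping column
  age = c(1, 2, 3, 1, 2),              # hypothetical time column
  notes = c("x", "y", "z", "x", "y"),  # hypothetical column to exclude
  var1 = c(0.2, 0.4, 0.1, 0.3, 0.5),
  var2 = c(0.8, 0.6, 0.9, 0.7, 0.5)
)

# Number of samples in each sequence, according to the grouping column
samples.per.sequence <- table(toy.sequences$id)

# Columns left after setting aside the grouping, time, and excluded columns
# (assumption: these are the variables that would be compared)
value.columns <- setdiff(colnames(toy.sequences), c("id", "age", "notes"))
value.columns
```

With a layout like this, the corresponding arguments would be grouping.column = "id", time.column = "age", and exclude.columns = "notes".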
Due to limitations of the function permutations, the maximum number of groups (according to grouping.column) is around 30000. In addition, a combinations table of this size takes roughly 7 GB of memory.
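The memory figure above can be sanity-checked with a back-of-the-envelope calculation. The sketch below assumes the combinations table holds one row per unordered pair of groups, two columns, and 8 bytes per cell; these are assumptions rather than details confirmed by the package, and estimate_pairs_memory is a hypothetical helper.

```r
# Hypothetical helper (not part of distantia): number of unordered pairs of
# groups and approximate size of a two-column table listing those pairs
estimate_pairs_memory <- function(n.groups, bytes.per.cell = 8) {
  n.pairs <- choose(n.groups, 2)
  bytes <- n.pairs * 2 * bytes.per.cell
  c(pairs = n.pairs, gigabytes = bytes / 1e9)
}

# Small case: the pairs of three groups, listed explicitly
small.pairs <- t(utils::combn(c("A", "B", "C"), 2))
nrow(small.pairs) == choose(3, 2)

# Estimate for 30000 groups
estimate.30000 <- estimate_pairs_memory(30000)
estimate.30000
```

Under these assumptions, 30000 groups yield about 450 million pairs and a table of roughly 7.2 GB, which is in line with the figure quoted above.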
A dataframe with sequence names and psi values.
Blas Benito <blasbenito@gmail.com>
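# load the package and the example dataset
library(distantia)
data("sequencesMIS")

# prepare sequences: keep two groups, fill empty cases with zeros,
# and apply the Hellinger transformation
MIS.sequences <- prepareSequences(
  sequences = sequencesMIS[sequencesMIS$MIS %in% c("MIS-1", "MIS-2"), ],
  grouping.column = "MIS",
  if.empty.cases = "zero",
  transformation = "hellinger"
)

# execute workflow to compute psi (without parallel execution)
MIS.psi <- workflowPsiHP(
  sequences = MIS.sequences,
  grouping.column = "MIS",
  parallel.execution = FALSE
)

# dataframe with sequence names and psi values
MIS.psi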