workflowTransfer {distantia} | R Documentation |
Transfers an attribute (time, age, depth) from one sequence to another
Description
Transfers an attribute (generally time/age, but any other is possible) from one sequence (defined by the argument transfer.from) to another (defined by the argument transfer.to) that lacks it. The transfer rests on the assumption that similar samples have similar attributes, which might not hold for noisy multivariate time series. The attribute can be transferred in two ways, selected with the mode argument:
- Direct: transfers the selected attribute between the samples with maximum similarity. This option will likely generate duplicated attribute values in the output.
- Interpolate: obtains new attribute values through weighted interpolation, with the weights derived from the distances between samples.
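The difference between the two modes can be illustrated with a minimal base R sketch. It is not the code used by workflowTransfer: the toy data, the variable names, and the choice of inverse-distance weights over the two nearest samples are all assumptions made only for illustration.

```r
# Toy data (hypothetical): 'from' has a known age, 'to' lacks it.
from <- data.frame(
  age = c(10, 20, 30, 40),
  a   = c(1, 2, 3, 4),
  b   = c(4, 3, 2, 1)
)
to <- data.frame(
  a = c(1.1, 1.3, 2.4),
  b = c(3.9, 3.7, 2.6)
)

from.m <- as.matrix(from[, c("a", "b")])
to.m   <- as.matrix(to[, c("a", "b")])

# Manhattan distance between every 'to' sample (rows)
# and every 'from' sample (columns)
d <- sapply(seq_len(nrow(from.m)), function(j) {
  sapply(seq_len(nrow(to.m)), function(i) {
    sum(abs(to.m[i, ] - from.m[j, ]))
  })
})

# "direct": copy the age of the most similar 'from' sample
direct <- from$age[apply(d, 1, which.min)]

# "interpolate" (assumed scheme): inverse-distance weighted mean of the
# ages of the two nearest 'from' samples; a distance of exactly zero
# would need special handling, which this sketch omits
interpolated <- apply(d, 1, function(row) {
  nearest <- order(row)[1:2]
  w <- 1 / row[nearest]
  sum(w * from$age[nearest]) / sum(w)
})

direct        # 10 10 20
interpolated  # 11 13 24
```

In this toy case the direct transfer assigns the same age to the first two samples, while the interpolated values stay distinct.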
Usage
workflowTransfer(
sequences = NULL,
grouping.column = NULL,
time.column = NULL,
exclude.columns = NULL,
method = "manhattan",
transfer.what = NULL,
transfer.from = NULL,
transfer.to = NULL,
mode = "direct",
plot = FALSE
)
Arguments
sequences |
dataframe with multiple sequences identified by a grouping column generated by prepareSequences. |
grouping.column |
character string, name of the column in sequences that identifies each sequence. |
time.column |
character string, name of the column with time/depth/rank data. |
exclude.columns |
character string or character vector with column names in sequences to be excluded from the analysis. |
method |
character string naming a distance metric. Valid entries are: "manhattan", "euclidean", "chi", and "hellinger". Invalid entries will throw an error. |
transfer.what |
character string, column of sequences with the attribute to be transferred. |
transfer.from |
character string, group available in grouping.column identifying the sequence that provides the attribute values. |
transfer.to |
character string, group available in grouping.column identifying the sequence that receives the attribute values. |
mode |
character string, one of: "direct" (default), "interpolate". |
plot |
boolean, if |
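As a quick reference for the method argument, the two most familiar metrics it lists can be written in a few lines of base R. This is a sketch using the textbook definitions; the function names are hypothetical and the package's internal implementation may differ in detail. The "chi" and "hellinger" metrics are left out because their exact form is not stated on this page.

```r
# Textbook definitions (hypothetical helper names, not package functions)
manhattan <- function(x, y) sum(abs(x - y))
euclidean <- function(x, y) sqrt(sum((x - y)^2))

# Two samples described by three variables
x <- c(1, 4, 2)
y <- c(3, 1, 2)

manhattan(x, y)  # 5
euclidean(x, y)  # sqrt(13), about 3.61
```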
Value
A dataframe with the sequence transfer.to, with a column named after transfer.what containing the attribute values.
Author(s)
Blas Benito <blasbenito@gmail.com>
Examples
#loading sample dataset
data(pollenGP)
#subset pollenGP to make a shorter dataset
pollenGP <- pollenGP[1:50, ]
#generating a subset of pollenGP
set.seed(10)
pollenX <- pollenGP[sort(sample(1:50, 40)), ]
#we separate the age column
pollenX.age <- pollenX$age
#and remove the age and depth columns from pollenX
pollenX$age <- NULL
pollenX$depth <- NULL
#removing some samples from pollenGP
#so pollenX is not a perfect subset of pollenGP
pollenGP <- pollenGP[-sample(1:50, 10), ]
#prepare sequences
GP.X <- prepareSequences(
sequence.A = pollenGP,
sequence.A.name = "GP",
sequence.B = pollenX,
sequence.B.name = "X",
grouping.column = "id",
time.column = "age",
exclude.columns = "depth",
transformation = "none"
)
#transferring age
X.new <- workflowTransfer(
sequences = GP.X,
grouping.column = "id",
time.column = "age",
method = "manhattan",
transfer.what = "age",
transfer.from = "GP",
transfer.to = "X",
mode = "interpolate"
)