derive_vars_merged {admiral} | R Documentation |
Add New Variable(s) to the Input Dataset Based on Variables from Another Dataset
Description
Add new variable(s) to the input dataset based on variables from another
dataset. The observations to merge can be selected by a condition
(filter_add
argument) and/or selecting the first or last observation for
each by group (order
and mode
argument).
Usage
derive_vars_merged(
dataset,
dataset_add,
by_vars,
order = NULL,
new_vars = NULL,
filter_add = NULL,
mode = NULL,
match_flag,
exist_flag = NULL,
true_value = "Y",
false_value = NA_character_,
missing_values = NULL,
check_type = "warning",
duplicate_msg = NULL,
relationship = NULL
)
Arguments
dataset |
Input dataset The variables specified by the |
dataset_add |
Additional dataset The variables specified by the |
by_vars |
Grouping variables The input dataset and the selected observations from the additional dataset are merged by the specified variables. Variables can be renamed by naming the element, i.e.
Permitted Values: list of variables created by |
order |
Sort order If the argument is set to a non-null value, for each by group the first or last observation from the additional dataset is selected with respect to the specified order. Variables defined by the For handling of Permitted Values: list of expressions created by |
new_vars |
Variables to add The specified variables from the additional dataset are added to the output
dataset. Variables can be renamed by naming the element, i.e., For example And Values of the added variables can be modified by specifying an expression.
For example, If the argument is not specified or set to Permitted Values: list of variables or named expressions created by |
filter_add |
Filter for additional dataset ( Only observations fulfilling the specified condition are taken into account for merging. If the argument is not specified, all observations are considered. Variables defined by the Permitted Values: a condition |
mode |
Selection mode Determines if the first or last observation is selected. If the If the Permitted Values: |
match_flag |
Match flag Please use If the argument is specified (e.g., Permitted Values: Variable name |
exist_flag |
Exist flag If the argument is specified (e.g., Permitted Values: Variable name |
true_value |
True value The value for the specified variable Permitted Values: An atomic scalar |
false_value |
False value The value for the specified variable Permitted Values: An atomic scalar |
missing_values |
Values for non-matching observations For observations of the input dataset ( Permitted Values: named list of expressions, e.g.,
|
check_type |
Check uniqueness? If If the Permitted Values: |
duplicate_msg |
Message of unique check If the uniqueness check fails, the specified message is displayed. Default: paste( "Dataset {.arg dataset_add} contains duplicate records with respect to", "{.var {vars2chr(by_vars)}}." ) |
relationship |
Expected merge-relationship between the This argument is passed to the Permitted Values: |
Details
The new variables (
new_vars
) are added to the additional dataset (dataset_add
).The records from the additional dataset (
dataset_add
) are restricted to those matching thefilter_add
condition.If
order
is specified, for each by group the first or last observation (depending onmode
) is selected.The variables specified for
new_vars
are merged to the input dataset usingleft_join()
. I.e., the output dataset contains all observations from the input dataset. For observations without a matching observation in the additional dataset the new variables are set as specified bymissing_values
(or toNA
for variables not inmissing_values
). Observations in the additional dataset which have no matching observation in the input dataset are ignored.
Value
The output dataset contains all observations and variables of the
input dataset and additionally the variables specified for new_vars
from
the additional dataset (dataset_add
).
See Also
General Derivation Functions for all ADaMs that returns variable appended to dataset:
derive_var_extreme_flag()
,
derive_var_joined_exist_flag()
,
derive_var_merged_ef_msrc()
,
derive_var_merged_exist_flag()
,
derive_var_merged_summary()
,
derive_var_obs_number()
,
derive_var_relative_flag()
,
derive_vars_computed()
,
derive_vars_joined()
,
derive_vars_merged_lookup()
,
derive_vars_transposed()
Examples
library(dplyr, warn.conflicts = FALSE)
vs <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~VSTESTCD, ~VISIT, ~VSSTRESN, ~VSSTRESU, ~VSDTC,
"PILOT01", "VS", "01-1302", "HEIGHT", "SCREENING", 177.8, "cm", "2013-08-20",
"PILOT01", "VS", "01-1302", "WEIGHT", "SCREENING", 81.19, "kg", "2013-08-20",
"PILOT01", "VS", "01-1302", "WEIGHT", "BASELINE", 82.1, "kg", "2013-08-29",
"PILOT01", "VS", "01-1302", "WEIGHT", "WEEK 2", 81.19, "kg", "2013-09-15",
"PILOT01", "VS", "01-1302", "WEIGHT", "WEEK 4", 82.56, "kg", "2013-09-24",
"PILOT01", "VS", "01-1302", "WEIGHT", "WEEK 6", 80.74, "kg", "2013-10-08",
"PILOT01", "VS", "01-1302", "WEIGHT", "WEEK 8", 82.1, "kg", "2013-10-22",
"PILOT01", "VS", "01-1302", "WEIGHT", "WEEK 12", 82.1, "kg", "2013-11-05",
"PILOT01", "VS", "17-1344", "HEIGHT", "SCREENING", 163.5, "cm", "2014-01-01",
"PILOT01", "VS", "17-1344", "WEIGHT", "SCREENING", 58.06, "kg", "2014-01-01",
"PILOT01", "VS", "17-1344", "WEIGHT", "BASELINE", 58.06, "kg", "2014-01-11",
"PILOT01", "VS", "17-1344", "WEIGHT", "WEEK 2", 58.97, "kg", "2014-01-24",
"PILOT01", "VS", "17-1344", "WEIGHT", "WEEK 4", 57.97, "kg", "2014-02-07",
"PILOT01", "VS", "17-1344", "WEIGHT", "WEEK 6", 58.97, "kg", "2014-02-19",
"PILOT01", "VS", "17-1344", "WEIGHT", "WEEK 8", 57.79, "kg", "2014-03-14"
)
dm <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~AGE, ~AGEU,
"PILOT01", "DM", "01-1302", 61, "YEARS",
"PILOT01", "DM", "17-1344", 64, "YEARS"
)
# Merging all dm variables to vs
derive_vars_merged(
vs,
dataset_add = select(dm, -DOMAIN),
by_vars = exprs(STUDYID, USUBJID)
) %>%
select(STUDYID, USUBJID, VSTESTCD, VISIT, VSSTRESN, AGE, AGEU)
# Merge last weight to adsl
adsl <- tribble(
~STUDYID, ~USUBJID, ~AGE, ~AGEU,
"PILOT01", "01-1302", 61, "YEARS",
"PILOT01", "17-1344", 64, "YEARS"
)
derive_vars_merged(
adsl,
dataset_add = vs,
by_vars = exprs(STUDYID, USUBJID),
order = exprs(convert_dtc_to_dtm(VSDTC)),
mode = "last",
new_vars = exprs(LASTWGT = VSSTRESN, LASTWGTU = VSSTRESU),
filter_add = VSTESTCD == "WEIGHT",
exist_flag = vsdatafl
) %>%
select(STUDYID, USUBJID, AGE, AGEU, LASTWGT, LASTWGTU, vsdatafl)
# Derive treatment start datetime (TRTSDTM)
ex <- tribble(
~STUDYID, ~DOMAIN, ~USUBJID, ~EXSTDY, ~EXENDY, ~EXSTDTC, ~EXENDTC,
"PILOT01", "EX", "01-1302", 1, 18, "2013-08-29", "2013-09-15",
"PILOT01", "EX", "01-1302", 19, 69, "2013-09-16", "2013-11-05",
"PILOT01", "EX", "17-1344", 1, 14, "2014-01-11", "2014-01-24",
"PILOT01", "EX", "17-1344", 15, 63, "2014-01-25", "2014-03-14"
)
## Impute exposure start date to first date/time
ex_ext <- derive_vars_dtm(
ex,
dtc = EXSTDTC,
new_vars_prefix = "EXST",
highest_imputation = "M",
)
## Add first exposure datetime and imputation flags to adsl
derive_vars_merged(
select(dm, STUDYID, USUBJID),
dataset_add = ex_ext,
by_vars = exprs(STUDYID, USUBJID),
new_vars = exprs(TRTSDTM = EXSTDTM, TRTSDTF = EXSTDTF, TRTSTMF = EXSTTMF),
order = exprs(EXSTDTM),
mode = "first"
)
# Derive treatment end datetime (TRTEDTM)
## Impute exposure end datetime to last time, no date imputation
ex_ext <- derive_vars_dtm(
ex,
dtc = EXENDTC,
new_vars_prefix = "EXEN",
time_imputation = "last",
)
## Add last exposure datetime and imputation flag to adsl
derive_vars_merged(
select(adsl, STUDYID, USUBJID),
dataset_add = ex_ext,
filter_add = !is.na(EXENDTM),
by_vars = exprs(STUDYID, USUBJID),
new_vars = exprs(TRTEDTM = EXENDTM, TRTETMF = EXENTMF),
order = exprs(EXENDTM),
mode = "last"
)
# Modify merged values and set value for non matching observations
adsl <- tribble(
~USUBJID, ~SEX, ~COUNTRY,
"ST42-1", "F", "AUT",
"ST42-2", "M", "MWI",
"ST42-3", "M", "NOR",
"ST42-4", "F", "UGA"
)
advs <- tribble(
~USUBJID, ~PARAMCD, ~AVISIT, ~AVISITN, ~AVAL,
"ST42-1", "WEIGHT", "BASELINE", 0, 66,
"ST42-1", "WEIGHT", "WEEK 2", 1, 68,
"ST42-2", "WEIGHT", "BASELINE", 0, 88,
"ST42-3", "WEIGHT", "WEEK 2", 1, 55,
"ST42-3", "WEIGHT", "WEEK 4", 2, 50
)
derive_vars_merged(
adsl,
dataset_add = advs,
by_vars = exprs(USUBJID),
new_vars = exprs(
LSTVSCAT = if_else(AVISIT == "BASELINE", "BASELINE", "POST-BASELINE")
),
order = exprs(AVISITN),
mode = "last",
missing_values = exprs(LSTVSCAT = "MISSING")
)