derive_vars_joined {admiral}    R Documentation
The function adds variables from an additional dataset to the input dataset. The selection of the observations from the additional dataset can depend on variables from both datasets. For example, add the lowest value (nadir) before the current observation.
derive_vars_joined(
  dataset,
  dataset_add,
  by_vars = NULL,
  order = NULL,
  new_vars = NULL,
  join_vars = NULL,
  filter_add = NULL,
  filter_join = NULL,
  mode = NULL,
  missing_values = NULL,
  check_type = "warning"
)
dataset
Input dataset. The variables specified by by_vars are expected.
dataset_add
Additional dataset. The variables specified by the by_vars, the new_vars, the join_vars, and the order argument are expected.
by_vars
Grouping variables. The two datasets are joined by the specified variables. Variables from the additional dataset can be renamed by naming the element, i.e., by_vars = exprs(<name in input dataset> = <name in additional dataset>). Permitted Values: list of variables created by exprs().
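order
Sort order. If the argument is set to a non-null value, for each observation of the input dataset the first or last observation from the joined dataset is selected with respect to the specified order. The specified variables are expected in the additional dataset (dataset_add). If an expression is named, e.g., exprs(EXSDT = convert_dtc_to_dt(EXSDTC)), a corresponding variable (EXSDT) is added to the additional dataset and can be used in the filter conditions (filter_add, filter_join) and for new_vars. Permitted Values: list of expressions created by exprs().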
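new_vars
Variables to add. The specified variables from the additional dataset are added to the output dataset. Variables can be renamed by naming the element, i.e., new_vars = exprs(<new name> = <old name>). For example, new_vars = exprs(NADIR = AVAL) adds AVAL from the additional dataset under the name NADIR. Values of the added variables can be modified by specifying an expression. For example, new_vars = exprs(LDRELD = compute_duration(start_date = EXSDT, end_date = ASTDT)) adds the variable LDRELD and sets it to the computed duration. If the argument is not specified or set to NULL, all variables from the additional dataset (dataset_add) are added. Permitted Values: list of variables or named expressions created by exprs().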
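join_vars
Variables to use from additional dataset. Any extra variables required from the additional dataset for filter_join should be specified for this argument. If a specified variable exists in both datasets, the one from the additional dataset is referenced with the suffix ".join" (e.g., ADY.join). If an expression is named, e.g., exprs(EXSDT = convert_dtc_to_dt(EXSDTC)), a corresponding variable is added to the additional dataset. The variables are not included in the output dataset. Permitted Values: list of variables or named expressions created by exprs().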
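filter_add
Filter for additional dataset (dataset_add). Only observations from dataset_add fulfilling the specified condition are joined to the input dataset. Variables created by other arguments, e.g., order, can be used in the condition. Permitted Values: a condition.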
filter_join
Filter for the joined dataset. The specified condition is applied to the joined dataset. Therefore variables from both datasets (dataset and dataset_add) can be used. Variables created by other arguments, e.g., order, can be used in the condition. Permitted Values: a condition.
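mode
Selection mode. Determines if the first or last observation is selected. If the order argument is specified, mode must be non-null. If the order argument is not specified, the mode argument is ignored. Permitted Values: "first", "last", NULL.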
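missing_values
Values for non-matching observations. For observations of the input dataset (dataset) which do not have a matching observation in the joined dataset, the specified variables are set to the specified values. Permitted Values: named list of expressions, e.g., exprs(BASEC = "MISSING", BASE = -1).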
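check_type
Check uniqueness? If "warning" or "error" is specified, the corresponding message is issued if the observations of the (restricted) joined dataset are not unique with respect to the by variables and the order. This argument is ignored if order is not specified. Permitted Values: "none", "warning", "error".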
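The effect of missing_values on non-matching observations can be pictured with plain dplyr. The following is only a sketch under stated assumptions, not the package's implementation: it assumes dplyr >= 1.1.0 (for cross_join()), and the helper column row_id and the fill value "UNSCHEDULED" are hypothetical and chosen purely for illustration.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adbds <- tribble(
  ~USUBJID, ~ADY,
  "1",       -33,
  "1",         3,
  "2",        NA
)
windows <- tribble(
  ~AVISIT,    ~AWLO, ~AWHI,
  "BASELINE",   -30,     1,
  "WEEK 1",       2,     7
)

# row_id is a hypothetical helper to match results back to the input rows
input <- adbds %>% mutate(row_id = row_number())

matched <- input %>%
  # no grouping variables: combine every input row with every window
  cross_join(windows) %>%
  # condition on variables from both datasets; NA comparisons are dropped
  filter(AWLO <= ADY & ADY <= AWHI) %>%
  select(row_id, AVISIT)

result <- input %>%
  left_join(matched, by = "row_id") %>%
  # mimics missing_values = exprs(AVISIT = "UNSCHEDULED")
  mutate(AVISIT = coalesce(AVISIT, "UNSCHEDULED")) %>%
  select(-row_id)

result
```

All three input observations are kept; only the observation with ADY = 3 falls into a window, the other two receive the fill value.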
1. The variables specified by order are added to the additional dataset (dataset_add).
2. The variables specified by join_vars are added to the additional dataset (dataset_add).
3. The records from the additional dataset (dataset_add) are restricted to those matching the filter_add condition.
4. The input dataset and the (restricted) additional dataset are left joined by the grouping variables (by_vars). If no grouping variables are specified, a full join is performed.
5. The joined dataset is restricted by the filter_join condition.
6. If order is specified, for each observation of the input dataset the first or last observation (depending on mode) is selected.
7. The variables specified for new_vars are created (if requested) and merged to the input dataset. I.e., the output dataset contains all observations from the input dataset. For observations without a matching observation in the joined dataset the new variables are set as specified by missing_values (or to NA for variables not in missing_values).

Observations in the additional dataset which have no matching observation in the input dataset are ignored.
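The derivation described above can be sketched with plain dplyr for the nadir use case (lowest post-baseline value before the current observation). This is a minimal illustration of the documented logic, not the package's actual implementation. It assumes dplyr >= 1.1.0 (for the relationship argument), and row_id is a hypothetical helper column introduced only for this sketch.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adbds <- tribble(
  ~USUBJID, ~ADY, ~AVAL,
  "1",        -7,    10,
  "1",         1,    12,
  "1",         8,    11,
  "1",        15,     9,
  "1",        20,    14,
  "1",        24,    12,
  "2",        13,     8
)

# hypothetical helper column to identify each input observation
input <- adbds %>% mutate(row_id = row_number())

# restrict the additional dataset (filter_add) and keep the variables needed
# for the join condition (join_vars) and the new variable (new_vars)
add <- adbds %>%
  filter(ADY > 0) %>%
  select(USUBJID, ADY.join = ADY, NADIR = AVAL)

nadir <- input %>%
  # join by the grouping variables (by_vars)
  inner_join(add, by = "USUBJID", relationship = "many-to-many") %>%
  # restrict the joined dataset (filter_join)
  filter(ADY.join < ADY) %>%
  # order by the value and select the first observation per input row
  group_by(row_id) %>%
  slice_min(NADIR, n = 1, with_ties = FALSE) %>%
  ungroup() %>%
  select(row_id, NADIR)

# merge the new variable to the input dataset; unmatched rows get NA
result <- input %>%
  left_join(nadir, by = "row_id") %>%
  select(-row_id)

result
```

Observations with no earlier post-baseline value (the first two of subject "1" and the only one of subject "2") keep NADIR as NA, and the number of observations is unchanged.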
The output dataset contains all observations and variables of the input dataset and additionally the variables specified for new_vars from the additional dataset (dataset_add).
General Derivation Functions for all ADaMs that return variables appended to the dataset:
derive_var_extreme_flag(), derive_var_joined_exist_flag(), derive_var_merged_exist_flag(), derive_var_merged_summary(), derive_var_obs_number(), derive_var_relative_flag(), derive_vars_merged_lookup(), derive_vars_merged(), derive_vars_transposed(), get_summary_records()
library(admiral)
library(tibble)
library(lubridate)
library(dplyr, warn.conflicts = FALSE)
library(tidyr)

# Add AVISIT (based on time windows), AWLO, and AWHI
adbds <- tribble(
  ~USUBJID, ~ADY,
  "1",       -33,
  "1",        -2,
  "1",         3,
  "1",        24,
  "2",        NA,
)

windows <- tribble(
  ~AVISIT,    ~AWLO, ~AWHI,
  "BASELINE",   -30,     1,
  "WEEK 1",       2,     7,
  "WEEK 2",       8,    15,
  "WEEK 3",      16,    22,
  "WEEK 4",      23,    30
)

derive_vars_joined(
  adbds,
  dataset_add = windows,
  filter_join = AWLO <= ADY & ADY <= AWHI
)

# Derive the nadir after baseline and before the current observation
adbds <- tribble(
  ~USUBJID, ~ADY, ~AVAL,
  "1",        -7,    10,
  "1",         1,    12,
  "1",         8,    11,
  "1",        15,     9,
  "1",        20,    14,
  "1",        24,    12,
  "2",        13,     8
)

derive_vars_joined(
  adbds,
  dataset_add = adbds,
  by_vars = exprs(USUBJID),
  order = exprs(AVAL),
  new_vars = exprs(NADIR = AVAL),
  join_vars = exprs(ADY),
  filter_add = ADY > 0,
  filter_join = ADY.join < ADY,
  mode = "first",
  check_type = "none"
)

# Add highest hemoglobin value within two weeks before AE,
# take earliest if more than one
adae <- tribble(
  ~USUBJID, ~ASTDY,
  "1",           3,
  "1",          22,
  "2",           2
)

adlb <- tribble(
  ~USUBJID, ~PARAMCD, ~ADY, ~AVAL,
  "1",      "HGB",       1,   8.5,
  "1",      "HGB",       3,   7.9,
  "1",      "HGB",       5,   8.9,
  "1",      "HGB",       8,   8.0,
  "1",      "HGB",       9,   8.0,
  "1",      "HGB",      16,   7.4,
  "1",      "HGB",      24,   8.1,
  "1",      "ALB",       1,    42,
)

derive_vars_joined(
  adae,
  dataset_add = adlb,
  by_vars = exprs(USUBJID),
  order = exprs(AVAL, desc(ADY)),
  new_vars = exprs(HGB_MAX = AVAL, HGB_DY = ADY),
  filter_add = PARAMCD == "HGB",
  filter_join = ASTDY - 14 <= ADY & ADY <= ASTDY,
  mode = "last"
)

# Add APERIOD, APERIODC based on ADSL
adsl <- tribble(
  ~USUBJID, ~AP01SDT,     ~AP01EDT,     ~AP02SDT,     ~AP02EDT,
  "1",      "2021-01-04", "2021-02-06", "2021-02-07", "2021-03-07",
  "2",      "2021-02-02", "2021-03-02", "2021-03-03", "2021-04-01"
) %>%
  mutate(across(ends_with("DT"), ymd)) %>%
  mutate(STUDYID = "xyz")

period_ref <- create_period_dataset(
  adsl,
  new_vars = exprs(APERSDT = APxxSDT, APEREDT = APxxEDT)
)

period_ref

adae <- tribble(
  ~USUBJID, ~ASTDT,
  "1",      "2021-01-01",
  "1",      "2021-01-05",
  "1",      "2021-02-05",
  "1",      "2021-03-05",
  "1",      "2021-04-05",
  "2",      "2021-02-15",
) %>%
  mutate(
    ASTDT = ymd(ASTDT),
    STUDYID = "xyz"
  )

derive_vars_joined(
  adae,
  dataset_add = period_ref,
  by_vars = exprs(STUDYID, USUBJID),
  join_vars = exprs(APERSDT, APEREDT),
  filter_join = APERSDT <= ASTDT & ASTDT <= APEREDT
)

# Add day since last dose (LDRELD)
adae <- tribble(
  ~USUBJID, ~ASTDT,       ~AESEQ,
  "1",      "2020-02-02",      1,
  "1",      "2020-02-04",      2
) %>%
  mutate(ASTDT = ymd(ASTDT))

ex <- tribble(
  ~USUBJID, ~EXSDTC,
  "1",      "2020-01-10",
  "1",      "2020-01",
  "1",      "2020-01-20",
  "1",      "2020-02-03"
)

## Please note that EXSDT is created via the order argument and then used
## for new_vars, filter_add, and filter_join
derive_vars_joined(
  adae,
  dataset_add = ex,
  by_vars = exprs(USUBJID),
  order = exprs(EXSDT = convert_dtc_to_dt(EXSDTC)),
  new_vars = exprs(LDRELD = compute_duration(
    start_date = EXSDT, end_date = ASTDT
  )),
  filter_add = !is.na(EXSDT),
  filter_join = EXSDT <= ASTDT,
  mode = "last"
)
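
The selection rule used in the hemoglobin example (order = exprs(AVAL, desc(ADY)) together with mode = "last") can be hard to read at first: sorting ascending by AVAL and descending by ADY and then taking the last observation yields the highest value, and among ties the earliest day. The sketch below reproduces that rule with plain dplyr. It is an illustration of the documented logic rather than the package's implementation, it assumes dplyr >= 1.1.0, and row_id is a hypothetical helper column.

```r
library(dplyr, warn.conflicts = FALSE)
library(tibble)

adae <- tribble(
  ~USUBJID, ~ASTDY,
  "1",           3,
  "1",          22,
  "2",           2
)
adlb <- tribble(
  ~USUBJID, ~PARAMCD, ~ADY, ~AVAL,
  "1",      "HGB",       1,   8.5,
  "1",      "HGB",       3,   7.9,
  "1",      "HGB",       5,   8.9,
  "1",      "HGB",       8,   8.0,
  "1",      "HGB",       9,   8.0,
  "1",      "HGB",      16,   7.4,
  "1",      "HGB",      24,   8.1,
  "1",      "ALB",       1,    42
)

# hypothetical helper column to identify each adverse event
input <- adae %>% mutate(row_id = row_number())

selected <- input %>%
  inner_join(
    adlb %>% filter(PARAMCD == "HGB") %>% select(USUBJID, ADY, AVAL),
    by = "USUBJID",
    relationship = "many-to-many"
  ) %>%
  # keep lab values from the two weeks up to and including the AE start day
  filter(ASTDY - 14 <= ADY & ADY <= ASTDY) %>%
  group_by(row_id) %>%
  # ascending AVAL, descending ADY: the last row is the highest, earliest value
  arrange(AVAL, desc(ADY), .by_group = TRUE) %>%
  slice_tail(n = 1) %>%
  ungroup() %>%
  select(row_id, HGB_MAX = AVAL, HGB_DY = ADY)

result <- input %>%
  left_join(selected, by = "row_id") %>%
  select(-row_id)

result
```

For the adverse event starting on day 22, the values on days 8 and 9 tie at 8.0, and the descending sort on ADY makes day 8 the last row, so the earlier assessment is selected. Subject "2" has no hemoglobin records and keeps NA for both new variables.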