filter_joined {admiral} | R Documentation |
The function filters observations using a condition that takes other
observations into account. For example, it could select all observations with
AVALC == "Y" and AVALC == "Y" for at least one subsequent observation. The
input dataset is joined with itself to enable conditions taking variables from
both the current observation and the other observations into account. The
suffix ".join" is added to the variables from the subsequent observations.
An example usage might be checking if a patient received two required medications within a certain timeframe of each other.
In the oncology setting, for example, we use such processing to check if a response value can be confirmed by a subsequent assessment. This is commonly used in endpoints such as best overall response.
filter_joined(
dataset,
by_vars,
join_vars,
join_type,
first_cond = NULL,
order,
tmp_obs_nr_var = NULL,
filter,
check_type = "warning"
)
dataset |
Input dataset The variables specified for |
by_vars |
By variables The specified variables are used as by variables for joining the input dataset with itself. |
join_vars |
Variables to keep from joined dataset The variables needed from the other observations should be specified for
this parameter. The specified variables are added to the joined dataset
with suffix ".join". For example to select all observations with The |
join_type |
Observations to keep after joining The argument determines which of the joined observations are kept with
respect to the original observation. For example, if Permitted Values: |
first_cond |
Condition for selecting range of data If this argument is specified, the other observations are restricted up to the first observation where the specified condition is fulfilled. If the condition is not fulfilled for any of the subsequent observations, all observations are removed. |
order |
Order The observations are ordered by the specified order. Permitted Values: list of expressions created by |
tmp_obs_nr_var |
Temporary observation number The specified variable is added to the input dataset and set to the
observation number with respect to |
filter |
Condition for selecting observations The filter is applied to the joined dataset for selecting the confirmed
observations. The condition can include summary functions. The joined
dataset is grouped by the original observations, i.e., the summary functions
are applied to all observations up to the confirmation observation. For
example in the oncology setting when using this function for confirmed best
overall response, |
check_type |
Check uniqueness? If Default: Permitted Values: |
The following steps are performed to produce the output dataset.
The input dataset is joined with itself by the variables specified for
by_vars. From the right-hand side of the join, only the variables specified
for join_vars are kept. The suffix ".join" is added to these variables.
For example, for by_vars = USUBJID, join_vars = exprs(AVISITN, AVALC), and input dataset

  # A tibble: 2 x 4
  USUBJID AVISITN AVALC  AVAL
  <chr>     <dbl> <chr> <dbl>
  1             1 Y         1
  1             2 N         0

the joined dataset is

  # A tibble: 4 x 6
  USUBJID AVISITN AVALC  AVAL AVISITN.join AVALC.join
  <chr>     <dbl> <chr> <dbl>        <dbl> <chr>
  1             1 Y         1            1 Y
  1             1 Y         1            2 N
  1             2 N         0            1 Y
  1             2 N         0            2 N
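The self-join described in this step can be reproduced with a minimal dplyr sketch. This only illustrates the idea and is not taken from the admiral source. It assumes dplyr 1.1.0 or later for the `relationship` argument, and the object names `input`, `rhs`, and `joined` are hypothetical.

```r
library(dplyr)
library(tibble)

# Input dataset from the example above
input <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC, ~AVAL,
  "1",      1,        "Y",    1,
  "1",      2,        "N",    0
)

# Right-hand side: keep only the by variable and the join variables
rhs <- select(input, USUBJID, AVISITN, AVALC)

# Join the dataset with itself by USUBJID; the duplicated right-hand
# columns receive the ".join" suffix, the left-hand names stay unchanged
joined <- inner_join(
  input,
  rhs,
  by = "USUBJID",
  suffix = c("", ".join"),
  relationship = "many-to-many" # assumption: dplyr >= 1.1.0
)

print(joined)
```

Every observation of a subject is paired with every observation of the same subject, so two input rows yield four joined rows.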
The joined dataset is restricted to observations with respect to join_type
and order.
The dataset from the example in the previous step with join_type = "after"
and order = exprs(AVISITN) is restricted to

  # A tibble: 1 x 6
  USUBJID AVISITN AVALC  AVAL AVISITN.join AVALC.join
  <chr>     <dbl> <chr> <dbl>        <dbl> <chr>
  1             1 Y         1            2 N
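One way to picture the restriction for join_type = "after" is to number the observations within each by group according to order and keep only joined rows with a higher number. The sketch below assumes this approach; the helper column `obs_nr` is hypothetical, and how admiral implements the step internally is not stated here. It also assumes dplyr 1.1.0 or later.

```r
library(dplyr)
library(tibble)

input <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC, ~AVAL,
  "1",      1,        "Y",    1,
  "1",      2,        "N",    0
)

# Hypothetical helper column: observation number within subject,
# ordered by AVISITN (plays the role of order = exprs(AVISITN))
numbered <- input %>%
  group_by(USUBJID) %>%
  arrange(AVISITN, .by_group = TRUE) %>%
  mutate(obs_nr = row_number()) %>%
  ungroup()

joined <- inner_join(
  numbered,
  select(numbered, USUBJID, AVISITN, AVALC, obs_nr),
  by = "USUBJID",
  suffix = c("", ".join"),
  relationship = "many-to-many" # assumption: dplyr >= 1.1.0
)

# "after": keep only joined observations that follow the current one
after <- joined %>%
  filter(obs_nr.join > obs_nr) %>%
  select(-obs_nr, -obs_nr.join)

print(after)
```

Only the pair (visit 1, visit 2) survives, because visit 2 has no later observation to be joined with.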
If first_cond is specified, for each observation of the input dataset the
joined dataset is restricted to observations up to the first observation
where first_cond is fulfilled (the observation fulfilling the condition is
included). If for an observation of the input dataset the condition is not
fulfilled, the observation is removed.
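The effect of first_cond can be sketched in dplyr as follows. The example is an illustration under stated assumptions, not the admiral implementation: the joined dataset is hypothetical (one subject with AVALC values CR, NE, CR, PR at visits 1 to 4, already restricted to "after"), the condition is AVALC.join == "CR", and USUBJID plus AVISITN are assumed to identify an input observation.

```r
library(dplyr)
library(tibble)

# Hypothetical joined dataset after the join_type = "after" restriction
joined <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC, ~AVISITN.join, ~AVALC.join,
  "1",      1,        "CR",   2,             "NE",
  "1",      1,        "CR",   3,             "CR",
  "1",      1,        "CR",   4,             "PR",
  "1",      2,        "NE",   3,             "CR",
  "1",      2,        "NE",   4,             "PR",
  "1",      3,        "CR",   4,             "PR"
)

restricted <- joined %>%
  # One group per observation of the input dataset
  group_by(USUBJID, AVISITN) %>%
  arrange(AVISITN.join, .by_group = TRUE) %>%
  filter(
    # Drop the group if the condition is never fulfilled
    any(AVALC.join == "CR"),
    # cumsum() counts fulfilled conditions so far; lag() shifts the count
    # by one row so that the first fulfilling observation is still kept
    lag(cumsum(AVALC.join == "CR"), default = 0L) == 0L
  ) %>%
  ungroup()

print(restricted)
```

Visit 1 keeps its joined rows up to and including visit 3, visit 2 keeps only visit 3, and visit 3 is removed because no later observation fulfills the condition.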
The joined dataset is grouped by the observations from the input dataset
and restricted to the observations fulfilling the condition specified by
filter.
The first observation of each group is selected and the *.join variables
are dropped.
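The last two steps (grouped filtering, then keeping the first row per group and dropping the *.join variables) can be sketched as below. This is an illustrative dplyr version under assumptions: the restricted dataset is hypothetical, the filter condition is chosen only for the example, and USUBJID plus AVISITN are assumed to identify an input observation.

```r
library(dplyr)
library(tibble)

# Hypothetical joined dataset after the earlier restrictions
restricted <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC, ~AVISITN.join, ~AVALC.join,
  "1",      1,        "CR",   2,             "NE",
  "1",      1,        "CR",   3,             "CR",
  "1",      2,        "NE",   3,             "CR"
)

result <- restricted %>%
  # Group by the observations of the input dataset so that summary
  # functions such as all() are evaluated per original observation
  group_by(USUBJID, AVISITN) %>%
  filter(AVALC == "CR" & all(AVALC.join %in% c("CR", "NE"))) %>%
  # Keep one row per original observation
  slice(1) %>%
  ungroup() %>%
  # Drop the variables added by the join
  select(-ends_with(".join"))

print(result)
```

The result contains only the visit 1 observation with the variables of the input dataset, which matches the description of the output as a subset of the input observations.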
A subset of the observations of the input dataset. All variables of the input dataset are included in the output dataset.
count_vals(), min_cond(), max_cond()
Utilities for Filtering Observations:
count_vals(), filter_exist(), filter_extreme(), filter_not_exist(), filter_relative(), max_cond(), min_cond()
library(tibble)
library(admiral)
# filter observations with a duration longer than 30 and
# on or after 7 days before a COVID AE (ACOVFL == "Y")
adae <- tribble(
  ~USUBJID, ~ADY, ~ACOVFL, ~ADURN,
  "1",        10, "N",          1,
  "1",        21, "N",         50,
  "1",        23, "Y",         14,
  "1",        32, "N",         31,
  "1",        42, "N",         20,
  "2",        11, "Y",         13,
  "2",        23, "N",          2,
  "3",        13, "Y",         12,
  "4",        14, "N",         32,
  "4",        21, "N",         41
)

filter_joined(
  adae,
  by_vars = exprs(USUBJID),
  join_vars = exprs(ACOVFL, ADY),
  join_type = "all",
  order = exprs(ADY),
  filter = ADURN > 30 & ACOVFL.join == "Y" & ADY >= ADY.join - 7
)

# filter observations with AVALC == "Y" and AVALC == "Y" at a subsequent visit
data <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC,
  "1",      1,        "Y",
  "1",      2,        "N",
  "1",      3,        "Y",
  "1",      4,        "N",
  "2",      1,        "Y",
  "2",      2,        "N",
  "3",      1,        "Y",
  "4",      1,        "N",
  "4",      2,        "N"
)

filter_joined(
  data,
  by_vars = exprs(USUBJID),
  join_vars = exprs(AVALC, AVISITN),
  join_type = "after",
  order = exprs(AVISITN),
  filter = AVALC == "Y" & AVALC.join == "Y" & AVISITN < AVISITN.join
)

# select observations with AVALC == "CR", AVALC == "CR" at a subsequent visit,
# only "CR" or "NE" in between, and at most one "NE" in between
data <- tribble(
  ~USUBJID, ~AVISITN, ~AVALC,
  "1",      1,        "PR",
  "1",      2,        "CR",
  "1",      3,        "NE",
  "1",      4,        "CR",
  "1",      5,        "NE",
  "2",      1,        "CR",
  "2",      2,        "PR",
  "2",      3,        "CR",
  "3",      1,        "CR",
  "4",      1,        "CR",
  "4",      2,        "NE",
  "4",      3,        "NE",
  "4",      4,        "CR",
  "4",      5,        "PR"
)

filter_joined(
  data,
  by_vars = exprs(USUBJID),
  join_vars = exprs(AVALC),
  join_type = "after",
  order = exprs(AVISITN),
  first_cond = AVALC.join == "CR",
  filter = AVALC == "CR" & all(AVALC.join %in% c("CR", "NE")) &
    count_vals(var = AVALC.join, val = "NE") <= 1
)

# select observations with AVALC == "PR", AVALC == "CR" or AVALC == "PR"
# at a subsequent visit at least 20 days later, only "CR", "PR", or "NE"
# in between, at most one "NE" in between, and "CR" is not followed by "PR"
data <- tribble(
  ~USUBJID, ~ADY, ~AVALC,
  "1",         6, "PR",
  "1",        12, "CR",
  "1",        24, "NE",
  "1",        32, "CR",
  "1",        48, "PR",
  "2",         3, "PR",
  "2",        21, "CR",
  "2",        33, "PR",
  "3",        11, "PR",
  "4",         7, "PR",
  "4",        12, "NE",
  "4",        24, "NE",
  "4",        32, "PR",
  "4",        55, "PR"
)

filter_joined(
  data,
  by_vars = exprs(USUBJID),
  join_vars = exprs(AVALC, ADY),
  join_type = "after",
  order = exprs(ADY),
  first_cond = AVALC.join %in% c("CR", "PR") & ADY.join - ADY >= 20,
  filter = AVALC == "PR" &
    all(AVALC.join %in% c("CR", "PR", "NE")) &
    count_vals(var = AVALC.join, val = "NE") <= 1 &
    (
      min_cond(var = ADY.join, cond = AVALC.join == "CR") >
        max_cond(var = ADY.join, cond = AVALC.join == "PR") |
        count_vals(var = AVALC.join, val = "CR") == 0
    )
)

# select observations with CRIT1FL == "Y" at two consecutive visits or at the last visit
data <- tribble(
  ~USUBJID, ~AVISITN, ~CRIT1FL,
  "1",      1,        "Y",
  "1",      2,        "N",
  "1",      3,        "Y",
  "1",      5,        "N",
  "2",      1,        "Y",
  "2",      3,        "Y",
  "2",      5,        "N",
  "3",      1,        "Y",
  "4",      1,        "Y",
  "4",      2,        "N"
)

filter_joined(
  data,
  by_vars = exprs(USUBJID),
  tmp_obs_nr_var = tmp_obs_nr,
  join_vars = exprs(CRIT1FL),
  join_type = "all",
  order = exprs(AVISITN),
  filter = CRIT1FL == "Y" & CRIT1FL.join == "Y" &
    (tmp_obs_nr + 1 == tmp_obs_nr.join | tmp_obs_nr == max(tmp_obs_nr.join))
)