impute_missing_visits {CTNote} | R Documentation |
Naively Impute Missing Visits
Description
Given a use pattern string with missing visits, make naive imputations for each missing visit.
Usage
impute_missing_visits(
use_pattern,
method = c("locf", "locfD", "mode", "kNV"),
missing_is = "o",
mixed_is = "*",
tiebreaker = "+",
k = 1,
knvWeights_num = c(o = NA, `+` = 1, `*` = 0.5, `-` = 0),
quietly = FALSE
)
Arguments
use_pattern |
A character string showing the daily, by visit, or weekly substance use pattern for a single subject |
method |
Which naive imputation method should be used? Currently supported
options are "locf", "locfD", "mode", and "kNV". |
missing_is |
Which single character is used to mark missing UDS in a
use pattern string? Defaults to "o". |
mixed_is |
Which single character is used to mark mixed UDS (both
positive and negative UDS for the visit block) in a use pattern string?
Defaults to "*". |
tiebreaker |
In the event of ties between two modes, should positive or
negative UDS be the mode? Defaults to positive ("+"). |
k |
The number of nearest visits to use in kNV imputation. This defaults to 1; we recommend that this parameter stays at 1 unless the use patterns in your data have extraordinarily few missing values. |
knvWeights_num |
A named vector matching the use pattern word "letters"
to their numerical use values. The names of this vector should match the
"letters" of the use pattern word exactly; use backticks to escape special
characters. For example, if the study protocol counts a mixed result (one
positive and one negative UDS in a single observation period [week]) as
worth three "use days", then mixed results should have a weight of 3/7.
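Additionally, a study protocol may count missing values as five "use days"
out of a week. The defaults for this function are to leave missing values
("o") as NA and to weight positive, mixed, and negative results as 1, 0.5,
and 0, respectively. |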
quietly |
Should warning messages be muted? Defaults to FALSE. |
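As a small illustration of the knvWeights_num description, the protocol example given there (a mixed week worth three "use days", a missing week worth five) could be written as a named vector like the one below. The object name protocol_weights_num is hypothetical; only the vector names need to match the use pattern "letters".

```r
# Hypothetical weights for a protocol that counts a mixed week as 3 "use
# days" and a missing week as 5 "use days" out of 7. Backticks escape the
# names that are not syntactically valid R names.
protocol_weights_num <- c(o = 5 / 7, `+` = 1, `*` = 3 / 7, `-` = 0)

# The names are the use pattern "letters"
names(protocol_weights_num)
```

A vector built this way is the kind of value the knvWeights_num argument expects.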
Details
If you would like to replace all UDS for missing visits with a single,
pre-specified value (such as positive), please use recode_missing_visits
instead. Furthermore, there will most likely still be missing values in the
use pattern even after imputation. This would occur if all the values are
missing, if the first values of the use pattern are missing (if LOCF is
used), if the first and/or last values of the use pattern are missing (if
LOCF-D is used), or if there are back to back missing visits (if kNV with
k = 1 is used). Because of this, you may need to call recode_missing_visits
in a pipeline after this function to replace or remove the remaining
non-imputable missing visits.
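To see why leading missing visits survive LOCF, here is a simplified base R sketch. It is not the CTNote implementation; locf_sketch is a hypothetical helper written only to illustrate the idea, and it ignores details such as how the package treats other symbols.

```r
# Simplified LOCF sketch in base R; NOT the CTNote implementation.
# `locf_sketch` is a hypothetical helper used only for illustration.
locf_sketch <- function(use_pattern, missing_is = "o") {
  visits_char <- strsplit(use_pattern, split = "")[[1]]
  for (i in seq_along(visits_char)) {
    # Carry the previous visit forward when the current visit is missing;
    # the first visit has no predecessor, so it is left alone.
    if (i > 1 && visits_char[i] == missing_is) {
      visits_char[i] <- visits_char[i - 1]
    }
  }
  paste(visits_char, collapse = "")
}

locf_sketch("oo+-o-")  # "oo+---": the leading missing visits remain
locf_sketch("ooooo")   # "ooooo": nothing to carry forward
```

Any "o" left over after a step like this is what a follow-up recoding call would need to handle.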
If you are using the kNV imputation option, there are some caveats to
consider. Rounding ties are broken by the order of the values in the
knvWeights_num vector. For instance, consider a subject who had a negative
UDS in one week, then a missing UDS for the next week, and then two UDS in
the following week (of which one was positive and the other was negative).
This is represented by the use pattern "-o*". The default behavior of the
kNV method is to impute this to "-**" because the knvWeights_num vector
lists "+", then "*", then "-" UDS values. In this order, a positive result
trumps a mixed result, and a mixed result trumps a negative result.
Similarly, the use pattern "+o*" will be imputed to "++*" by default.
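The tie-breaking rule can be sketched in base R for the k = 1 case. This is an assumption-laden illustration, not the CTNote implementation: knv1_sketch is a hypothetical helper that averages the weights of the two adjacent visits and picks the letter whose weight is closest, with ties going to whichever letter comes first in the weights vector.

```r
# Simplified k = 1 sketch in base R; NOT the CTNote implementation.
# `knv1_sketch` is a hypothetical helper used only for illustration.
knv1_sketch <- function(use_pattern,
                        missing_is = "o",
                        weights_num = c(o = NA, `+` = 1, `*` = 0.5, `-` = 0)) {
  visits_char <- strsplit(use_pattern, split = "")[[1]]
  imputed_char <- visits_char
  n <- length(visits_char)
  for (i in seq_len(n)) {
    # Only interior missing visits have a neighbour on both sides
    if (visits_char[i] != missing_is || i == 1 || i == n) next
    neighbour_mean <- mean(weights_num[visits_char[c(i - 1, i + 1)]])
    # A missing neighbour (weight NA) makes the visit non-imputable
    if (is.na(neighbour_mean)) next
    # which.min() returns the FIRST minimum, so ties between two equally
    # close weights go to whichever letter is listed first in weights_num
    distance_num <- abs(weights_num - neighbour_mean)
    imputed_char[i] <- names(weights_num)[which.min(distance_num)]
  }
  paste(imputed_char, collapse = "")
}

knv1_sketch("-o*")   # "-**": 0.25 is equally close to "*" and "-"
knv1_sketch("+o*")   # "++*": 0.75 is equally close to "+" and "*"
knv1_sketch("+oo-")  # "+oo-": back to back missing visits stay missing
```

Reordering weights_num in this sketch (for example, listing `-` before `*`) changes which letter wins a tie, which mirrors the caveat described above.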
Currently, we allow many symbols in the use pattern "word": "_" for missing
by study design, "o" for missing due to protocol non-compliance (the most
common form of missing), "+" for positive, "-" for negative, and "*" for
mixed positive and negative results. A mixed result usually arises when the
visit represents multiple days and there are both positive and negative
results in those days. For example, a subject who is tested weekly provides
a positive test on Tuesday but comes back to provide a negative test the
following day.
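Before imputing, it can help to check which of these symbols a use pattern actually contains. The base R sketch below tabulates the "letters" of the pattern used in the Examples section; the object names visits_char and symbol_counts are illustrative only.

```r
# Split a use pattern "word" into its single-character "letters"
pattern_char <- "__++++*o-------+--+-o-o-o+o+oooooo"
visits_char <- strsplit(pattern_char, split = "")[[1]]

# Count each symbol; fixing the factor levels keeps symbols with zero
# occurrences in the table
symbol_counts <- table(
  factor(visits_char, levels = c("_", "o", "+", "-", "*"))
)
symbol_counts
```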
Value
A use pattern string the same length as use_pattern with missing values
imputed according to the chosen imputation method.
Examples
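pattern_char <- "__++++*o-------+--+-o-o-o+o+oooooo"
# Impute with the default method
impute_missing_visits(pattern_char)
# Impute with the LOCF-D and mode methods
impute_missing_visits(pattern_char, method = "locfD")
impute_missing_visits(pattern_char, method = "mode")
# A pattern in which every visit is missing has nothing to impute from
pattern2_char <- "ooooooooooo"
impute_missing_visits(pattern2_char)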