codify {coder} | R Documentation
Codify case data with external code data (within specified time frames)
Description
This is the first step of codify() %>% classify() %>% index().
The function combines case data from one data set with related code data from
a second source, possibly limited to codes valid at certain time points
relative to case dates.
Usage
codify(x, codedata, ..., id, code, date = NULL, code_date = NULL, days = NULL)
## S3 method for class 'data.frame'
codify(x, ..., id, date = NULL, days = NULL)
## S3 method for class 'data.table'
codify(
  x,
  codedata,
  ...,
  id,
  code,
  date = NULL,
  code_date = NULL,
  days = NULL,
  alnum = FALSE,
  .copy = NA
)
## S3 method for class 'codified'
print(x, ..., n = 10)
Arguments
x |
data set with mandatory character id column
(identified by argument |
codedata |
additional data with columns
including case id ( |
... |
arguments passed between methods |
id , code , date , code_date |
column names with case id
( |
days |
numeric vector of length two with lower and upper bound for range
of relevant days relative to |
alnum |
Should all non-alphanumeric characters be removed from the codes? |
.copy |
Should the object be copied internally by |
n |
number of rows to preview as tibble.
The output is technically a data.table::data.table, which might be an
unusual format to look at. Use |
Value
Object of class codified (inheriting from data.table::data.table).
Essentially x with additional columns:
- code, code_date: left joined from codedata, or NA if no match within period.
- in_period: Boolean indicator if the case had at least one code within the
  specified period.
The output has one row for each combination of "id" from x and "code" from
codedata. Rows from x might be repeated accordingly.
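The row structure described above can be illustrated with a plain left join in base R. This is only a sketch of the output shape, not the package's implementation: the toy data frames `cases` and `codes` are hypothetical, no time window is applied, and `merge()` stands in for whatever join codify() performs internally.

```r
# Hypothetical toy data (not the package's `ex_people` / `ex_icd10`)
cases <- data.frame(
  name    = c("Ann", "Bob", "Cid"),
  surgery = as.Date(c("2020-03-01", "2020-06-15", "2020-09-30"))
)
codes <- data.frame(
  name      = c("Ann", "Ann", "Bob"),
  icd10     = c("I10", "E119", "J45"),
  admission = as.Date(c("2019-12-24", "2020-02-10", "2021-01-05"))
)

# Left join: every case is kept and repeated once per matching code;
# cases without any code get NA in the joined columns
out <- merge(cases, codes, by = "name", all.x = TRUE)
out
```

Here "Ann" appears on two rows (one per code), while "Cid" appears once with NA for both `icd10` and `admission`.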
Relevant period
Some examples for argument days:
- c(-365, -1): window of one year prior to the date column of x. Useful for
  patient comorbidity.
- c(1, 30): window of 30 days after date. Useful for adverse events after a
  surgical procedure.
- c(-Inf, Inf): no limitation on non-missing dates.
- NULL: no time limitation at all.
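One way to read these settings is as bounds on the number of days between the code date and the case date. The sketch below uses a hypothetical helper, `in_window()`, which is not part of coder; it assumes that both bounds are inclusive and that a missing code date never falls inside a window unless days is NULL.

```r
# Hypothetical helper, not part of coder: TRUE where `code_date` falls
# within `days` (lower and upper bound, in days) relative to `date`
in_window <- function(date, code_date, days) {
  if (is.null(days)) {
    return(rep(TRUE, length(date)))  # NULL: no time limitation at all
  }
  # Dates are stored as days since an epoch; negative = before the case date
  offset <- as.numeric(code_date) - as.numeric(date)
  !is.na(offset) & offset >= days[1] & offset <= days[2]
}

surgery   <- as.Date(rep("2020-03-01", 4))
admission <- as.Date(c("2019-12-24", "2020-03-01", "2020-03-15", NA))

in_window(surgery, admission, c(-365, -1))  # TRUE FALSE FALSE FALSE
in_window(surgery, admission, c(1, 30))     # FALSE FALSE TRUE FALSE
in_window(surgery, admission, c(-Inf, Inf)) # TRUE TRUE TRUE FALSE
in_window(surgery, admission, NULL)         # TRUE TRUE TRUE TRUE
```

The last two calls show the difference between c(-Inf, Inf) and NULL under these assumptions: only NULL lets the record with a missing admission date through.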
See Also
Other verbs: categorize(), classify(), index_fun
Examples
# Codify all patients from `ex_people` with their ICD-10 codes from `ex_icd10`
x <- codify(ex_people, ex_icd10, id = "name", code = "icd10")
x
# Only consider codes if recorded at hospital admissions within one year prior
# to surgery
codify(
  ex_people,
  ex_icd10,
  id        = "name",
  code      = "icd10",
  date      = "surgery",
  code_date = "admission",
  days      = c(-365, 0) # admission during one year before surgery
)
# Only consider codes if recorded after surgery
codify(
  ex_people,
  ex_icd10,
  id        = "name",
  code      = "icd10",
  date      = "surgery",
  code_date = "admission",
  days      = c(1, Inf) # admission any time after surgery
)
# Dirty code data ---------------------------------------------------------
# Assume that codes contain unwanted "dirty" characters
# Those could for example be a dot used by ICD-10 (e.g. X12.3 instead of X123)
dirt <- c(strsplit(c("!#%&/()=?`,.-_"), split = ""), recursive = TRUE)
rdirt <- function(x) sample(x, nrow(ex_icd10), replace = TRUE)
sub <- function(i) substr(ex_icd10$icd10, i, i)
ex_icd10$icd10 <-
  paste0(
    rdirt(dirt), sub(1),
    rdirt(dirt), sub(2),
    rdirt(dirt), sub(3),
    rdirt(dirt), sub(4),
    rdirt(dirt), sub(5)
  )
head(ex_icd10)
# Use `alnum = TRUE` to ignore non-alphanumeric characters
codify(ex_people, ex_icd10, id = "name", code = "icd10", alnum = TRUE)
# Big data ----------------------------------------------------------------
# If `x` or `codedata` are large compared to available
# Random Access Memory (RAM), it might not be possible to make internal copies
# of those objects. Setting `.copy = FALSE` might help to overcome such problems.
# If no copies are made internally, however, the input objects (if data tables)
# will be changed in the global environment.
x2 <- data.table::as.data.table(ex_icd10)
head(x2) # Look at the "icd10" column (with dirty data)
# Use `alnum = TRUE` combined with `.copy = FALSE`
codify(ex_people, x2, id = "name", code = "icd10", alnum = TRUE, .copy = FALSE)
# Even though no explicit assignment was made
# (neither for the output of codify(), nor to explicitly alter `x2`),
# the `x2` object has changed (look at the "icd10" column!):
head(x2)
# Hence, the `.copy` argument should only be used if necessary
# and if so, with caution!
# print.codified() --------------------------------------------------------
x # Preview first 10 rows as a tibble
print(x, n = 20) # Preview first 20 rows as a tibble
print(x, n = NULL) # Print as data.table (ignoring the 'codified' class)