bind_clinspacy_embeddings {clinspacy} | R Documentation
This function binds columns containing entity or concept embeddings to a data
frame. The entity embeddings are derived from the scispacy package, and the
concept embeddings are derived from the dataset_cui2vec_embeddings dataset
included with this package.
Description
The cui2vec concept embeddings are derived from Andrew Beam's cui2vec R package.
Usage
bind_clinspacy_embeddings(
clinspacy_output,
df,
type = "scispacy",
df_id = NULL,
subset = "is_negated == FALSE"
)
Arguments
clinspacy_output |
A data.frame or file name containing the output from
|
df |
The data.frame to which you would like to bind the output of
|
type |
The type of embeddings to return. One of |
df_id |
The name of the |
subset |
Logical criteria represented as a string by which the
|
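The subset argument is described as logical criteria passed as a string. As a minimal sketch of how such a string can be turned into a row filter in base R (not necessarily how clinspacy does it internally), the hypothetical helper filter_by_string below parses the string and evaluates it against the columns of a toy data frame. The toy column names (clinspacy_id, entity, is_negated) are assumptions made for illustration.

```r
# Toy stand-in for clinspacy() output; the column names are assumed for illustration.
toy_output <- data.frame(
  clinspacy_id = c(1L, 1L, 2L),
  entity = c("fever", "cough", "asthma"),
  is_negated = c(FALSE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not part of clinspacy): keep the rows for which the
# string condition evaluates to TRUE when its names are looked up as columns.
filter_by_string <- function(data, condition) {
  keep <- eval(parse(text = condition), envir = data, enclos = parent.frame())
  data[!is.na(keep) & keep, , drop = FALSE]
}

# Same string as the default shown under Usage: drop negated entities.
kept <- filter_by_string(toy_output, "is_negated == FALSE")
print(kept)
```

Passing the criteria as a string keeps the function call simple, at the cost of errors surfacing only when the string is evaluated.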
Details
Citation
Beam, A.L., Kompa, B., Schmaltz, A., Fried, I., Griffin, W., Palmer, N.P., Shi, X., Cai, T., and Kohane, I.S., 2019. Clinical Concept Embeddings Learned from Massive Sources of Multimodal Medical Data. arXiv preprint arXiv:1804.01486.
License
The cui2vec data is made available under a CC BY 4.0 license. The only change made to the original dataset is the renaming of columns.
Value
A data frame containing the original data frame as well as the entity or concept embeddings. For scispacy embeddings, this returns 200 columns of embeddings. For cui2vec embeddings, this returns 500 columns of embeddings. The resulting data frame can be used to train a machine learning model.
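To make the shape of the return value concrete, here is a minimal base R sketch of binding per-document embedding columns to a data frame. It assumes that entity-level embeddings are averaged within each document before being joined, which this page does not confirm. The names toy_embeddings, toy_df, clinspacy_id, emb_001, and emb_002 are illustrative assumptions, and only two embedding columns are used instead of 200 or 500.

```r
# Toy entity-level embeddings: one row per entity, keyed by an assumed id column.
toy_embeddings <- data.frame(
  clinspacy_id = c(1L, 1L, 2L),
  emb_001 = c(0.2, 0.4, 1.0),
  emb_002 = c(0.0, 1.0, 0.5)
)

# Toy data frame to bind to: one row per document.
toy_df <- data.frame(
  clinspacy_id = 1:3,
  note = c("fever and cough", "asthma", "no entities found"),
  stringsAsFactors = FALSE
)

# Assumption: collapse to one row per document by averaging each embedding column.
doc_embeddings <- aggregate(
  cbind(emb_001, emb_002) ~ clinspacy_id,
  data = toy_embeddings,
  FUN = mean
)

# Left join so that every original row is kept; documents without
# any entities get NA in the embedding columns.
bound <- merge(toy_df, doc_embeddings, by = "clinspacy_id", all.x = TRUE)
print(bound)
```

A left join keeps the row count of the original data frame unchanged, which matters if the result is later aligned with labels for model training.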
Examples
## Not run: 
mtsamples <- dataset_mtsamples()
mtsamples[1:5,] %>%
  clinspacy(df_col = 'description', return_scispacy_embeddings = TRUE) %>%
  bind_clinspacy_embeddings(mtsamples[1:5,])

## End(Not run)