separate_text {tidypmc}    R Documentation

Separate all matching text into multiple rows

Description

Extract every match to a pattern from a text column and return one row per match.

Usage

separate_text(txt, pattern, column = "text")

Arguments

txt

a tibble, usually the result of pmc_text

pattern

either a regular expression or a vector of words to find in text

column

name of the column to search, default "text"
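
A vector of words has to become a single regular expression before it can be matched. The sketch below shows one plausible way to do that, assuming whole-word matching with `\\b` boundaries joined by `|`. `words_to_pattern` is a hypothetical helper, not a tidypmc function, and the package may build its pattern differently.

```r
# Hypothetical helper (not part of tidypmc): collapse a word vector
# into one alternation pattern with word boundaries around each word.
words_to_pattern <- function(words) {
  paste0("\\b", words, "\\b", collapse = "|")
}

pat <- words_to_pattern(c("hmu", "ybt", "yfe", "yfu"))

# Toy sentences, invented for illustration only
x <- c("The hmu and yfe loci", "hmuR is a receptor", "ybt operon")

# Whole-word matching: "hmuR" does not count as a match to "hmu"
hits <- grepl(pat, x, perl = TRUE)
print(hits)
```

Under this assumption, a word that only appears as part of a longer token is skipped. Pass a regular expression instead of a word vector when partial matches are wanted.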

Value

a tibble

Note

pattern is passed to grepl and str_extract_all
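
The two steps named in the note, filtering rows with grepl and then extracting all matches, can be sketched in base R. This is a minimal illustration and not the package's implementation: `separate_sketch` and the toy data are hypothetical, a data frame stands in for a tibble, and `regmatches` with `gregexpr` stands in for `stringr::str_extract_all`.

```r
# Hypothetical stand-in for separate_text(), using base R only.
separate_sketch <- function(txt, pattern, column = "text") {
  # Step 1: keep only rows whose text contains at least one match
  hits <- txt[grepl(pattern, txt[[column]]), , drop = FALSE]
  if (nrow(hits) == 0) return(NULL)
  # Step 2: collect every match per row
  # (base-R analogue of stringr::str_extract_all)
  found <- regmatches(hits[[column]], gregexpr(pattern, hits[[column]]))
  # Step 3: repeat each row once per match and prepend the matches
  idx <- rep(seq_len(nrow(hits)), lengths(found))
  out <- data.frame(match = unlist(found), hits[idx, , drop = FALSE],
                    stringsAsFactors = FALSE)
  rownames(out) <- NULL
  out
}

# Toy input, invented for illustration only
toy <- data.frame(
  section = c("Methods", "Results", "Results"),
  text = c("Primers ATCGGA and TTGACCA were used.",
           "No sequence here.",
           "The insert GGGCCCAAA was confirmed."),
  stringsAsFactors = FALSE
)

res <- separate_sketch(toy, "[ATCGN]{5,}")
print(res)
```

The first toy row contains two matches and so appears twice in the result, once per match. The second row has no match and is dropped.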

Author(s)

Chris Stubben

Examples

# doc <- pmc_xml("PMC2231364")
doc <- xml2::read_xml(system.file("extdata/PMC2231364.xml",
        package = "tidypmc"))
txt <- pmc_text(doc)
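# runs of five or more A, T, C, G or N characters (nucleotide sequences)
separate_text(txt, "[ATCGN]{5,}")
# parenthesized runs of 3 to 6 capital letters, optionally followed by "s"
separate_text(txt, "\\([A-Z]{3,6}s?\\)")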
# pattern can be a vector of words
separate_text(txt, c("hmu", "ybt", "yfe", "yfu"))
# wrappers for separate_text with extra step to expand matched ranges
separate_refs(txt)
separate_genes(txt)
separate_tags(txt, "YPO")
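
The wrappers are described as adding a step that expands matched ranges. How the package does this is not documented here. The sketch below only illustrates the general idea, assuming a range is written as comma-separated integers with hyphenated spans such as "1,3-5". `expand_range` is a hypothetical helper, not a tidypmc function.

```r
# Hypothetical helper (not part of tidypmc): expand "1,3-5" to 1 3 4 5.
expand_range <- function(x) {
  # split on commas, then expand each hyphenated span
  parts <- trimws(strsplit(x, ",", fixed = TRUE)[[1]])
  unlist(lapply(parts, function(p) {
    ends <- as.integer(strsplit(p, "-", fixed = TRUE)[[1]])
    if (length(ends) == 2) seq(ends[1], ends[2]) else ends
  }))
}

expanded <- expand_range("1,3-5")
print(expanded)
```

Expanding a range this way gives each member of the span its own element, which is what allows a later step to place each one on its own row.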


[Package tidypmc version 1.7 Index]