filter_cip {midfieldr} | R Documentation |
Subset rows that include matches to search strings
Description
Subset a CIP data frame, retaining rows that match or partially match a vector of character strings. Columns are not subset unless selected in an optional argument.
Usage
filter_cip(keep_text = NULL, ..., drop_text = NULL, cip = NULL, select = NULL)
Arguments
keep_text |
Character vector of search text for retaining rows,
not case-sensitive. Can be empty if |
... |
Not used; forces later arguments to be used by name. |
drop_text |
Optional character vector of search text for dropping rows, default NULL. |
cip |
Data frame to be searched. Default |
select |
Optional character vector of column names to return, default all columns. |
Details
Search terms can include regular expressions. Uses grepl(), therefore non-character columns (if any) that can be coerced to character are also searched for matches. Columns are subset by the values in select after the search concludes.
If none of the optional arguments are specified, the function returns the original data frame.
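The row matching described above can be approximated in base R. The following is a minimal sketch, not the package's implementation: the toy data frame, its values, and the helper match_rows() are hypothetical, and joining multiple search strings with "|" is an assumption about how they are combined.

```r
# Toy stand-in for a CIP data frame (column names and values are illustrative)
toy <- data.frame(
  cip6 = c("140801", "150201", "540101"),
  cip6name = c("Civil Engineering, General",
               "Civil Engineering Technology",
               "History, General"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each row in which any column matches
# any of the search strings (case-insensitive, regular expressions allowed)
match_rows <- function(df, patterns) {
  pattern <- paste(patterns, collapse = "|")
  hits <- vapply(
    df,
    # Coerce each column to character before searching
    function(col) grepl(pattern, as.character(col), ignore.case = TRUE),
    logical(nrow(df))
  )
  # One row per data frame row, one column per searched column
  rowSums(matrix(hits, nrow = nrow(df))) > 0
}

keep <- match_rows(toy, "engineering")  # rows to retain
drop <- match_rows(toy, "technology")   # rows to exclude

# Filter rows first, then subset columns, mirroring the order described above
result <- toy[keep & !drop, "cip6", drop = FALSE]
print(result)
```

In this sketch only the first row survives: it matches "engineering" and does not match "technology".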
Value
A data.table subset of cip with the following properties:
Rows matching elements of keep_text but excluding rows matching elements of drop_text.
All columns or those specified by select.
Grouping structures are not preserved.
Examples
# Subset using keywords
filter_cip(keep_text = "engineering")
# Multiple passes to narrow the results
first_pass <- filter_cip("civil")
second_pass <- filter_cip("engineering", cip = first_pass)
filter_cip(drop_text = "technology", cip = second_pass)
# drop_text argument, when used, must be named
filter_cip("civil engineering", drop_text = "technology")
# Subset using numerical codes
filter_cip(keep_text = c("050125", "160501"))
# Subset using regular expressions
filter_cip(keep_text = "^54")
filter_cip(keep_text = c("^1407", "^1408"))
# Select columns
filter_cip(keep_text = "^54", select = c("cip6", "cip4name"))