select_required {midfieldr} | R Documentation |
Select required midfieldr variables
Description
Subset a data frame, selecting columns by matching or partially matching a vector of character strings. A convenience function to reduce the dimensions of a MIDFIELD data table at the start of a session by selecting only those columns typically required by other midfieldr functions. Particularly useful in interactive sessions when viewing the data tables at various stages of an analysis.
Usage
select_required(midfield_x, ..., select_add = NULL)
Arguments
midfield_x |
Data frame from which columns are selected, typically
|
... |
Not used; forces later arguments to be specified by name. |
select_add |
Optional character vector of search terms to add to the
default vector given by |
Details
Several midfieldr functions are designed to operate on one or more of the MIDFIELD data tables, usually student, term, or degree. This family of functions requires only a small subset of available variables, e.g., mcid, cip6, or term.
The required columns are built into the function. The select_add argument is used to add search strings to the default vector.
The column names of midfield_x are searched for matches or partial matches using grep(), so search terms can include regular expressions. Variables with names that match or partially match the search terms are returned; all other columns are dropped. Rows are unaffected. Search terms not present are silently ignored.
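The matching behavior described above can be sketched in base R. This is only an illustration, not the package's internal code: the data frame df and its column names are hypothetical, and the only detail taken from this page is that grep() is applied to the column names.

```r
# Hypothetical data frame standing in for a MIDFIELD table (not real data)
df <- data.frame(
  mcid = c("A1", "B2"),
  term = c("19911", "19913"),
  term_degree = c("19953", "19961"),
  hours_term = c(15, 12),
  cip6 = c("140801", "141001"),
  stringsAsFactors = FALSE
)

# Search terms: "^term" is anchored, so it matches only names that start
# with "term"; "not_a_column" matches nothing and has no effect
search_terms <- c("mcid", "^term", "cip6", "not_a_column")

# Collapse the terms into one regular expression separated by OR
pattern <- paste(search_terms, collapse = "|")

# grep() returns the positions of the column names that match
keep <- grep(pattern, names(df))

# Keep the matching columns; rows are untouched
selected <- df[, keep, drop = FALSE]
names(selected)
```

In this sketch hours_term is dropped because the anchored term "^term" does not match it, while term_degree is kept as a partial match.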
One could use this function to select columns from a non-MIDFIELD data frame, but with no benefit to the user—conventional column selection syntax is better suited to that task. Here, we specialize the column selection to serve midfieldr functions.
Value
A data.table with the following properties:
Rows are not modified.
Columns are retained if their names match or partially match the default search terms or the values in select_add.
Grouping structures are not preserved.
Examples
# Default character vector for selecting columns
default_cols <- c("mcid", "institution", "race", "sex", "^term", "cip6", "level")
# Create one string separated by OR
search_pattern <- paste(default_cols, collapse = "|")
# Find names of columns matching or partially matching
x <- select_required(toy_student)
names(x)
grepl(search_pattern, names(x))
x <- select_required(toy_term)
names(x)
grepl(search_pattern, names(x))
x <- select_required(toy_degree)
names(x)
grepl(search_pattern, names(x))
x <- select_required(toy_course)
names(x)
grepl(search_pattern, names(x))
# Adding search terms
x <- select_required(toy_course, select_add = c("abbrev", "number", "grade"))
names(x)