add_data_sufficiency {midfieldr} | R Documentation
Determine data sufficiency for every student
Description
Add a column to a data frame of student-level records that labels each row for inclusion or exclusion based on data sufficiency near the upper and lower bounds of an institution's data range.
Usage
add_data_sufficiency(dframe, midfield_term = term)
Arguments
dframe
Working data frame of student-level records to which data-sufficiency columns are to be added. Required variables are
midfield_term
MIDFIELD
Details
The time span of MIDFIELD term data varies by institution; each institution has its own lower and upper bounds. For some student records, being at or near these bounds creates unavoidable ambiguity when assessing degree completion. Such records must be identified and, in most cases, excluded to prevent false summary counts.
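As a rough illustration of the idea, the base R sketch below derives an institution's data range from term records and labels each student against it. The toy data, the timely-term values, and the labeling rule itself (exclude a record that starts at the lower limit, or whose timely completion term falls past the upper limit) are assumptions made for illustration; this page does not state the exact rule that add_data_sufficiency() applies.

```r
# Hypothetical term records (not MIDFIELD data): one row per student per term
term <- data.frame(
  mcid        = c("A", "A", "B", "B", "C", "C"),
  institution = "Institution X",
  term        = c("19901", "19903", "19951", "19953", "20031", "20033"),
  stringsAsFactors = FALSE
)

# Institution data range: earliest and latest term on record.
# YYYYT strings of equal length sort chronologically.
term$lower_limit <- ave(term$term, term$institution, FUN = min)
term$upper_limit <- ave(term$term, term$institution, FUN = max)

# Initial term of each student's longitudinal record
term$term_i <- ave(term$term, term$mcid, FUN = min)

# Collapse to one row per student
student <- unique(term[, c("mcid", "term_i", "lower_limit", "upper_limit")])

# Assumed timely completion terms (hypothetical values)
student$timely_term <- c("19953", "20003", "20083")

# Assumed labeling rule, for illustration only
student$data_sufficiency <- ifelse(
  student$term_i <= student$lower_limit,
  "exclude-lower",
  ifelse(student$timely_term > student$upper_limit, "exclude-upper", "include")
)

student[, c("mcid", "data_sufficiency")]
```

In this toy case, student A starts in the first term on record, so it is unclear whether earlier terms are missing; student C's timely completion term lies beyond the last term on record, so completion cannot be observed.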
Value
A data frame in data.table format with the following properties: rows are preserved; columns are preserved with the exception that columns added by the function overwrite existing columns of the same name (if any); grouping structures are not preserved. The added columns are:
term_i
Character. Initial term of a student's longitudinal record, encoded YYYYT. Not overwritten if present in dframe.
lower_limit
Character. Initial term of an institution's data range, encoded YYYYT.
upper_limit
Character. Final term of an institution's data range, encoded YYYYT.
data_sufficiency
Character. Label each observation for inclusion or exclusion based on data sufficiency. Possible values are: include, indicating that available data are sufficient for estimating timely completion; exclude-upper, indicating that data are insufficient at the upper limit of a data range; and exclude-lower, indicating that data are insufficient at the lower limit.
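A typical next step is to tally the labels and keep only the rows marked include. The base R sketch below assumes a result shaped like the output described above; the data frame and its values are made up for illustration and are not MIDFIELD data.

```r
# Hypothetical result of add_data_sufficiency() (values are illustrative only)
result <- data.frame(
  mcid             = c("A", "B", "C", "D"),
  data_sufficiency = c("include", "exclude-upper", "include", "exclude-lower"),
  stringsAsFactors = FALSE
)

# Tally the labels to see how many records each limit affects
label_counts <- table(result$data_sufficiency)

# Keep only the records with sufficient data
kept <- result[result$data_sufficiency == "include", ]
```

Inspecting the tally before filtering shows how much of the sample is lost at each end of the data range.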
See Also
Other add_*: add_completion_status(), add_timely_term()
Examples
# Start with an excerpt from the student data set
dframe <- toy_student[1:10, .(mcid)]
# Timely term column is required to add data sufficiency column
dframe <- add_timely_term(dframe, midfield_term = toy_term)
# Add data sufficiency column
add_data_sufficiency(dframe, midfield_term = toy_term)
# Existing data_sufficiency column, if any, is overwritten
dframe[, data_sufficiency := NA_character_]
add_data_sufficiency(dframe, midfield_term = toy_term)