ValidateOutcomeDataset {NlsyLinks} | R Documentation |
Validates the schema of datasets containing outcome variables.
Description
The NlsyLinks package handles much of the plumbing code needed to transform extracted NLSY datasets into a format that statistical routines can interpret. In some cases, a dataset of measured variables is needed, with one row per subject. This function validates the measured/outcome dataset to ensure it possesses an interpretable schema. For a specific list of the requirements, see Details
below.
Usage
ValidateOutcomeDataset(dsOutcome, outcomeNames)
Arguments
dsOutcome |
A base::data.frame with the measured variables. |
outcomeNames |
The column names of the measured variables that eventually will be used by a statistical procedure. |
Details
The dsOutcome
parameter must:
Have a non-missing value.
Contain at least one row.
Contain a column called 'SubjectTag' (case sensitive).
Have a SubjectTag column containing only positive numbers.
Have unique values in the SubjectTag column (i.e., two rows/subjects cannot have the same value).
The outcomeNames
parameter must:
Have a non-missing value.
Contain only column names that are present in the
dsOutcome
data frame.
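The requirements listed above can be sketched in base R as follows. This is only an illustration, not the NlsyLinks source: the function name SketchValidateOutcomeDataset, the toy data frame dsToy, and the error messages are hypothetical, and two points are assumptions rather than documented behavior: "non-missing" is read here as "the argument was supplied", and NA values in SubjectTag are rejected.

```r
# Hypothetical re-implementation of the documented checks, using base R only.
# NOT the package's code; the name and the messages are illustrative assumptions.
SketchValidateOutcomeDataset <- function(dsOutcome, outcomeNames) {
  # --- Rules for dsOutcome ---
  if (missing(dsOutcome)) stop("The dsOutcome parameter is missing.")
  if (nrow(dsOutcome) < 1L) stop("dsOutcome must contain at least one row.")
  if (!("SubjectTag" %in% colnames(dsOutcome))) { # Exact, case-sensitive match.
    stop("dsOutcome must contain a column called 'SubjectTag'.")
  }
  tags <- dsOutcome[["SubjectTag"]]
  # Rejecting NA here is an assumption; the documentation only says "positive numbers".
  if (!is.numeric(tags) || anyNA(tags) || any(tags <= 0)) {
    stop("The SubjectTag column must contain only positive numbers.")
  }
  if (anyDuplicated(tags) > 0L) stop("The SubjectTag values must be unique.")

  # --- Rules for outcomeNames ---
  if (missing(outcomeNames)) stop("The outcomeNames parameter is missing.")
  absent <- setdiff(outcomeNames, colnames(dsOutcome))
  if (length(absent) > 0L) {
    stop("Columns not found in dsOutcome: ", paste(absent, collapse = ", "))
  }

  TRUE
}

# A toy dataset (made-up values) that satisfies every rule.
dsToy <- data.frame(
  SubjectTag       = c(101, 102, 201),
  MathStandardized = c(0.3, -1.2, 0.8)
)
SketchValidateOutcomeDataset(dsToy, "MathStandardized")
```

Stopping at the first violated rule keeps each message specific to one problem; a variant that collects every violation before stopping would be equally reasonable.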
Value
Returns TRUE
if the validation passes.
Throws an error (with an associated descriptive message) if the validation fails.
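Because a failed validation is signaled as an error rather than as a FALSE return value, a script that should keep running can wrap the call in base::tryCatch. The sketch below uses a hypothetical stand-in, CheckStandIn, so that it runs without NlsyLinks installed; with the package loaded, the same pattern would presumably apply to ValidateOutcomeDataset itself. The toy data and message text are assumptions.

```r
# Hypothetical stand-in that mimics the TRUE-or-error contract described above.
# It checks only column presence; it is not the package's function.
CheckStandIn <- function(dsOutcome, outcomeNames) {
  absent <- setdiff(outcomeNames, colnames(dsOutcome))
  if (length(absent) > 0L) {
    stop("Columns not found: ", paste(absent, collapse = ", "))
  }
  TRUE
}

dsToy <- data.frame(SubjectTag = c(101, 102), MathStandardized = c(0.3, -1.2))

# Convert the error into FALSE and report why the validation failed.
isValid <- tryCatch(
  CheckStandIn(dsToy, "MathMisspelled"),
  error = function(e) {
    message("Validation failed: ", conditionMessage(e))
    FALSE
  }
)
```

Here isValid ends up FALSE and the script continues, which can be convenient when several candidate datasets are screened in a loop.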
Author(s)
Will Beasley
Examples
library(NlsyLinks) # Load the package into the current R session.
ds <- ExtraOutcomes79
outcomeNames <- c("MathStandardized", "WeightZGenderAge")
ValidateOutcomeDataset(dsOutcome = ds, outcomeNames = outcomeNames) # Returns TRUE.
outcomeNamesBad <- c("MathMisspelled", "WeightZGenderAge")
# ValidateOutcomeDataset(dsOutcome = ds, outcomeNames = outcomeNamesBad) # Throws error.