misc_format_check {dbGaPCheckup} | R Documentation |
Miscellaneous Format Check
Description
This function checks miscellaneous dbGaP formatting requirements to ensure that (1) there are no empty variable names; (2) there are no duplicate variable names; (3) variable names do not contain "dbgap"; (4) there are no duplicate column names in the dictionary; and (5) columns falling after the VALUES column are unnamed.
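To make the five requirements concrete, the following base R sketch applies them to a small made-up dictionary. It is an illustration only, not the package's implementation: the helper name sketch_misc_checks, the assumption that variable names live in a VARNAME column, the case-insensitive match on "dbgap", and the treatment of "...<number>" names as unnamed are all assumptions introduced here.

```r
# Hypothetical helper, NOT the dbGaPCheckup implementation.
# Assumes `dict` is a data frame with a VARNAME column and a VALUES column.
sketch_misc_checks <- function(dict) {
  vars <- as.character(dict$VARNAME)
  cols <- names(dict)
  values_pos <- match("VALUES", cols)

  # Column names located after VALUES (none if VALUES is last or missing)
  after_values <- if (is.na(values_pos) || values_pos == length(cols)) {
    character(0)
  } else {
    cols[(values_pos + 1):length(cols)]
  }

  # Assumption: empty names and auto-filled "...<number>" names count as unnamed
  is_unnamed <- after_values == "" | grepl("^\\.\\.\\.[0-9]+$", after_values)

  c(
    no_empty_names       = !any(is.na(vars) | trimws(vars) == ""),
    no_duplicate_names   = !any(duplicated(vars)),
    # Assumption: the "dbgap" match ignores case
    no_dbgap_in_names    = !any(grepl("dbgap", vars, ignore.case = TRUE)),
    no_duplicate_cols    = !any(duplicated(cols)),
    unnamed_after_values = all(is_unnamed)
  )
}

# Made-up dictionary with a duplicated variable name, a name containing
# "dbgap", and a named column after VALUES
dict <- data.frame(
  VARNAME = c("SUBJECT_ID", "AGE", "AGE", "dbgap_id"),
  VARDESC = c("Subject ID", "Age", "Age again", "An ID"),
  VALUES  = c(NA, NA, NA, NA),
  extra   = c(NA, NA, NA, NA),
  stringsAsFactors = FALSE
)

result <- sketch_misc_checks(dict)
print(result)
```

In this toy dictionary, the duplicated AGE entry, the dbgap_id name, and the named column after VALUES each cause the corresponding element of the result to be FALSE.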
Usage
misc_format_check(DD.dict, DS.data, verbose = TRUE)
Arguments
DD.dict | Data dictionary. |
DS.data | Data set. |
verbose | When TRUE, the function prints the Message, along with more detailed information about which formatting checks failed. |
Details
Note that Check #5 may return a WARNING depending on how the data set is read into R. Some import methods automatically fill in the column names after VALUES with "...col_number". The package allows this, but dbGaP does NOT, so use caution if you write out a data set after making adjustments directly in R.
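One possible way to avoid writing the auto-filled names back out is to blank them in the header row before export. The sketch below uses only base R and a made-up dictionary. The placeholder pattern "...<number>", the tab-delimited output, and the temporary file are assumptions for illustration, not steps prescribed by the package or by dbGaP.

```r
# Made-up dictionary as it might look after import: the column after VALUES
# has been given a placeholder name of the form "...<column number>".
dict <- data.frame(
  VARNAME = c("SEX", "AGE"),
  VALUES  = c("1=Male", NA),
  V3      = c("2=Female", NA),
  stringsAsFactors = FALSE
)
names(dict)[3] <- "...3"

# Blank out the placeholder names so the columns after VALUES are unnamed
header <- names(dict)
header[grepl("^\\.\\.\\.[0-9]+$", header)] <- ""

# Write the header row by hand, then append the data without column names
out_file <- tempfile(fileext = ".txt")
writeLines(paste(header, collapse = "\t"), out_file)
write.table(dict, out_file, sep = "\t", quote = FALSE, na = "",
            row.names = FALSE, col.names = FALSE, append = TRUE)

# Read back the header row to confirm the trailing column is unnamed
first_line <- readLines(out_file, n = 1)
```

Writing the header separately sidesteps R's own handling of column names on export, which would otherwise write the placeholder names into the file.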
Value
Tibble, returned invisibly, containing: (1) Time (time stamp); (2) Name (name of the function); (3) Status (Passed/Failed); (4) Message (a copy of the message the function printed out); (5) Information (names of variables that fail one of these checks).
Examples
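# Example 1: Fail check
data(ExampleJ)
# With verbose = TRUE (the default), the message and failed checks are printed
misc_format_check(DD.dict.J, DS.data.J)
# The tibble is returned invisibly, so wrap the call in print() to display it
print(misc_format_check(DD.dict.J, DS.data.J, verbose = FALSE))

# Example 2: Pass check
data(ExampleA)
misc_format_check(DD.dict.A, DS.data.A)
print(misc_format_check(DD.dict.A, DS.data.A, verbose = FALSE))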