rec_with_table {cchsflow} | R Documentation |
Recode with Table
Description
Recode with Table recodes the values of a dataset according to the specifications in variable_details.
Usage
rec_with_table(
data,
variables = NULL,
database_name = NULL,
variable_details = NULL,
else_value = NA,
append_to_data = FALSE,
log = FALSE,
notes = TRUE,
var_labels = NULL,
custom_function_path = NULL,
attach_data_name = FALSE
)
Arguments
data |
A dataframe containing the variables to be recoded. Can also be a list of dataframes |
variables |
character vector containing variable names to recode or a variables csv containing additional variable info |
database_name |
String, the name of the dataset containing the variables to be recoded. Can also be a vector of strings if data is a list |
variable_details |
A dataframe containing the specifications (rules) for recoding. |
else_value |
Value (string, number, integer, logical or NA) that is used to replace any values that are outside the specified ranges (no rules for recoding). |
append_to_data |
Logical, if |
log |
Logical, if |
notes |
Logical, if |
var_labels |
labels vector to attach to variables in variables |
custom_function_path |
path to location of the function to load |
attach_data_name |
to attach name of database to end table |
Details
The variable_details dataframe needs the following variables to function:
- variable
name of new (mutated) variable that is recoded
- toType
type the variable is being recoded to: cat = categorical, cont = continuous
- databaseStart
name of dataframe with original variables to be recoded
- variableStart
name of variable to be recoded
- fromType
variable type of the start variable: cat = categorical or factor variable; cont = continuous variable (real number or integer)
- recTo
Value to recode to
- recFrom
Value/range being recoded from
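As a rough illustration of the columns listed above, the following base R sketch builds a minimal variable_details dataframe and checks that the required columns are present. The variable name age_group, the source variable age, and the database name my_survey are hypothetical and chosen only for illustration; the package may expect stricter cell formats than shown here.

```r
# Hypothetical rules: recode a continuous "age" into a two-level "age_group".
# All names and values below are made up for illustration.
variable_details <- data.frame(
  variable      = c("age_group", "age_group", "age_group"),
  toType        = c("cat", "cat", "cat"),
  databaseStart = c("my_survey", "my_survey", "my_survey"),
  variableStart = c("age", "age", "age"),
  fromType      = c("cont", "cont", "cont"),
  recTo         = c("1", "2", "NA"),
  recFrom       = c("min:39", "40:max", "else"),
  stringsAsFactors = FALSE
)

# Columns named in the Details section as required.
required <- c("variable", "toType", "databaseStart", "variableStart",
              "fromType", "recTo", "recFrom")

# Any required column that is absent would show up here.
missing_cols <- setdiff(required, names(variable_details))
```

Each row holds one recode pair, so the three rows together describe the three categories of the new variable.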
Each row in variable_details comprises one category in a newly transformed variable. The rules for each category of the new variable are a string in recFrom and a value in recTo. These recode pairs use the same syntax as sjmisc::rec(), except that in sjmisc::rec() the pairs are passed as a single string to the rec argument, with the old and new values separated by '='. For example, in rec_with_table, variable_details$recFrom = 2; variable_details$recTo = 4 is the same as sjmisc::rec(rec = "2=4"). The pairs are obtained from the recFrom and recTo columns.
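To make the correspondence with the sjmisc::rec() string form concrete, this small base R sketch joins recFrom and recTo columns into one rec-style string. The rules dataframe is a made-up example, and the snippet only builds the string; it does not call sjmisc or rec_with_table.

```r
# Hypothetical recode rules, one pair per row.
rules <- data.frame(
  recFrom = c("1,2", "3:5", "else"),
  recTo   = c("1", "2", "NA"),
  stringsAsFactors = FALSE
)

# sjmisc::rec() takes pairs written as "old=new", separated by ";".
rec_string <- paste0(rules$recFrom, "=", rules$recTo, collapse = "; ")
print(rec_string)
```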
- recode pairs
each recode pair is a row. See the example above or PBC-variableDetails.csv
- multiple values
multiple old values that should be recoded into a new single value may be separated with comma, e.g. recFrom = "1,2"; recTo = 1
- value range
a value range is indicated by a colon, e.g. recFrom= "1:4"; recTo = 1 (recodes all values from 1 to 4 into 1)
- value range for doubles
for double vectors (with fractional part), all values within the specified range are recoded; e.g. recFrom = "1:2.5"; recTo = 1 recodes 1 to 2.5 into 1, but 2.55 would not be recoded (since it's not included in the specified range)
- "min" and "max"
minimum and maximum values are indicated by min (or lo) and max (or hi), e.g. recFrom = "min:4"; recTo = 1 (recodes all values from the minimum value of x to 4 into 1)
- "else"
all other values, which have not been specified yet, are indicated by else, e.g. recFrom = "else"; recTo = NA (recode all other values (not specified in other rows) to "NA")
- "copy"
the "else"-token can be combined with copy, indicating that all remaining, not yet recoded values should stay the same (are copied from the original value), e.g. recFrom = "else"; recTo = "copy"
- NA's
NA values are allowed both as old and new value, e.g. recFrom = "NA"; recTo = 1, or recFrom = "3:5"; recTo = "NA" (recodes all NA into 1, and all values from 3 to 5 into NA in the new variable)
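The rule syntax above can be sketched with a toy interpreter in base R. This is not the cchsflow or sjmisc implementation; recode_sketch and its helpers are hypothetical names, it handles numeric vectors only, and it assumes any "else" row comes last. Whether "else" should also catch NA is a design choice; in this sketch NA is only recoded by an explicit "NA" rule.

```r
# Toy interpreter for recFrom/recTo pairs (illustration only).
# x: numeric vector; rec_from, rec_to: character vectors of equal length.
recode_sketch <- function(x, rec_from, rec_to) {
  out  <- rep(NA_real_, length(x))   # recoded values
  done <- rep(FALSE, length(x))      # TRUE once a rule has matched

  # Resolve a range end point, honouring the min/lo and max/hi tokens.
  bound <- function(token) {
    if (token %in% c("min", "lo")) return(min(x, na.rm = TRUE))
    if (token %in% c("max", "hi")) return(max(x, na.rm = TRUE))
    as.numeric(token)
  }

  # Resolve the new value: NA, a copy of the original, or a number.
  to_value <- function(to, idx) {
    if (is.na(to) || to == "NA") return(NA_real_)
    if (to == "copy") return(x[idx])
    as.numeric(to)
  }

  for (i in seq_along(rec_from)) {
    from <- trimws(rec_from[i])
    if (from == "else") {
      hit <- !done & !is.na(x)                       # everything not yet matched
    } else if (from == "NA") {
      hit <- is.na(x) & !done                        # explicit NA rule
    } else if (grepl(":", from, fixed = TRUE)) {
      ends <- trimws(strsplit(from, ":", fixed = TRUE)[[1]])
      hit <- !is.na(x) & x >= bound(ends[1]) & x <= bound(ends[2]) & !done
    } else {
      vals <- as.numeric(trimws(strsplit(from, ",", fixed = TRUE)[[1]]))
      hit <- x %in% vals & !done                     # comma-separated values
    }
    out[hit] <- to_value(rec_to[i], hit)
    done <- done | hit
  }
  out
}

# Multiple values, a range, an NA rule, and else/copy in one rule set.
result <- recode_sketch(
  x        = c(1, 2, 3.5, 7, NA, 10),
  rec_from = c("1,2", "3:5", "NA", "else"),
  rec_to   = c("1", "2", "0", "copy")
)

# Range for doubles: 2.5 falls inside "1:2.5", 2.55 does not.
doubles <- recode_sketch(c(1, 2.5, 2.55), c("1:2.5", "else"), c("1", "NA"))
```

Here result holds 1, 1, 2, 7, 0, 10 and doubles holds 1, 1, NA, matching the behaviour described for each token.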
Value
a dataframe that is recoded according to rules in variable_details.
Examples
library(cchsflow)
bmi2001 <- rec_with_table(
data = cchs2001_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
head(bmi2001)
bmi2011_2012 <- rec_with_table(
data = cchs2011_2012_p, c(
"HWTGHTM",
"HWTGWTK", "HWTGBMI_der"
)
)
tail(bmi2011_2012)
# bind_rows() is a dplyr function, so it is called with an explicit namespace
combined_bmi <- dplyr::bind_rows(bmi2001, bmi2011_2012)
head(combined_bmi)
tail(combined_bmi)