rec_with_table {cchsflow}    R Documentation

Recode with Table

Description

Recode with Table recodes the values of a dataset according to the specifications (rules) in variable_details.

Usage

rec_with_table(
  data,
  variables = NULL,
  database_name = NULL,
  variable_details = NULL,
  else_value = NA,
  append_to_data = FALSE,
  log = FALSE,
  notes = TRUE,
  var_labels = NULL,
  custom_function_path = NULL,
  attach_data_name = FALSE
)

Arguments

data

A dataframe containing the variables to be recoded. Can also be a list of dataframes.

variables

Character vector containing the names of the variables to recode, or a variables CSV containing additional variable information.

database_name

String, the name of the dataset containing the variables to be recoded. Can also be a vector of strings if data is a list.

variable_details

A dataframe containing the specifications (rules) for recoding.

else_value

Value (string, number, integer, logical or NA) that is used to replace any values that are outside the specified ranges (no rules for recoding).

append_to_data

Logical, if TRUE, recoded variables will be appended to the data. Default is FALSE.

log

Logical, if FALSE (default), a log of recoding will not be printed.

notes

Logical, if TRUE (default), prints the content of the 'Note' column for the variable being recoded.

var_labels

Vector of labels to attach to the variables listed in variables.

custom_function_path

Path to the location of the function to load.

attach_data_name

Logical, if TRUE, the name of the database is attached to the end table. Default is FALSE.

Details

The variable_details dataframe needs the following variables to function:

variable

name of new (mutated) variable that is recoded

toType

type the variable is being recoded to: cat = categorical, cont = continuous

databaseStart

name of dataframe with original variables to be recoded

variableStart

name of variable to be recoded

fromType

variable type of the start variable: cat = categorical or factor variable, cont = continuous variable (real number or integer)

recTo

Value to recode to

recFrom

Value/range being recoded from

Each row in variable_details defines one category of a newly transformed variable. The rule for each category of the new variable is a string in recFrom and a value in recTo. These recode pairs use the same syntax as sjmisc::rec(), except that in sjmisc::rec() the pairs are passed as a single string to the rec argument, with the old and new values separated by '='. For example, in rec_with_table, variable_details$recFrom = 2; variable_details$recTo = 4 is the same as sjmisc::rec(rec = "2=4"). The pairs are obtained from the recFrom and recTo columns.
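As a rough illustration of the layout described above, the following base R sketch builds a small variable_details-style dataframe by hand. The variable names (age_cat, AGE), the database name (survey2001), and the cut points are hypothetical and chosen only for this example; it assumes AGE is recorded in whole years.

```r
# Hypothetical rules: recode AGE (whole years) into a three-level age_cat.
# Column names follow the Details section; all values are made up.
variable_details_example <- data.frame(
  variable      = rep("age_cat", 4),
  toType        = rep("cat", 4),
  databaseStart = rep("survey2001", 4),
  variableStart = rep("AGE", 4),
  fromType      = rep("cont", 4),
  recTo         = c("1", "2", "3", "NA"),
  recFrom       = c("min:29", "30:64", "65:max", "else"),
  stringsAsFactors = FALSE
)

# One row per category of the new variable
print(variable_details_example)
```

Each row pairs one recFrom rule with one recTo value, so the new variable has as many rows as it has categories, plus one row for the "else" rule.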

recode pairs

each recode pair is a row; see the example above or PBC-variableDetails.csv

multiple values

multiple old values that should be recoded into a single new value may be separated with a comma, e.g. recFrom = "1,2"; recTo = 1

value range

a value range is indicated by a colon, e.g. recFrom = "1:4"; recTo = 1 (recodes all values from 1 to 4 into 1)

value range for doubles

for double vectors (with fractional part), all values within the specified range are recoded; e.g. recFrom = "1:2.5"; recTo = 1 recodes 1 to 2.5 into 1, but 2.55 would not be recoded (since it is not included in the specified range)

"min" and "max"

minimum and maximum values are indicated by min (or lo) and max (or hi), e.g. recFrom = "min:4"; recTo = 1 (recodes all values from the minimum value of x to 4 into 1)

"else"

all other values, which have not been specified yet, are indicated by else, e.g. recFrom = "else"; recTo = NA (recode all other values (not specified in other rows) to "NA")

"copy"

the "else"-token can be combined with copy, indicating that all remaining, not yet recoded values should stay the same (are copied from the original value), e.g. recFrom = "else"; recTo = "copy"

NA's

NA values are allowed both as old and new value, e.g. recFrom = "NA"; recTo = 1, or recFrom = "3:5"; recTo = "NA" (recodes all NA into 1, and all values from 3 to 5 into NA in the new variable)
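To make the recFrom syntax above concrete, here is a minimal base R sketch of how a single rule could be matched against a numeric vector. The helper matches_rec_from() is hypothetical and is not part of cchsflow or sjmisc; it only illustrates the documented semantics for single values, comma-separated values, ranges, min/lo, max/hi and "NA", and it deliberately leaves out "else" and "copy", which depend on the other rows.

```r
# Hypothetical helper: returns TRUE where `x` matches one recFrom rule.
# Handles single values, comma lists, ranges with min/lo and max/hi, and "NA".
matches_rec_from <- function(x, rec_from) {
  parts <- trimws(strsplit(rec_from, ",", fixed = TRUE)[[1]])
  hit <- rep(FALSE, length(x))
  for (part in parts) {
    if (part == "NA") {
      # "NA" matches missing values
      hit <- hit | is.na(x)
    } else if (grepl(":", part, fixed = TRUE)) {
      # "a:b" matches every value in the closed range [a, b]
      bounds <- trimws(strsplit(part, ":", fixed = TRUE)[[1]])
      lower <- if (bounds[1] %in% c("min", "lo")) -Inf else as.numeric(bounds[1])
      upper <- if (bounds[2] %in% c("max", "hi")) Inf else as.numeric(bounds[2])
      hit <- hit | (!is.na(x) & x >= lower & x <= upper)
    } else {
      # a single value matches by equality
      hit <- hit | (!is.na(x) & x == as.numeric(part))
    }
  }
  hit
}

x <- c(1, 2, 2.5, 2.55, 4, NA)
matches_rec_from(x, "1:2.5")  # value range for doubles
matches_rec_from(x, "1,4")    # multiple values
matches_rec_from(x, "min:2")  # open lower bound
matches_rec_from(x, "NA")     # missing values
```

With this reading, 2.5 falls inside "1:2.5" while 2.55 does not, which mirrors the double-range example given above.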

Value

A dataframe recoded according to the rules in variable_details.

Examples

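library(cchsflow)

# Recode height, weight and derived BMI for the 2001 cycle
bmi2001 <- rec_with_table(
  data = cchs2001_p,
  variables = c("HWTGHTM", "HWTGWTK", "HWTGBMI_der")
)

head(bmi2001)

# Recode the same variables for the 2011-2012 cycle
bmi2011_2012 <- rec_with_table(
  data = cchs2011_2012_p,
  variables = c("HWTGHTM", "HWTGWTK", "HWTGBMI_der")
)

tail(bmi2011_2012)

# bind_rows() comes from dplyr, so it is called with an explicit namespace
combined_bmi <- dplyr::bind_rows(bmi2001, bmi2011_2012)
head(combined_bmi)
tail(combined_bmi)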

[Package cchsflow version 2.1.0 Index]