| factorize_par {Colossus} | R Documentation |
Splits a parameter into factors in parallel
Description
factorize_par uses a user-provided list of columns to define a new parameter for each unique value and updates the data.table.
Not for interaction terms.
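To make "a new parameter for each unique value" concrete, the following is a minimal serial sketch of the idea using only data.table. It is not the Colossus implementation: sketch_factorize is a hypothetical helper, the column_value naming scheme is an assumption, and no parallelism is shown.

```r
library(data.table)

# Hypothetical helper, NOT part of Colossus: adds one 0/1 indicator column
# for each unique value found in each listed column.
sketch_factorize <- function(df, col_list) {
  df <- data.table::copy(df)  # work on a copy so the caller's table is untouched
  new_cols <- character(0)
  for (col in col_list) {
    for (val in sort(unique(df[[col]]))) {
      name <- paste(col, val, sep = "_")  # assumed naming scheme, e.g. "c_1"
      data.table::set(df, j = name, value = as.integer(df[[col]] == val))
      new_cols <- c(new_cols, name)
    }
  }
  # Mirror the documented return shape: updated table plus new column names
  list(df = df, cols = new_cols)
}

df <- data.table::data.table(a = c(0, 1, 2, 3, 4, 5, 6),
                             c = c(0, 1, 2, 1, 0, 1, 0))
val <- sketch_factorize(df, c("c"))
print(val$cols)
print(val$df)
```

The sketch returns a list with df and cols fields only to match the shape described under Value; the column names produced by factorize_par itself may differ.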
Usage
factorize_par(
df,
col_list,
verbose = FALSE,
nthreads = as.numeric(detectCores())
)
Arguments
df |
a data.table containing the columns of interest |
col_list |
an array of column names that should have factor terms defined |
verbose |
boolean controlling whether additional information is printed to the console; also accepts a 0/1 integer |
nthreads |
number of threads to use; do not use more threads than are available on your machine |
Value
returns a list with two named fields: df for the updated dataframe, and cols for the new column names
See Also
Other Data Cleaning Functions:
Check_Dupe_Columns(),
Check_Trunc(),
Correct_Formula_Order(),
Date_Shift(),
Def_Control(),
Def_Control_Guess(),
Def_model_control(),
Def_modelform_fix(),
Joint_Multiple_Events(),
Replace_Missing(),
Time_Since(),
factorize(),
gen_time_dep(),
interact_them()
Examples
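library(data.table)
# Build a small example table; column "c" takes the values 0, 1, and 2
a <- c(0, 1, 2, 3, 4, 5, 6)
b <- c(1, 2, 3, 4, 5, 6, 7)
c <- c(0, 1, 2, 1, 0, 1, 0)
df <- data.table::data.table("a" = a, "b" = b, "c" = c)
# Columns that should have factor terms defined
col_list <- c("c")
# Positional arguments: verbose = FALSE, nthreads = 2
val <- factorize_par(df, col_list, FALSE, 2)
# Extract the updated table and the names of the new columns
df <- val$df
new_col <- val$cols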