factorize_par {Colossus} | R Documentation |
Splits a parameter into factors in parallel
Description
factorize_par
uses a user-provided list of columns to define a new parameter for each unique value in each column, and updates the data.table with the new columns.
It is not intended for interaction terms.
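As a rough illustration of what "a new parameter for each unique value" means, the base-R sketch below builds one 0/1 indicator column per unique value. It is not the Colossus implementation: make_indicators is a hypothetical helper, the "<column>_<value>" naming is an assumption, and it runs serially on a plain data.frame. Any further processing that factorize_par may apply to the new columns is not shown.

```r
# Illustrative sketch only. make_indicators is a hypothetical helper,
# NOT part of Colossus; the "<column>_<value>" naming is an assumption.
make_indicators <- function(df, col_list) {
  new_cols <- character(0)
  for (col in col_list) {
    # one indicator column per unique value, in sorted order
    for (v in sort(unique(df[[col]]))) {
      new_name <- paste(col, v, sep = "_")
      df[[new_name]] <- as.numeric(df[[col]] == v)
      new_cols <- c(new_cols, new_name)
    }
  }
  # mirror the documented return shape: updated data plus new column names
  list(df = df, cols = new_cols)
}

toy <- data.frame(a = c(0, 1, 2, 3, 4, 5, 6), c = c(0, 1, 2, 1, 0, 1, 0))
res <- make_indicators(toy, c("c"))
print(res$cols)
print(res$df)
```

In this sketch, the column c with values 0, 1 and 2 yields three indicator columns, and the returned list carries both the updated data and the names of the added columns.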
Usage
factorize_par(
df,
col_list,
verbose = FALSE,
nthreads = as.numeric(detectCores())
)
Arguments
df |
a data.table containing the columns of interest |
col_list |
a character vector of column names for which factor terms should be defined |
verbose |
logical controlling whether additional information is printed to the console; also accepts an integer 0/1 |
nthreads |
number of threads to use; this should not exceed the number of threads available on your machine |
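One way to respect the nthreads guidance is to cap the requested thread count at what the machine reports. The sketch below is only a suggestion: safe_threads is a hypothetical helper, not part of Colossus, and it relies on parallel::detectCores(), which can return NA on some platforms.

```r
library(parallel)

# safe_threads is a hypothetical helper, NOT part of Colossus.
# It caps a requested thread count at the number of detected cores.
safe_threads <- function(requested) {
  available <- parallel::detectCores()
  # detectCores() may return NA when the count cannot be determined
  if (is.na(available)) available <- 1L
  max(1L, min(as.integer(requested), as.integer(available)))
}

nthreads <- safe_threads(2)
print(nthreads)
```

The resulting value could then be passed as the nthreads argument.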
Value
Returns a list with two named fields: df, the updated data.table, and cols, the names of the new columns.
See Also
Other Data Cleaning Functions:
Check_Dupe_Columns(), Check_Trunc(), Correct_Formula_Order(), Date_Shift(), Def_Control(), Def_Control_Guess(), Def_model_control(), Def_modelform_fix(), Joint_Multiple_Events(), Replace_Missing(), Time_Since(), factorize(), gen_time_dep(), interact_them()
Examples
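library(data.table)
library(Colossus)
# Build a small data.table; column "c" takes the values 0, 1 and 2
a <- c(0, 1, 2, 3, 4, 5, 6)
b <- c(1, 2, 3, 4, 5, 6, 7)
c <- c(0, 1, 2, 1, 0, 1, 0)
df <- data.table::data.table("a" = a, "b" = b, "c" = c)
# Columns to split into factor terms
col_list <- c("c")
# Factorize using 2 threads, without verbose output
val <- factorize_par(df, col_list, verbose = FALSE, nthreads = 2)
# Extract the updated data.table and the names of the new columns
df <- val$df
new_col <- val$cols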