dt.define.variable {DTwrappers} | R Documentation |
dt.define.variable
Description
This function adds a new variable to an existing data.frame or data.table, or updates a previously defined variable. It is a wrapper around data.table's method of defining new variables by reference. The new values can be supplied either as a character statement of the calculation or directly as a vector of values. Updates can also be restricted to a subset of the data by supplying a filter. For educational purposes, the function can instead return the data.table coding statement (return.as = "code") or both the result and the code together (return.as = "all"). For examples, please see the vignette.
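As a rough illustration of the kind of data.table statement this function wraps, the sketch below defines one column by reference for all rows and another for a filtered subset within groups. The table dat and its columns are invented for illustration; this is plain data.table syntax, not output produced by dt.define.variable.

```r
library(data.table)

# Hypothetical example data; not taken from the package documentation.
dat <- data.table(
  id = 1:6,
  group = c("a", "a", "b", "b", "c", "c"),
  x = c(2, 4, 6, 8, 10, 12)
)

# Define a new column by reference from an expression.
dat[, x2 := x^2]

# Update only the rows that pass a filter, computing within each group.
# Rows that fail the filter are left as NA in the new column.
dat[x > 2, group.mean := mean(x), by = group]

print(dat)
```

The filter, the assignment with :=, and the by argument correspond loosely to the.filter, the.values, and grouping.variables in the wrapper.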
Usage
dt.define.variable(
dt.name,
variable.name,
the.values,
specification = "by.expression",
the.filter = NULL,
grouping.variables = NULL,
sortby.group = TRUE,
return.as = "result",
envir = .GlobalEnv,
...
)
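The calls below are a hedged sketch of how the signature above might be used, based only on the argument descriptions on this page. The data, column names, and values are hypothetical, and the exact form of the returned objects is not shown because it is not documented here. The calls are guarded so the script still runs when DTwrappers is not installed, and they assume dat lives in the global environment (the default for envir).

```r
library(data.table)

# Hypothetical example data.
dat <- data.table(id = 1:4, x = c(1.5, 2.5, 3.5, 4.5))

if (requireNamespace("DTwrappers", quietly = TRUE)) {
  # Define a column from a character expression (the default specification).
  DTwrappers::dt.define.variable(
    dt.name = "dat",
    variable.name = "x.doubled",
    the.values = "x * 2",
    specification = "by.expression"
  )

  # Define a column directly from a vector of values.
  DTwrappers::dt.define.variable(
    dt.name = "dat",
    variable.name = "label",
    the.values = c("a", "b", "c", "d"),
    specification = "by.value"
  )

  # Request the generated data.table statement instead of the result.
  the.code <- DTwrappers::dt.define.variable(
    dt.name = "dat",
    variable.name = "x.doubled",
    the.values = "x * 2",
    return.as = "code"
  )
  print(the.code)
}
```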
Arguments
dt.name |
a character value specifying the name of the data.frame or data.table object to modify. For example, an object named dat is passed as dt.name = "dat". |
variable.name |
a character value specifying the name of the new column. |
the.values |
a vector or character value. When specified as a vector, this should contain the values of the new column. When specified as a character value, it should include a functional form that specifies how to calculate the new values. See the specification parameter for more details. |
specification |
a character value. When specification = "by.value", the new variable is defined directly from the vector the.values. Otherwise (including the default, "by.expression"), the.values is treated as a functional form stated as a character value, e.g. the.values = "rnorm(n = 3)". |
the.filter |
a character value, logical vector, or expression stating the logical operations used to filter the data. See create.filter.expression for details. The filtering step is applied before the new variable is defined, so only rows passing the filter are updated. Defaults to NULL, in which case no filtering is applied. |
grouping.variables |
a character or numeric vector specifying the variables to group by when performing the calculation. For character vectors, the values may be either column names of the data or calculations based upon them (see the vignette for examples). For numeric vectors, only the values of unique(floor(grouping.variables)) that fall within 1:ncol(dat), where dat is your data, will be used. These indices are then mapped to the corresponding column names of the data. When NULL, no grouping is performed. |
sortby.group |
a logical value specifying whether the groups should be sorted (TRUE, the default) or left in their existing order (FALSE). |
return.as |
a character value specifying what output should be returned. return.as = "result" evaluates the statement and returns its output, typically the updated data.table. return.as = "code" returns the data.table coding statement that would define the variable. return.as = "all" returns both the result and the code. |
envir |
the environment in which the code would be evaluated; .GlobalEnv by default. |
... |
other additional arguments if needed |
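The handling of numeric grouping.variables described above can be sketched in base R as follows. The helper map.grouping.indices is hypothetical and written only to illustrate the stated rule; it is not the package's internal code.

```r
# Hypothetical helper illustrating the described index-to-name mapping.
map.grouping.indices <- function(grouping.variables, dat) {
  # Round down and drop duplicates, as in unique(floor(grouping.variables)).
  idx <- unique(floor(grouping.variables))
  # Keep only indices that refer to an existing column.
  idx <- idx[idx %in% seq_len(ncol(dat))]
  # Map the surviving indices to column names.
  names(dat)[idx]
}

# Hypothetical example data with three columns.
dat <- data.frame(id = 1:3, group = c("a", "b", "a"), x = c(1, 2, 3))

# 2.7 and 2 both map to column 2; 5 and 0 are out of range and dropped.
map.grouping.indices(c(2.7, 2, 5, 0), dat)
# returns "group"
```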
Value
Depending on the value of return.as, the output will be a) a character value containing the code (return.as = 'code'), b) the output of evaluating that code, typically a data.table (return.as = 'result'), or c) a list containing both the code and the output (return.as = 'all').
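The relationship between the code and the result can be sketched with base R's parse and eval. This is a minimal illustration of the build-then-evaluate idea under stated assumptions: the statement text is written by hand, and the list element names result and code are hypothetical rather than the names the package actually uses.

```r
library(data.table)

# Hypothetical example data.
dat <- data.table(id = 1:3, x = c(10, 20, 30))

# A hand-written data.table statement stored as a character value,
# standing in for what return.as = "code" is described as returning.
the.statement <- "dat[x > 10, x.flag := TRUE]"

# Evaluate the statement in the environment that holds dat,
# standing in for what return.as = "result" is described as returning.
the.result <- eval(parse(text = the.statement), envir = environment())

# Bundle both pieces, loosely mirroring return.as = "all".
# The element names here are assumptions made for this sketch.
out <- list(result = the.result, code = the.statement)

print(out$code)
print(out$result)
```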
Note
If the object named in dt.name is a data.frame, it will be converted to a data.table object so that the new column can be added by reference (i.e. without copying the data, which is efficient with regard to memory usage).
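The conversion and by-reference update mentioned in this note can be illustrated with data.table's own setDT and address functions. Whether dt.define.variable uses setDT internally is an assumption; the sketch only shows the general mechanism on hypothetical data.

```r
library(data.table)

# Hypothetical example data, starting as a plain data.frame.
dat <- data.frame(id = 1:3, x = c(1, 2, 3))

# Convert to a data.table in place, without assigning a new object.
setDT(dat)

# Record the memory address of the table before adding a column.
address.before <- address(dat)

# Add a column by reference.
dat[, y := x + 1]

# Record the address again; matching addresses suggest no copy was made.
address.after <- address(dat)
print(identical(address.before, address.after))
```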
Source
DTwrappers::create.dt.statement
DTwrappers::eval.dt.statement