categorical {LSTbook} | R Documentation |
Helpers for specifying nodes in simulations
Description
Helpers for specifying nodes in simulations
Mix two variables together. The output will have the specified R-squared with the signal and the specified variance (one by default).
Evaluate an expression separately for each case
Usage
categorical(n = 5, ..., exact = TRUE)
cat2value(variable, ...)
bernoulli(n = 0, logodds = NULL, prob = 0.5, labels = NULL)
mix_with(signal, noise = NULL, R2 = 0.5, var = 1, exact = FALSE)
each(ex)
block_by(block_var, levels = c("treatment", "control"), show_block = FALSE)
random_levels(n, k = NULL, replace = FALSE)
Arguments
n |
The symbol standing for the number of rows in the data frame to be generated
by |
exact |
if |
variable |
a categorical variable |
logodds |
Numerical vector used to generate Bernoulli trials. Can be any real number. |
prob |
An alternative to |
labels |
Character vector: names for categorical levels, also used to replace 0 and 1 in bernoulli() |
signal |
The part of the mixture that will be correlated with the output. |
noise |
The rest of the mixture. This will be uncorrelated with the output only if you specify it as pure noise. |
R2 |
The target R-squared. |
var |
The target variance. |
ex |
an expression potentially involving other variables. |
block_var |
Which variable to use for blocking |
levels |
Character vector giving names to the blocking levels |
show_block |
Logical. If |
k |
Number of distinct levels |
replace |
if |
... |
assignments of values to the names in |
Details
datasim_make() constructs a simulation which can then be run with
datasim_run(). Each argument to datasim_make() specifies one node of the
simulation using an assignment-like syntax such as y <- 3*x + 2 + rnorm(n).
The datasim helpers documented here are for use on the right-hand side of the
specification of a node. They simplify potentially complex operations such as
blocking, creation of random categorical variables, translation from
categorical to numerical values, etc.
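As a minimal sketch of the workflow described above, the following builds a two-node simulation and draws a small sample from it. It assumes the LSTbook package is installed and that datasim_run() takes the sample size as an argument named n; the names Sim and Dat are chosen here purely for illustration.

```r
library(LSTbook)

# Each argument to datasim_make() defines one node.
# The symbol n stands for the number of rows and is not assigned a value here.
Sim <- datasim_make(
  x <- rnorm(n),
  y <- 3*x + 2 + rnorm(n)
)

# Run the simulation; the sample size is supplied at run time (assumed argument name: n).
Dat <- datasim_run(Sim, n = 5)
Dat
```

Later nodes may refer to earlier ones, as y refers to x here.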
In mix_with(), the target R-squared and variance will be achieved exactly only
if exact=TRUE; otherwise they are approached as the sample size goes to infinity.
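One way to see why exactness depends on the sample is a base-R sketch of the mixing idea. This is not the LSTbook implementation; mix_sketch() is a hypothetical helper written for illustration. It assumes the mixture is a weighted sum of a standardized signal and standardized noise, and that "exact" means removing the in-sample correlation between noise and signal before mixing.

```r
# Hypothetical helper, NOT part of LSTbook: illustrates the idea only.
mix_sketch <- function(signal, noise, R2 = 0.5, var = 1, exact = FALSE) {
  s <- as.numeric(scale(signal))        # signal with mean 0, sd 1
  if (exact) {
    # Remove the part of the noise that is correlated with the signal in this sample
    noise <- resid(lm(noise ~ s))
  }
  e <- as.numeric(scale(noise))         # noise with mean 0, sd 1
  # Weights chosen so the squared correlation with the signal is R2
  sqrt(var) * (sqrt(R2) * s + sqrt(1 - R2) * e)
}

set.seed(101)
x <- rnorm(200)
w_exact  <- mix_sketch(x, rnorm(200), R2 = 0.75, var = 1, exact = TRUE)
w_approx <- mix_sketch(x, rnorm(200), R2 = 0.75, var = 1, exact = FALSE)

# Achieved R-squared and variance in each case
c(R2 = cor(w_exact, x)^2,  var = var(w_exact))
c(R2 = cor(w_approx, x)^2, var = var(w_approx))
```

With exact = TRUE the in-sample noise is orthogonal to the signal, so the targets are met up to floating-point error. Without it, random noise is only approximately uncorrelated with the signal in a finite sample, so the achieved values scatter around the targets.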
Value
A numerical or categorical vector, which will be assembled into
a data frame by datasim_run().
Examples
Demo <- datasim_make(
  g <- categorical(n, a=2, b=1, c=0.5),
  x <- cat2value(g, a=-1.7, b=0.1, c=1.2),
  y <- bernoulli(logodds = x, labels=c("no", "yes")),
  z <- random_levels(n, k=4),
  w <- mix_with(x, noise=rnorm(n), R2=0.75, var=1),
  treatment <- block_by(w),
  dice <- each(rnorm(1, sd = abs(w)))
)
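For readers who want to see roughly what the first three nodes of the example do, here is a base-R sketch of the same chain: a random categorical variable, its translation to numbers, and Bernoulli trials driven by log odds. It is an approximation under stated assumptions, not the package's code. In particular, it assumes the values given to categorical() act as relative sampling weights and that log odds are converted to probabilities with the logistic function.

```r
set.seed(42)
n <- 1000

# Assumption: a=2, b=1, c=0.5 are relative weights; sample() normalizes them.
weights <- c(a = 2, b = 1, c = 0.5)
g <- sample(names(weights), size = n, replace = TRUE, prob = weights)

# Translate each level to a number by looking it up in a named vector.
values <- c(a = -1.7, b = 0.1, c = 1.2)
x <- unname(values[g])

# Convert log odds to probabilities, then draw 0/1 trials and relabel them.
p <- plogis(x)
trial <- rbinom(n, size = 1, prob = p)
y <- ifelse(trial == 1, "yes", "no")

# Cross-tabulate level against outcome
table(g, y)
```

Levels with larger log odds (here c) should show a higher share of "yes" outcomes than levels with smaller log odds (here a).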