node_multinomial {simDAG}    R Documentation

Simulate a Node Using Multinomial Regression

Description

Data from the parents is used to generate the node using multinomial regression: the covariate-specific probability of each class is predicted, and one class per row is then sampled from the corresponding multinomial distribution.

Usage

node_multinomial(data, parents, betas, intercepts,
                 labels=NULL, coerce2factor=TRUE,
                 return_prob=FALSE)

Arguments

data

A data.table (or something that can be coerced to a data.table) containing all columns specified by parents.

parents

A character vector specifying the names of the parents that this particular child node has.

betas

A numeric matrix with length(parents) columns and one row for each class that should be simulated, specifying the causal beta coefficients used to generate the node.

intercepts

A numeric vector with one entry for each class that should be simulated, specifying the intercepts used to generate the node.

labels

An optional character vector giving the factor levels of the generated classes. If NULL (default), the integers are simply used as factor levels.

coerce2factor

A single logical value specifying whether to return the drawn events as a factor (default) or as integers.

return_prob

Either TRUE or FALSE (default). Specifies whether to return the matrix of class probabilities or not. If you are using this function inside of a node call, you cannot set this to TRUE because it will return a matrix. It may, however, be useful when using this function by itself, or as a probability generating function for the node_competing_events function.
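The shapes required by betas and intercepts can be illustrated with a small base R sketch. The parent names, class count and coefficient values below are hypothetical and chosen only for illustration. Note that matrix() fills column by column unless byrow=TRUE is given, which is easy to overlook when writing one row per class.

```r
# Hypothetical setup: two parents and three classes, matching the
# dimensions described for the betas argument.
parents <- c("sex", "age")

# byrow=TRUE lets each line be read as one class:
# c(beta_sex, beta_age) for class 1, then class 2, then class 3.
betas <- matrix(c(0.2, 0.5,
                  0.4, 1.1,
                  0.1, 1.2),
                ncol=length(parents), byrow=TRUE,
                dimnames=list(paste0("class", 1:3), parents))

# One intercept per class, i.e. one per row of betas.
intercepts <- c(1, 1, 1)

# Without byrow=TRUE, matrix() fills column by column, so the same
# matrix results from listing the values parent by parent instead.
betas_colwise <- matrix(c(0.2, 0.4, 0.1, 0.5, 1.1, 1.2), ncol=2)

stopifnot(nrow(betas) == length(intercepts),
          ncol(betas) == length(parents),
          all(betas == betas_colwise))
```

The row and column names are optional here; they only make the layout easier to check by eye.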

Details

This function works essentially like the node_binomial function. First, the matrix of beta coefficients is combined with the values of the parent nodes and the intercepts to calculate the expected subject-specific probability of occurrence for each possible category, using the standard multinomial regression equations. Using those probabilities in conjunction with the rcategorical function, a single category is then drawn for each individual.
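The two steps described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: it assumes a softmax over one linear predictor per class (no reference category), and it replaces rcategorical with a simple inverse-CDF draw. The helper names sketch_class_probs and sketch_draw_class, as well as all data and coefficient values, are hypothetical.

```r
# Hypothetical helper, not part of simDAG: class probabilities from a
# softmax over one linear predictor per class (an assumption).
sketch_class_probs <- function(data, parents, betas, intercepts) {
  x <- as.matrix(data[, parents, drop=FALSE])
  # n x k matrix of scores: intercept_k + sum_j x_j * beta_kj
  scores <- sweep(x %*% t(betas), 2, intercepts, "+")
  # subtract the row maximum before exponentiating (numerical stability)
  expo <- exp(scores - apply(scores, 1, max))
  expo / rowSums(expo)
}

# Hypothetical helper: draw one class index per row by inverse-CDF sampling.
sketch_draw_class <- function(probs) {
  u <- runif(nrow(probs))
  # cumulative class probabilities, one row per individual
  cum <- t(apply(probs, 1, cumsum))
  # count how many cumulative bounds u exceeds; cap at the last class
  # in case rounding leaves the final bound slightly below 1
  pmin(rowSums(u > cum) + 1L, ncol(probs))
}

set.seed(42)
n <- 1000
dat <- data.frame(sex=rbinom(n, size=1, prob=0.5),
                  age=rnorm(n, mean=50, sd=4))

# illustrative coefficients: rows are classes, columns are parents
betas <- matrix(c(0.2, 0.4, 0.1, 0.01, 0.02, 0.03), ncol=2)
intercepts <- c(0, 0.5, -0.5)

probs <- sketch_class_probs(dat, c("sex", "age"), betas, intercepts)
classes <- sketch_draw_class(probs)
uicc <- factor(classes, levels=1:3)
```

Here probs plays the role of the matrix described for return_prob=TRUE, and uicc that of the default factor output.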

Since this function produces categorical output (as it should), it may be difficult to use this node type as a parent for other nodes. It is nevertheless possible with a user-defined node type (see node_custom for information on how to define one).
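One common way to let a categorical variable enter a linear predictor is to expand it into indicator columns first. The base R sketch below shows only that idea. How such a step would be wrapped into a user-defined node is not covered on this page, and the data, coefficient values and variable names are hypothetical.

```r
# Hypothetical data: a three-level categorical node and one numeric node.
set.seed(1)
dat <- data.frame(UICC=factor(sample(1:3, size=10, replace=TRUE),
                              levels=1:3),
                  age=rnorm(10, mean=50, sd=4))

# model.matrix() expands the factor into 0/1 indicator columns, using
# the first level as the reference category (treatment contrasts).
# The intercept column is dropped because it is added manually below.
dummies <- model.matrix(~ UICC, data=dat)[, -1, drop=FALSE]

# Hypothetical coefficients for a continuous child node.
beta_uicc <- c(UICC2=1.5, UICC3=3.0)
beta_age <- 0.1

# linear predictor: intercept + class effects + age effect
linpred <- 2 + drop(dummies %*% beta_uicc) + beta_age * dat$age
child <- rnorm(nrow(dat), mean=linpred, sd=1)
```

With this coding, rows in the reference class (UICC = 1) receive no class effect, while the other classes shift the mean of the child by their respective coefficient.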

Value

Returns a vector of length nrow(data). Depending on the arguments used, this vector may be of type character, numeric or factor. If return_prob=TRUE, it instead returns a numeric matrix with one column per possible class and nrow(data) rows.

Author(s)

Robin Denz

See Also

empty_dag, node, node_td, sim_from_dag, sim_discrete_time

Examples

library(simDAG)

set.seed(3345235)

dag <- empty_dag() +
  node("age", type="rnorm", mean=50, sd=4) +
  node("sex", type="rbernoulli", p=0.5) +
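  # three classes: betas has one row per class and one column per parent
  # (matrix() fills column by column); one intercept is given per class
  node("UICC", type="multinomial", parents=c("sex", "age"),
       betas=matrix(c(0.2, 0.4, 0.1, 0.5, 1.1, 1.2), ncol=2),
       intercepts=rep(1, 3))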

sim_dat <- sim_from_dag(dag=dag, n_sim=100)

[Package simDAG version 0.1.2 Index]