genOrdCat {simstudy} | R Documentation |
Generate ordinal categorical data
Description
Ordinal categorical data is added to an existing data set.
Correlations can be added via a correlation matrix or via rho and corstr.
Usage
genOrdCat(
  dtName,
  adjVar = NULL,
  baseprobs,
  catVar = "cat",
  asFactor = TRUE,
  idname = "id",
  prefix = "grp",
  rho = 0,
  corstr = "ind",
  corMatrix = NULL,
  npVar = NULL,
  npAdj = NULL
)
Arguments
dtName |
Name of complete data set |
adjVar |
Adjustment variable name in dtName - determines logistic shift. This is specified assuming a cumulative logit link. |
baseprobs |
Baseline probability expressed as a vector or matrix of
probabilities. The values (per row) must sum to <= 1. If |
catVar |
Name of the new categorical field. Defaults to "cat". Can be a
character vector with a name for each new variable defined via |
asFactor |
If |
idname |
Name of the id column in |
prefix |
A string. The names of the new variables will be a concatenation of the prefix and a sequence of integers indicating the variable number. |
rho |
Correlation coefficient, -1 < rho < 1. Use if corMatrix is not provided. |
corstr |
Correlation structure of the correlation matrix defined by rho. Used only if corMatrix is not provided. Options are "ind" for an independence structure, "cs" for a compound symmetry structure, and "ar1" for an autoregressive structure. |
corMatrix |
Correlation matrix can be entered directly. It must be
symmetrical and positive definite. It is not a required field; if a matrix is
not provided, then a structure and correlation coefficient rho must be
specified. (The matrix created via |
npVar |
Vector of variable names that indicate which variables are to violate the proportionality assumption. |
npAdj |
Matrix with a row for each npVar and a column for each category. Each value represents the deviation from the proportional odds assumption on the logistic scale. |
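The adjVar and baseprobs descriptions above can be illustrated with a small base R sketch. It assumes that baseline probabilities summing to less than 1 leave the remainder to one extra category, and that the adjustment value is subtracted from the cumulative logits; both the sign convention and the helper name shifted_probs are assumptions made for illustration, not a description of the simstudy internals.

```r
# Baseline probabilities that sum to less than 1; the remainder (0.20) is
# treated here as one extra category (an assumption for this sketch).
baseprobs <- c(0.40, 0.25, 0.15)

# Cumulative probabilities, closed off at 1, and their logits.
# qlogis(1) is Inf, so the last threshold never moves.
cum_logit <- qlogis(c(cumsum(baseprobs), 1))

# Hypothetical helper: category probabilities after shifting the cumulative
# logits by z. Subtracting z is an assumed sign convention.
shifted_probs <- function(cum_logit, z) {
  diff(c(0, plogis(cum_logit - z)))
}

p0 <- shifted_probs(cum_logit, z = 0)    # recovers the baseline probabilities
p1 <- shifted_probs(cum_logit, z = 1.2)  # mass moves toward higher categories
round(rbind(p0, p1), 3)
```

Under this convention a positive shift lowers every cumulative probability by the same amount on the logit scale, which is what the proportional odds assumption means.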
Value
Original data.table with added categorical field.
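When several correlated variables are generated, rho and corstr select the correlation structure across them. The base R sketch below builds the three structures named under corstr using their conventional definitions; build_cor is a hypothetical helper, and how simstudy constructs the matrix internally is not shown on this page.

```r
# Hypothetical helper (not part of simstudy): n x n correlation matrix for
# the "ind", "cs", and "ar1" structures.
build_cor <- function(n, rho, corstr = c("ind", "cs", "ar1")) {
  corstr <- match.arg(corstr)
  lag <- abs(outer(seq_len(n), seq_len(n), "-"))  # |i - j| for each cell
  switch(corstr,
    ind = diag(n),                   # no correlation between variables
    cs  = ifelse(lag == 0, 1, rho),  # same rho for every pair
    ar1 = rho^lag                    # correlation decays with distance
  )
}

build_cor(3, 0.5, "cs")
build_cor(3, 0.5, "ar1")

# A matrix passed as corMatrix must be symmetric and positive definite;
# one way to check is that all eigenvalues are positive.
R <- build_cor(5, 0.125, "cs")
isSymmetric(R) && all(eigen(R, symmetric = TRUE)$values > 0)
```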
Examples
library(simstudy)
# Ordinal Categorical Data ----
def1 <- defData(
  varname = "male",
  formula = 0.45, dist = "binary", id = "idG"
)
def1 <- defData(def1,
  varname = "z",
  formula = "1.2*male", dist = "nonrandom"
)
def1
## Generate data
set.seed(20)
dx <- genData(1000, def1)
probs <- c(0.40, 0.25, 0.15)
dx <- genOrdCat(dx,
  adjVar = "z", idname = "idG", baseprobs = probs,
  catVar = "grp"
)
dx
# Correlated Ordinal Categorical Data ----
baseprobs <- matrix(c(
  0.2, 0.1, 0.1, 0.6,
  0.7, 0.2, 0.1, 0,
  0.5, 0.2, 0.3, 0,
  0.4, 0.2, 0.4, 0,
  0.6, 0.2, 0.2, 0
),
nrow = 5, byrow = TRUE
)
set.seed(333)
dT <- genData(1000)
dX <- genOrdCat(dT,
  adjVar = NULL, baseprobs = baseprobs,
  prefix = "q", rho = .125, corstr = "cs", asFactor = FALSE
)
dX
dM <- data.table::melt(dX, id.vars = "id")
dProp <- dM[, prop.table(table(value)), by = variable]
dProp[, response := c(1:4, 1:3, 1:3, 1:3, 1:3)]
data.table::dcast(dProp, variable ~ response,
  value.var = "V1", fill = 0
)
# Proportional odds assumption violated ----
d1 <- defData(varname = "rx", formula = "1;1", dist = "trtAssign")
d1 <- defData(d1, varname = "z", formula = "0 - 1.2*rx", dist = "nonrandom")
dd <- genData(1000, d1)
baseprobs <- c(.4, .3, .2, .1)
npAdj <- c(0, 1, 0, 0)
dn <- genOrdCat(
  dtName = dd, adjVar = "z",
  baseprobs = baseprobs,
  npVar = "rx", npAdj = npAdj
)
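To see what a non-zero npAdj entry does in the last example, the following base R sketch works out the category probabilities by hand for one control and one treated subject. It assumes that both z and the rx-specific deviation are subtracted from the cumulative logits; that sign convention and the helper cat_probs are assumptions for illustration and are not taken from the simstudy source.

```r
baseprobs <- c(0.4, 0.3, 0.2, 0.1)
npAdj <- c(0, 1, 0, 0)

# Cumulative logits for the thresholds; the last cumulative probability is 1.
cum_logit <- qlogis(c(cumsum(baseprobs)[-4], 1))

# Hypothetical helper: category probabilities for one subject. Subtracting
# z and rx * npAdj is an assumed sign convention.
cat_probs <- function(z, rx) {
  diff(c(0, plogis(cum_logit - z - rx * npAdj)))
}

ctrl <- cat_probs(z = 0, rx = 0)     # rx = 0, so z = 0 - 1.2 * 0
trt  <- cat_probs(z = -1.2, rx = 1)  # rx = 1, so z = 0 - 1.2 * 1

# Cumulative log odds ratios of P(Y <= k), treated vs control, k = 1, 2, 3.
cum_lor <- qlogis(cumsum(trt)[1:3]) - qlogis(cumsum(ctrl)[1:3])
round(cum_lor, 2)
```

With all npAdj entries equal to 0 the three log odds ratios would be identical; here the second one differs, which is the violation of proportional odds that npVar and npAdj are meant to introduce.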