dstar {riskclustr} | R Documentation |
Estimate the incremental explained risk variation in a case-only study
Description
dstar estimates the incremental explained risk variation across a set of pre-specified disease subtypes in a case-only study.
The highest-frequency level of label is used as the reference level, for stability.
This function takes the name of the disease subtype variable, the number
of disease subtypes, a list of risk factors, and a wide case-only dataset,
and transforms the dataset into the format required for model fitting.
The polytomous logistic regression model is then fit using mlogit,
and D* is calculated from the resulting risk predictions.
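The reference-level choice described above can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's internal code: the toy data frame cases, its column names, and the helper variables counts and ref_level are hypothetical.

```r
# Hypothetical toy case-only data: subtype labels 1..3, no controls (0)
cases <- data.frame(
  subtype = c(1, 2, 2, 3, 2, 1, 3, 2),
  x1 = c(0, 1, 1, 0, 1, 0, 0, 1)
)

# Count cases per subtype and pick the most frequent one
counts <- table(cases$subtype)
ref_level <- names(counts)[which.max(counts)]

# Make the most frequent subtype the reference level of a factor
cases$subtype_f <- relevel(factor(cases$subtype), ref = ref_level)

# The reference level is listed first
levels(cases$subtype_f)
```

Using the largest subtype as the reference generally gives the most stable contrasts, since every other subtype is compared against the group with the most observations.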
Usage
dstar(label, M, factors, data)
Arguments
label |
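the name of the subtype variable in the data. This should be a
numeric variable with values 0 through M, where 0 indicates control subjects.
Because this is a case-only calculation, rows with value 0 should be excluded from data.
Must be supplied in quotes, e.g. label = "subtype". |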
M |
the number of subtypes. Must be at least 2 (M >= 2). |
factors |
a list of the names of the binary or continuous risk factors.
For binary risk factors, the lowest level is used as the reference level.
e.g. factors = list("x1", "x2", "x3"). |
data |
the name of the case-only dataframe that contains the relevant variables. |
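Before calling dstar, it may help to confirm that the inputs match the argument descriptions above. The following is a hedged base R sketch; the data frame dat, its columns, and the specific checks are illustrative assumptions and are not part of riskclustr.

```r
# Hypothetical wide dataset with controls (0) and M = 3 subtypes
dat <- data.frame(
  subtype = c(0, 1, 2, 3, 0, 2, 1, 3),
  x1 = c(0, 1, 1, 0, 0, 1, 0, 1),
  x2 = c(1.2, 0.4, 2.1, 1.7, 0.9, 1.1, 0.3, 2.5)
)
M <- 3
factors <- list("x1", "x2")

# Keep cases only: drop control rows coded 0
cases <- dat[dat$subtype > 0, ]

# Basic checks mirroring the documented argument requirements
stopifnot(
  is.numeric(cases$subtype),                 # label is numeric
  M >= 2,                                    # at least two subtypes
  all(cases$subtype %in% seq_len(M)),        # case labels lie in 1..M
  all(unlist(factors) %in% names(cases))     # every risk factor is a column
)

nrow(cases)
```

The resulting cases data frame, together with label = "subtype", M, and factors, has the shape that the Usage section expects.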
References
Begg, C. B., Seshan, V. E., Zabor, E. C., Furberg, H., Arora, A., Shen, R., . . . Hsieh, J. J. (2014). Genomic investigation of etiologic heterogeneity: methodologic challenges. BMC Med Res Methodol, 14, 138.
Examples
library(riskclustr)

# Exclude controls (subtype == 0) from data, as this is a case-only calculation
dstar(
  label = "subtype",
  M = 4,
  factors = list("x1", "x2", "x3"),
  data = subtype_data[subtype_data$subtype > 0, ]
)