eh_test_marker {riskclustr} | R Documentation
Test for etiologic heterogeneity of risk factors according to individual disease markers in a case-control study
Description
eh_test_marker takes a list of individual disease markers, a list of risk
factors, a variable name denoting case versus control status, and a
dataframe. It returns results addressing two questions: whether each risk
factor differs across levels of the disease subtypes, and whether each risk
factor differs across levels of each individual disease marker that makes up
the disease subtypes.
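To make the link between markers and subtypes concrete, the following is a minimal base R sketch. The data frame and its values are hypothetical; only the column names marker1 and marker2 are borrowed from the Examples section. It shows that each observed combination of marker values among cases corresponds to one subtype, so two binary markers give up to four subtypes. It is an illustration only and does not claim to reproduce how eh_test_marker builds subtypes internally.

```r
# Hypothetical toy data: six cases with two binary markers
cases <- data.frame(
  marker1 = c(0, 0, 1, 1, 0, 1),
  marker2 = c(0, 1, 0, 1, 1, 1)
)

# Each distinct combination of marker values is one subtype
combos <- unique(cases[order(cases$marker1, cases$marker2), ])
n_subtypes <- nrow(combos)

# Label each case with its marker combination and tabulate
cases$subtype <- interaction(cases$marker1, cases$marker2, drop = TRUE)
table(cases$subtype)
```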
Input is a dataframe that contains the individual disease markers, the risk
factors of interest, and an indicator of case or control status.
The disease markers must be binary and must have levels
0 or 1 for cases. The disease markers should be left missing for control
subjects. For categorical disease markers, a reference level should be selected
and then indicator variables for each remaining level of the disease marker
should be created. Risk factors can be either binary or continuous. For
categorical risk factors, a reference level should be selected and then
indicator variables for each remaining level of the risk factor should be
created.
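The data preparation described above could be sketched in base R as follows. The data frame raw, the categorical risk factor stage, the categorical marker grade, and all derived column names are hypothetical and chosen only for illustration; the choice of reference levels is likewise an assumption.

```r
# Hypothetical raw data: 'stage' is a three-level categorical risk factor,
# 'grade' is a three-level categorical disease marker (unknown for controls)
raw <- data.frame(
  case  = c(1, 1, 1, 0, 0, 1),
  stage = c("low", "mid", "high", "low", "high", "mid"),
  grade = c("g1", "g2", "g3", NA, NA, "g1"),
  stringsAsFactors = FALSE
)

# Risk factor: take "low" as the reference level and create
# 0/1 indicators for each remaining level
raw$stage_mid  <- as.numeric(raw$stage == "mid")
raw$stage_high <- as.numeric(raw$stage == "high")

# Disease marker: take "g1" as the reference level and create
# 0/1 indicators for each remaining level
raw$grade_g2 <- as.numeric(raw$grade == "g2")
raw$grade_g3 <- as.numeric(raw$grade == "g3")

# Make sure the marker indicators are missing for control subjects
raw$grade_g2[raw$case == 0] <- NA
raw$grade_g3[raw$case == 0] <- NA
```

The indicator columns, rather than the original categorical columns, would then be the names passed in the markers and factors lists.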
Usage
eh_test_marker(markers, factors, case, data, digits = 2)
Arguments
markers
a list of the names of the binary disease markers.
Each must have levels 0 or 1 for case subjects and should be missing
for all control subjects, e.g. markers = list("marker1", "marker2").
factors
a list of the names of the binary or continuous risk factors.
For binary risk factors the lowest level will be used as the reference level,
e.g. factors = list("x1", "x2", "x3").
case
the name of the variable that contains each subject's status as a
case or control. This value should be 1 for cases and 0 for controls.
The argument must be supplied in quotes, e.g. case = "case".
data
the name of the dataframe that contains the relevant variables.
digits
the number of digits to which the odds ratios and associated confidence
intervals, and the estimates and associated standard errors, are rounded.
Defaults to 2.
Value
Returns a list.
beta
is a matrix containing the raw estimates from the
polytomous logistic regression model fit with mlogit
with a row for each risk factor and a column for each disease subtype.
beta_se
is a matrix containing the raw standard errors from the
polytomous logistic regression model fit with mlogit
with a row for each risk factor and a column for each disease subtype.
eh_pval
is a vector of unformatted p-values for testing whether each
risk factor differs across the levels of the disease subtype.
gamma
is a matrix containing the estimated disease marker parameters,
obtained as linear combinations of the beta estimates,
with a row for each risk factor and a column for each disease marker.
gamma_se
is a matrix containing the estimated disease marker
standard errors, obtained based on a transformation of the beta
standard errors, with a row for each risk factor and a column for each
disease marker.
gamma_p
is a matrix of p-values for testing whether each risk factor
differs across levels of each disease marker, with a row for each risk
factor and a column for each disease marker.
or_ci_p
is a dataframe with the odds ratio (95% CI) for each risk
factor/subtype combination, as well as a column of formatted etiologic
heterogeneity p-values.
beta_se_p
is a dataframe with the estimates (SE) for
each risk factor/subtype combination, as well as a column of formatted
etiologic heterogeneity p-values.
gamma_se_p
is a dataframe with disease marker estimates (SE) and
their associated p-values.
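As a brief illustration of working with the returned list, the sketch below stores the result and pulls out a few of the elements named above. It assumes that riskclustr is installed and that subtype_data contains the columns used in the Examples section; the object name res is hypothetical.

```r
# Runs only if riskclustr is installed
if (requireNamespace("riskclustr", quietly = TRUE)) {
  library(riskclustr)

  # Same call as in the Examples section, with the result stored
  res <- eh_test_marker(
    markers = list("marker1", "marker2"),
    factors = list("x1", "x2", "x3"),
    case = "case",
    data = subtype_data,
    digits = 2
  )

  names(res)      # names of the list elements described above
  res$gamma_se_p  # formatted marker estimates (SE) with p-values
  res$eh_pval     # unformatted etiologic heterogeneity p-values
}
```

The formatted elements (or_ci_p, beta_se_p, gamma_se_p) are intended for display, while the unformatted ones (beta, beta_se, eh_pval, gamma, gamma_se, gamma_p) are better suited to further computation.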
Author(s)
Emily C Zabor zabore@mskcc.org
Examples
# Load the package, which also provides the subtype_data dataset
library(riskclustr)

# Run for two binary tumor markers, which will combine to form four subtypes
eh_test_marker(
  markers = list("marker1", "marker2"),
  factors = list("x1", "x2", "x3"),
  case = "case",
  data = subtype_data,
  digits = 2
)