calf_subset {CALF} | R Documentation |
calf_subset
Description
Runs Coarse Approximation Linear Function (CALF) on a random subset of the supplied data. For a binary target, the same proportion is sampled from the cases and from the controls.
Usage
calf_subset(
  data,
  nMarkers,
  proportion = 0.8,
  targetVector,
  times = 1,
  optimize = "pval",
  verbose = FALSE
)
Arguments
data |
Matrix or data frame. If targetVector = "binary", the first column must contain the dummy-coded case/control variable. If targetVector = "nonbinary", the first column must contain the real-valued selection variable. All other columns contain the relevant markers. |
nMarkers |
Maximum number of markers to include in creation of sum. |
proportion |
Numeric. A value between 0 and 1 giving the proportion of cases and of controls to use in the analysis (if targetVector = "binary"). If targetVector = "nonbinary", this is the proportion of the full sample. Used to evaluate the robustness of the solution. Defaults to 0.8. |
targetVector |
Indicate "binary" for a target vector with two values (e.g., case/control). Indicate "nonbinary" for a target vector of real numbers. |
times |
Numeric. The number of replications to run, each on a new random subset. Defaults to 1. |
optimize |
Criterion to optimize if targetVector = "binary". Indicate "pval" to optimize the p-value of the t-test distinguishing case and control. Indicate "auc" to optimize the AUC. Defaults to "pval". |
verbose |
Logical. Indicate TRUE to print activity at each iteration to console. Defaults to FALSE. |
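The sampling described for proportion can be sketched in base R. This is only an illustration of the idea, not the package's internal code: the helper stratified_subset and the toy data frame toy are hypothetical, and the rounding rule is an assumption.

```r
# Hypothetical helper: draw the same proportion of rows from cases and controls.
# Assumes `data` is a data frame whose first column is a 0/1 case/control variable.
stratified_subset <- function(data, proportion = 0.8) {
  status <- data[[1]]
  keep <- unlist(lapply(split(seq_len(nrow(data)), status), function(idx) {
    n_keep <- round(length(idx) * proportion)  # assumed rounding rule
    idx[sample.int(length(idx), n_keep)]       # sample without replacement
  }))
  data[sort(keep), , drop = FALSE]
}

set.seed(1)
toy <- data.frame(
  status  = rep(c(0, 1), times = c(20, 10)),
  marker1 = rnorm(30),
  marker2 = rnorm(30)
)
sub <- stratified_subset(toy, proportion = 0.8)
table(sub$status)  # 16 controls and 8 cases are retained
```

Sampling within each group keeps the case/control ratio of the subset equal to that of the full data, which is what makes results comparable across replications.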
Value
A data frame containing the chosen markers and their assigned weight (-1 or 1).
The optimal AUC, pval, or correlation for the classification. If multiple replications are requested, a data.frame containing all optimized values across all replications is returned.
aucHist: A histogram of the AUCs across replications, if applicable.
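To make the returned weights concrete, the sketch below shows how markers weighted -1 or 1 could be combined into a single sum score and then evaluated with the two criteria named under optimize. It assumes the score is the plain weighted sum of the chosen markers. The data, marker names, and weights are invented for illustration and are not output from calf_subset.

```r
set.seed(42)
n_control <- 20
n_case <- 20
status <- rep(c(0, 1), times = c(n_control, n_case))

# Toy markers: m1 tends to be higher in cases, m2 lower in cases, m3 is noise.
markers <- data.frame(
  m1 = rnorm(40, mean = ifelse(status == 1, 1, 0)),
  m2 = rnorm(40, mean = ifelse(status == 1, -1, 0)),
  m3 = rnorm(40)
)

# Hypothetical selection: two chosen markers with weights +1 and -1.
weights <- c(m1 = 1, m2 = -1)

# Sum score: each chosen marker multiplied by its weight, then added per row.
score <- as.vector(as.matrix(markers[names(weights)]) %*% weights)

# "pval" criterion: t-test of the score between cases and controls.
p_value <- t.test(score[status == 1], score[status == 0])$p.value

# "auc" criterion: probability that a random case outscores a random control,
# computed from ranks (Mann-Whitney formulation).
r <- rank(score)
auc <- (sum(r[status == 1]) - n_case * (n_case + 1) / 2) / (n_case * n_control)
```

A smaller p_value or a larger auc indicates that the weighted sum separates cases from controls more cleanly.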
Examples
library(CALF)
calf_subset(data = CaseControl, nMarkers = 6, targetVector = "binary", times = 5)