icrsf {icRSF} | R Documentation |
Permutation-based variable importance metric for high-dimensional datasets with time-to-event outcomes, in the presence of imperfect self-reports or laboratory-based diagnostic tests.
Description
Let N and P denote the number of subjects and the number of variables in the dataset, respectively. Let N** denote the total number of visits summed over all subjects in the study (that is, the number of diagnostic test results available across all subjects). The algorithm builds a user-defined number of survival trees, each grown on a bootstrapped dataset. For each tree, the out-of-bag (OOB) data are used to obtain a permutation-based measure of variable importance for each of the P variables.
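The bootstrap, OOB, and permutation steps described above can be illustrated with a small, generic sketch. It deliberately swaps in lm() on a continuous outcome as a stand-in learner, so it is not the survival-tree likelihood that icrsf() uses; all object names (dat, vimp_mat, toy_vimp) are hypothetical and chosen only for illustration.

```r
# Toy illustration of out-of-bag (OOB) permutation importance.
# ASSUMPTION: lm() on a simulated continuous outcome stands in for the
# survival trees; this is not the icrsf() implementation.
set.seed(1)
N <- 200
P <- 4
X <- matrix(rnorm(N * P), N, P, dimnames = list(NULL, paste0("x", 1:P)))
y <- 2 * X[, "x1"] + rnorm(N)            # only x1 is truly associated with y
dat <- data.frame(y = y, X)

ntree <- 50                               # number of bootstrap replicates
vimp_mat <- matrix(NA_real_, ntree, P, dimnames = list(NULL, colnames(X)))

for (b in seq_len(ntree)) {
  inbag <- sample(N, N, replace = TRUE)   # bootstrap sample of subjects
  oob <- setdiff(seq_len(N), inbag)       # subjects left out of this sample
  fit <- lm(y ~ ., data = dat[inbag, ])
  base_err <- mean((dat$y[oob] - predict(fit, dat[oob, ]))^2)
  for (j in colnames(X)) {
    perm <- dat[oob, ]
    perm[[j]] <- sample(perm[[j]])        # permute variable j in the OOB data
    perm_err <- mean((perm$y - predict(fit, perm))^2)
    vimp_mat[b, j] <- perm_err - base_err # increase in OOB error
  }
}

toy_vimp <- colMeans(vimp_mat)            # average over all replicates
```

In this toy setting the importance of x1 should clearly exceed that of the noise variables, because permuting it destroys the only real association in the OOB data.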
Usage
icrsf(data, subject, testtimes, result, sensitivity, specificity, Xmat,
root.size, ntree, ns, node, pval = 1)
Arguments
data |
name of the data frame that contains the variables subject, testtimes, and result. |
subject |
vector of subject IDs, of length N**. |
testtimes |
vector of visit or test times, of length N**. |
result |
vector of binary diagnostic test results (0 = negative for event of interest; 1 = positive for event of interest), of length N**. |
sensitivity |
the sensitivity of the diagnostic test. |
specificity |
the specificity of the diagnostic test. |
Xmat |
an N x P matrix of covariates. |
root.size |
minimum number of subjects in a terminal node. |
ntree |
number of survival trees. |
ns |
number of covariates selected at each node to split the tree. |
node |
number of nodes to use for parallel computation. |
pval |
P-value threshold of the Likelihood Ratio Test. |
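As a rough illustration of the input layout described in the arguments above, the following sketch builds a toy visit-level data frame and a subject-level covariate matrix. The object names (toy_pheno, toy_X) and column names are hypothetical, and the sketch assumes that the rows of the covariate matrix correspond one-to-one with the distinct subjects; that correspondence is an assumption, not something this page states.

```r
# Hypothetical toy input: 3 subjects (N = 3) with 7 visits in total (N** = 7).
toy_pheno <- data.frame(
  ID     = c(1, 1, 1, 2, 2, 3, 3),   # subject ID, one row per visit
  time   = c(1, 2, 3, 1, 2, 1, 2),   # visit or test time
  result = c(0, 0, 1, 0, 0, 0, 1)    # 0 = negative, 1 = positive test result
)

# ASSUMPTION: one covariate row per subject (N x P, here 3 x 2).
toy_X <- matrix(c(0, 1, 2,
                  1, 0, 1),
                nrow = 3, ncol = 2,
                dimnames = list(NULL, c("snp1", "snp2")))

n_subjects <- length(unique(toy_pheno$ID))  # N
n_visits <- nrow(toy_pheno)                 # N**

# Basic consistency checks before passing data of this shape to a model
stopifnot(nrow(toy_X) == n_subjects)
stopifnot(all(toy_pheno$result %in% c(0, 1)))
```

The subject, testtimes, and result vectors all have one entry per visit, while the covariate matrix has one row per subject.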
Value
a vector of ensemble variable importance values, one for each of the P variables, from the modified random survival forest (icRSF).
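One common way to use such a vector is to rank the variables by importance. The sketch below uses made-up numbers in place of real icrsf() output, and it assumes that the entries follow the column order of the covariate matrix; both the values and the labels V1 to V4 are hypothetical.

```r
# Hypothetical importance values standing in for the output of icrsf().
vimp_example <- c(0.02, 1.35, -0.01, 0.40)

# ASSUMPTION: entries are in the same order as the covariate columns,
# so they can be labelled with (hypothetical) column names.
names(vimp_example) <- paste0("V", seq_along(vimp_example))

ranked <- sort(vimp_example, decreasing = TRUE)  # most important first
top2 <- names(ranked)[1:2]                       # two highest-ranked variables
```

Values near zero or below zero suggest that permuting the variable did not worsen the OOB fit in this toy example.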
Examples
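library(icRSF)
library(parallel)
# Load the example covariate matrix and the visit-level data
data(Xmat)
data(pheno)
# Grow a single tree on one node, treating the test as perfect
# (sensitivity = specificity = 1)
vimp <- icrsf(data = pheno, subject = ID, testtimes = time, result = result,
              sensitivity = 1, specificity = 1, Xmat = Xmat, root.size = 30,
              ntree = 1, ns = sqrt(ncol(Xmat)), node = 1, pval = 1)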