mlhighHet {highMLR} | R Documentation |
mlhighHet
Description
This function extracts features using a machine learning (ML) method, finds optimal cutoff values for those features using sequential Cox PH models, and obtains the most consistent level according to the cutoffs.
Usage
mlhighHet(cols, idSurv, idEvent, idFrail, num, fold = 3, data)
Arguments
cols |
A numeric vector of column numbers indicating the features for which the log-loss function is to be computed |
idSurv |
The name of the survival time variable |
idEvent |
The name of the survival event variable |
idFrail |
The name of the frailty variable |
num |
Number of features to be selected |
fold |
An integer denoting number of folds in cross validation, default value 3 |
data |
A data frame that contains the survival and covariate information for the subjects |
Details
Performs heterogeneity analysis in gene expression.
This function selects features by minimizing the log-loss function, using the Cox proportional hazards (PH) model as the learner on high-dimensional survival data. For the selected genes, optimal cutoff values are obtained by the minimum p-value approach in a Cox PH model. The Cox PH model is fitted sequentially for every combination of the selected genes, and the combination with the minimum BIC value is retained. The subjects are then classified according to the different levels of those genes. Using a Cox PH frailty model, the most consistent level is identified as the one for which the frailty variance is minimum. The data are split using cross-validation. The performance measure is the logarithmic loss function, defined as
L(f,t)=-log(f(t))
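As a rough illustration of the loss above, the following base R sketch evaluates L(f,t) for a few observed times. It assumes that f is a predicted density of the survival time and t is the observed time; the help page does not define either symbol. The names `log_loss`, `pred_density` and `obs_times` are hypothetical, and the exponential density is chosen only for illustration, not because the package uses it.

```r
# Logarithmic loss L(f, t) = -log(f(t)), where f is assumed to be a
# predicted survival-time density and t an observed time.
log_loss <- function(density_fn, t) {
  -log(density_fn(t))
}

# Hypothetical predicted density: exponential with rate 0.1 (illustration only).
pred_density <- function(t) dexp(t, rate = 0.1)

# Hypothetical observed times for three subjects.
obs_times <- c(2, 10, 25)

# Per-subject losses and their average; a lower average means the predicted
# density puts more mass near the observed times.
losses <- log_loss(pred_density, obs_times)
mean_loss <- mean(losses)
print(losses)
print(mean_loss)
```

Averaging the per-subject losses over the held-out fold is one common way to turn this into a single score for ranking features; whether mlhighHet aggregates in exactly this way is not stated here.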
The Cox PH frailty model is defined as
\lambda(t) = \lambda_0(t) \nu \exp\{X'\beta\}
where \nu is called the frailty. The variance of the frailty term is taken as a measure of the heterogeneity among the subjects or patients. A Gaussian distribution with mean 0 is assumed for the frailty component.
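The frailty model above can be sketched with the survival package, which is an assumption of this example: the help page does not say which fitting routine mlhighHet calls internally. The simulated data frame `sim`, its columns, and the variable `frailty_var` are hypothetical. In `survival::frailty()`, the Gaussian distribution applies to the random effect on the log-hazard scale, so the sketch treats \nu as exp(\omega) with \omega normal with mean 0.

```r
library(survival)

set.seed(1)

# Simulate clustered survival data: 30 groups of 10 subjects each, sharing
# a Gaussian random effect omega on the log-hazard scale.
n_groups  <- 30
per_group <- 10
id    <- rep(seq_len(n_groups), each = per_group)
omega <- rnorm(n_groups, mean = 0, sd = 0.7)
x     <- rbinom(n_groups * per_group, 1, 0.5)

# Hazard lambda_0 * exp(omega) * exp(x * beta) with lambda_0 = 0.1, beta = 0.5.
rate <- 0.1 * exp(0.5 * x + omega[id])
time <- rexp(length(rate), rate = rate)
cens <- rexp(length(rate), rate = 0.03)  # independent censoring times

sim <- data.frame(
  id    = id,
  x     = x,
  OS    = pmin(time, cens),
  Death = as.integer(time <= cens)
)

# Cox PH model with a Gaussian frailty term for the grouping variable.
fit <- coxph(
  Surv(OS, Death) ~ x + frailty(id, distribution = "gaussian"),
  data = sim
)

# Estimated variance of the random effect, read from the fit history.
frailty_var <- fit$history[[1]]$theta
print(frailty_var)
```

Fitting a model like this once per candidate level and comparing the resulting variances is one way to read "the most consistent level"; the level with the smallest estimated variance would be the one retained.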
Value
Data frames containing the optimal gene cutoff values and the most consistent level according to those cutoffs, together with the frailty variance.
Author(s)
Atanu Bhattacharjee, Gajendra K. Vishwakarma & Souvik Banerjee
References
Sonabend, R., Király, F. J., Bender, A., Bischl, B. and Lang, M. mlr3proba: An R Package for Machine Learning in Survival Analysis, 2021, Bioinformatics, <https://doi.org/10.1093/bioinformatics/btab039>
Bhattacharjee, A., Vishwakarma, G. K. and Banerjee, S. A modified risk detection approach of biomarkers by frailty effect on multiple time to event data, 2020, <arXiv:2012.02102>.
See Also
mlhighCox, mlhighFrail
Examples
## Not run:
data(hnscc)
mlhighHet(cols=c(27:32), idSurv="OS", idEvent="Death", idFrail="ID", num=2, fold = 3, data=hnscc)
## End(Not run)