mlhighHet {highMLR}    R Documentation

mlhighHet

Description

This function extracts features using a machine learning (ML) method, finds optimal cutoff values for those features using sequential Cox PH models, and obtains the most consistent level according to the cutoffs.

Usage

mlhighHet(cols, idSurv, idEvent, idFrail, num, fold = 3, data)

Arguments

cols

A numeric vector of column numbers indicating the features for which the log-loss functions are to be computed

idSurv

The name of the survival time variable

idEvent

The name of the survival event variable

idFrail

The name of the frailty variable

num

Number of features to be selected

fold

An integer giving the number of folds used in cross-validation; the default is 3

data

A data frame that contains the survival and covariate information for the subjects

Details

Performs heterogeneity analysis of gene expression data.

This function selects features by minimizing the log-loss function, using the Cox proportional hazards (PH) model as the learner on high-dimensional survival data. For the selected genes, optimal cutoff values are obtained by minimizing the p-value in a Cox PH model. The Cox PH model is fitted sequentially for each combination of genes, and all possible gene combinations are tested to find the combination with the minimum BIC value. The subjects are then classified according to the levels of those genes. Using a Cox PH frailty model, the most consistent level is identified as the one for which the frailty variance is minimum. The data are split using cross-validation. Performance is measured by the logarithmic loss function, which is defined as,

L(f,t)=-log(f(t))
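
As a small illustration of this loss, the sketch below evaluates L(f, t) = -log(f(t)) in base R. It assumes, purely for illustration, that f is a predicted density (here exponential with a known rate) and t an observed time; `log_loss`, `pred_density` and the toy values are hypothetical and are not part of highMLR.

```r
# Logarithmic loss L(f, t) = -log(f(t)) for a predicted density f
# evaluated at an observed time t.
# `log_loss` is a hypothetical helper, not a highMLR function.
log_loss <- function(density_fn, t) {
  -log(density_fn(t))
}

# Assumed predicted density: exponential with rate 0.5 (illustrative only).
pred_density <- function(t) dexp(t, rate = 0.5)

# Toy observed event times (made up for this sketch).
times <- c(0.5, 1, 2, 4)

# Per-subject losses and their mean; a smaller mean loss indicates
# predictions that place more density on the observed times.
losses <- log_loss(pred_density, times)
mean_loss <- mean(losses)
print(losses)
print(mean_loss)
```

Averaging the loss over held-out subjects gives one number per candidate model, which is the kind of quantity that can be compared across features.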

The CoxPH frailty model is defined as,

\lambda(t) = \lambda_0(t) \nu \exp\{X'\beta\}

where \nu is called the frailty. The variance of the frailty term is taken as a measure of the heterogeneity among the subjects (patients). The frailty component is assumed to follow a Gaussian distribution with mean 0.
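
The two modelling steps described here, a minimum p-value cutoff search followed by a Gaussian frailty fit, can be sketched with the survival package. This is not the internal code of mlhighHet; it is a minimal illustration that assumes the `kidney` data shipped with survival, uses `age` as a stand-in for a gene expression feature, and scans an arbitrary grid of quantile cutoffs.

```r
library(survival)  # provides coxph(), Surv(), frailty() and the kidney data

# Step 1 (illustrative assumption): scan candidate cutoffs for one covariate
# and keep the cutoff whose dichotomised Cox PH fit has the smallest p-value.
cut_candidates <- unique(as.numeric(
  quantile(kidney$age, probs = seq(0.2, 0.8, by = 0.1))
))
p_values <- sapply(cut_candidates, function(cp) {
  d <- transform(kidney, high = as.integer(age > cp))
  fit <- coxph(Surv(time, status) ~ high, data = d)
  summary(fit)$coefficients["high", "Pr(>|z|)"]
})
best_cut <- cut_candidates[which.min(p_values)]

# Step 2: classify subjects by the chosen cutoff and fit a Cox PH model
# with a Gaussian frailty term for the subject identifier.
kidney2 <- transform(kidney, high = as.integer(age > best_cut))
fit_frail <- coxph(
  Surv(time, status) ~ high + sex + frailty(id, distribution = "gaussian"),
  data = kidney2
)

# The printed fit reports the estimated variance of the random effect,
# which is the quantity interpreted as heterogeneity in the text above.
print(best_cut)
print(fit_frail)
```

Scanning many cutoffs and keeping the smallest p-value inflates the apparent significance, so the selected cutoff is best treated as exploratory unless it is validated on held-out data.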

Value

Data frames containing the optimal gene cutoff values and the most consistent level according to those cutoffs, together with the frailty variance.

Author(s)

Atanu Bhattacharjee, Gajendra K. Vishwakarma & Souvik Banerjee

References

Sonabend, R., Király, F. J., Bender, A., Bischl, B. and Lang, M. mlr3proba: An R Package for Machine Learning in Survival Analysis, 2021, Bioinformatics, <https://doi.org/10.1093/bioinformatics/btab039>

Bhattacharjee, A., Vishwakarma, G. K. and Banerjee, S. A modified risk detection approach of biomarkers by frailty effect on multiple time to event data, 2020, <arXiv:2012.02102>.

See Also

mlhighCox, mlhighFrail

Examples

## Not run: 
data(hnscc)
mlhighHet(cols = 27:32, idSurv = "OS", idEvent = "Death", idFrail = "ID", num = 2, fold = 3, data = hnscc)

## End(Not run)

[Package highMLR version 0.1.1 Index]