R: hdClust

hdClust {SurvHiDim}

R Documentation

hdClust

Description

Creates a network plot of high dimensional variables and lists those variables.

Usage

hdClust(m, n, siglevel, u, ID, OS, Death, PFS, Prog, data)

Arguments

`m`	Column number of the first high-dimensional variable to select.
`n`	Column number of the last high-dimensional variable to select.
`siglevel`	Significance level chosen by the user.
`u`	Factors of the event column (e.g. 0, 1 or 2), or the number of clusters to form.
`ID`	Column name of the subject ID, as a string, e.g. "id".
`OS`	Column name of the overall survival duration, as a string, e.g. "os".
`Death`	Column name of the survival (death) event, as a string, e.g. "death".
`PFS`	Column name of the progression-free survival duration, as a string, e.g. "pfs".
`Prog`	Column name of the progression event, as a string, e.g. "prog".
`data`	High-dimensional data set containing survival duration, event information, a time column for death cases, and observations on the covariates under study.
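As a rough illustration of how `m` and `n` delimit the block of high-dimensional variables, the sketch below picks columns by position from a small made-up data frame. The data frame `toy` and its column names are hypothetical and are not part of the package.

```r
# Hypothetical toy data: three bookkeeping columns followed by three variables.
toy <- data.frame(
  id    = 1:3,
  os    = c(10, 25, 7),
  death = c(1, 0, 1),
  g1    = c(0.2, 1.1, -0.4),
  g2    = c(1.5, 0.3, 0.9),
  g3    = c(-0.7, 0.8, 0.1)
)

m <- 4  # position of the first high-dimensional variable
n <- 6  # position of the last high-dimensional variable

# Names of the columns that fall inside the m:n range.
hd_vars <- names(toy)[m:n]
print(hd_vars)
```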

Details

Produces a network plot and lists the variables that show correlation.

The hdClust function first creates a new column 'Status' in the input data set and assigns each row a value of 0, 1 or 2. It assigns 0 when death (event) = 0, whether progression = 1 or progression = 0. It assigns 1 when progression = 1 and death = 1, and 2 when progression = 0 and death = 1.
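The status rule above can be sketched in base R as follows. This is only an illustration of the rule as described, not the package's internal code; the data frame `toy` and its column names are assumptions.

```r
# Hypothetical toy data covering all four progression/death combinations.
toy <- data.frame(
  id    = 1:4,
  prog  = c(1, 0, 1, 0),
  death = c(0, 0, 1, 1)
)

# Status = 0 when no death was observed, regardless of progression;
# Status = 1 when both progression and death were observed;
# Status = 2 when death was observed without progression.
toy$Status <- ifelse(toy$death == 0, 0,
                     ifelse(toy$prog == 1, 1, 2))

print(toy)
```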

Next, it creates two data sets. The first, 'deathdata', includes subjects with status 0 and 1, and a Cox PH model is applied to it. The second, 'compdata', includes subjects with status 0 and 2, and a Cox PH model is applied to it after replacing 2 with 1. From both subsets it then selects the study variables with P-value < siglevel (the significance level supplied by the user). It then merges the significant variables common to both subsets and creates a new data frame containing the columns 'ID', 'OS', 'Death', 'PFS', 'Prog', 'Status' and the observations on the common significant variables (which are presumed to lead to death given that they lead to progression of cancer, while accounting for competing risks). Finally, it lists the names of the common variables and, by default, writes the corresponding results in .csv format to the user's current working directory.
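The subsetting and filtering steps can be approximated with the survival package as in the sketch below. It assumes one univariate Cox PH model per variable and Wald p-values, which this page does not confirm; the simulated data, the helper `cox_pvalues` and the column names are hypothetical.

```r
library(survival)

# Simulated toy data; none of this comes from the package.
set.seed(1)
n_subj <- 60
toy <- data.frame(
  id     = seq_len(n_subj),
  os     = rexp(n_subj, rate = 0.1),
  Status = sample(c(0, 1, 2), n_subj, replace = TRUE),
  g1     = rnorm(n_subj),
  g2     = rnorm(n_subj),
  g3     = rnorm(n_subj)
)
vars <- c("g1", "g2", "g3")

# Subset by status; in the second subset, status 2 is recoded to 1.
deathdata <- toy[toy$Status %in% c(0, 1), ]
compdata  <- toy[toy$Status %in% c(0, 2), ]
compdata$Status[compdata$Status == 2] <- 1

# Hypothetical helper: fit one univariate Cox PH model per variable
# and return the Wald p-value of its coefficient.
cox_pvalues <- function(d, vars) {
  sapply(vars, function(v) {
    fit <- coxph(as.formula(paste("Surv(os, Status) ~", v)), data = d)
    summary(fit)$coefficients[1, "Pr(>|z|)"]
  })
}

siglevel <- 0.05
pA <- cox_pvalues(deathdata, vars)
pB <- cox_pvalues(compdata, vars)

# Keep the variables that are significant in both subsets.
cvar <- intersect(names(pA)[pA < siglevel], names(pB)[pB < siglevel])
commondata <- toy[, c("id", "os", "Status", cvar), drop = FALSE]
```

With purely random toy data, `cvar` will often be empty; the point is the flow from two status-based subsets to a shared list of significant variables.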

hdClust(m, n, siglevel, u, ID, OS, Death, PFS, Prog, data),

1) Subject ID column should be named as 'ID'.

2) OS column must be named as 'OS'.

3) Death status/event column should be named as 'Death'.

4) Progression Free Survival column should be named as 'PFS'.

5) Progression event column should be named as 'Prog'.

deathdata - A data frame with a status column that includes only those rows/subjects for which death/event was or was not observed, given that progression was or was not observed.

compdata - A data frame with a status column that includes those rows/subjects who died given that progression was observed.

data1variables - list of variables/genes from deathdata

data2variables - list of variables/genes from compdata

siginificantpvalueA - A data frame with estimate values, HR, Pvalue, etc. of significant variables from deathdata.

siginificantpvalueB - A data frame with estimate values, HR, Pvalue, etc. of significant variables from compdata.

commongenes - A data frame consisting of observations on common significant study variables.

cvar - List of common significant study variables.

commondata - The final output data, consisting of survival information and observations on common significant study variables.

By default, the function stores the output as .csv files in the user's current working directory.

It also creates a cluster plot of variables with similar behavior.

Value

A list containing variable names and the correlation values.
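How the package computes the correlation values is not described here. As a sketch only, assuming Pearson correlation via base R's `cor()`, pairs of variables and their correlations could be listed like this; the simulated data and the name `cor_pairs` are hypothetical.

```r
# Simulated toy data; g2 is built to correlate strongly with g1.
set.seed(42)
x <- rnorm(50)
vals <- data.frame(
  g1 = x,
  g2 = x + rnorm(50, sd = 0.1),
  g3 = rnorm(50)
)

# Pearson correlation matrix of the variables.
cmat <- cor(vals)

# List each pair once by walking the upper triangle of the matrix.
idx <- which(upper.tri(cmat), arr.ind = TRUE)
cor_pairs <- data.frame(
  var1        = rownames(cmat)[idx[, "row"]],
  var2        = colnames(cmat)[idx[, "col"]],
  correlation = cmat[idx]
)

# Strongest correlations first.
cor_pairs <- cor_pairs[order(-abs(cor_pairs$correlation)), ]
print(cor_pairs)
```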

Author(s)

Atanu Bhattacharjee and Akash Pawar

References

Bhattacharjee, A. (2020). Bayesian Approaches in Oncology Using R and OpenBUGS. CRC Press.

Congdon, P. (2014). Applied Bayesian Modelling (Vol. 595). John Wiley & Sons.

Banerjee, S., Vishwakarma, G. K., & Bhattacharjee, A. (2019). Classification Algorithm for High Dimensional Protein Markers in Time-course Data. arXiv preprint arXiv:1907.12853.

Examples

##
data(hnscc)
hdClust(7, 105, 0.05, 2, ID = "id", OS = "os", Death = "death", PFS = "pfs", Prog = "prog", data = hnscc)
##

[Package SurvHiDim version 0.1.1 Index]