R: Cross-validation for sequentially increasing taxa

CVSITaxa {MicrobiomeSurv}

R Documentation

Cross-validation for sequentially increasing taxa

Description

This function performs cross-validation for the taxon-by-taxon analysis while sequentially increasing the number of taxa, as specified by `Top`.

Usage

CVSITaxa(
  Object,
  Top = seq(5, 100, by = 5),
  Survival,
  Censor,
  Prognostic = NULL
)

Arguments

`Object`	An object of class `cvmm`.
`Top`	A numeric vector giving the numbers of top k taxa to be used. Default is `seq(5, 100, by = 5)`.
`Survival`	A vector of survival times with length equal to the number of subjects.
`Censor`	A vector of censoring indicators.
`Prognostic`	A data frame containing possible prognostic factor(s) and/or treatment effect to be used in the model.

Details

This function is a cross-validation version of the function SITaxa. It first processes the cross-validation results of the taxon-by-taxon analysis and then sequentially considers the top k taxa. For each set, it recomputes the first PCA or PLS component on the training data and estimates risk scores on both the training and test data, using only the microbiome matrix restricted to the top k taxa. Patients are then classified as low or high risk based on the test data, with the mean of the risk score as the cutoff. The process is repeated for each set of top k taxa.
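
The steps described above can be sketched for a single train/test split and a single value of k. This is a minimal illustration on simulated data, not the package's internal code: the taxa ranking, the split sizes, the use of `prcomp` for the PCA step, and all variable names are assumptions made for the example.

```r
library(survival)  # recommended package distributed with R

set.seed(1)
n_subj <- 60
n_taxa <- 8

# Simulated taxa-by-subjects matrix (rows = taxa); purely illustrative data
micro <- matrix(rnorm(n_taxa * n_subj), nrow = n_taxa,
                dimnames = list(paste0("taxon", seq_len(n_taxa)), NULL))
surv_time <- rexp(n_subj, rate = 0.1)
censor <- rbinom(n_subj, 1, 0.7)

# Hypothetical ranking of taxa; in the package this would come from
# the cross-validated taxon-by-taxon results
ranked <- rownames(micro)
k <- 3
top_k <- ranked[seq_len(k)]

# One random train/test split (sizes are an assumption)
train <- sample(n_subj, 40)
test <- setdiff(seq_len(n_subj), train)

# First principal component, fitted on the training subjects only
pca <- prcomp(t(micro[top_k, train]), center = TRUE, scale. = TRUE)
pc1_train <- predict(pca, t(micro[top_k, train]))[, 1]
pc1_test <- predict(pca, t(micro[top_k, test]))[, 1]

# Cox model on the training data supplies the coefficient for the risk score
fit_train <- coxph(Surv(surv_time[train], censor[train]) ~ pc1_train)
risk_test <- coef(fit_train)[1] * pc1_test

# Classify test subjects with the mean test risk score as cutoff
group <- factor(ifelse(risk_test > mean(risk_test), "high", "low"),
                levels = c("low", "high"))

# Hazard ratio (high vs. low risk) and its 95% CI on the test data
fit_test <- coxph(Surv(surv_time[test], censor[test]) ~ group)
hr <- summary(fit_test)$conf.int[1, c("exp(coef)", "lower .95", "upper .95")]
hr
```

In the function itself this is repeated for every value in `Top` and every cross-validation, which is why the results are stored in 3-way arrays.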

Value

An object of class `cvsit` is returned with the following values:

`HRpca`	A 3-way array in which the first, second, and third dimensions correspond to the number of taxa, the hazard ratio information (estimated HR, LowerCI, and UpperCI), and the number of cross-validations, respectively. It contains the estimated HR on the test data when the dimension reduction method is PCA.
`HRpls`	A 3-way array in which the first, second, and third dimensions correspond to the number of taxa, the hazard ratio information (estimated HR, LowerCI, and UpperCI), and the number of cross-validations, respectively. It contains the estimated HR on the test data when the dimension reduction method is PLS.
`Ntaxa`	The number of taxa in the reduced matrix.
`Ncv`	The number of cross-validations performed.
`Top`	The sequence of top k taxa considered. Default is `Top = seq(5, 100, by = 5)`.
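
To make the layout of `HRpca` and `HRpls` concrete, the sketch below builds a mock array with the same three dimensions and shows how it might be indexed. The values are simulated, and the dimension names (`Top1`, `HR`, `LowerCI`, `UpperCI`) are assumptions chosen for readability; the names carried by the real object may differ.

```r
Top <- seq(1, 6, by = 2)   # three taxa sets: top 1, 3, and 5
Ncv <- 10                  # number of cross-validations

set.seed(42)
# Mock array: taxa sets x HR information x cross-validations
HR <- array(NA_real_,
            dim = c(length(Top), 3, Ncv),
            dimnames = list(paste0("Top", Top),
                            c("HR", "LowerCI", "UpperCI"),
                            NULL))
HR[, "HR", ] <- exp(rnorm(length(Top) * Ncv, sd = 0.3))  # simulated HRs
HR[, "LowerCI", ] <- HR[, "HR", ] * 0.6                  # simulated bounds
HR[, "UpperCI", ] <- HR[, "HR", ] * 1.6

# Median estimated HR across cross-validations, one value per taxa set
median_hr <- apply(HR[, "HR", ], 1, median)
median_hr

# HR, LowerCI, and UpperCI for the second taxa set in the first cross-validation
HR[2, , 1]
```

Summarising along the third dimension, as with `median_hr`, is a common way to compare taxa sets across cross-validations.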

Author(s)

Thi Huyen Nguyen, thihuyen.nguyen@uhasselt.be

Olajumoke Evangelina Owokotomo, olajumoke.x.owokotomo@gsk.com

Ziv Shkedy

Examples

# Prepare data
data(Week3_response)
Week3_response = data.frame(Week3_response)
surv_fam_shan_w3 = data.frame(cbind(as.numeric(Week3_response$T1Dweek),
                                    as.numeric(Week3_response$T1D)))
colnames(surv_fam_shan_w3) = c("Survival", "Censor")
prog_fam_shan_w3 = data.frame(factor(Week3_response$Treatment_new))
colnames(prog_fam_shan_w3) = c("Treatment")
data(fam_shan_trim_w3)
names_fam_shan_trim_w3 =
  c("Unknown", "Lachnospiraceae", "S24.7", "Lactobacillaceae",
    "Enterobacteriaceae", "Rikenellaceae")
fam_shan_trim_w3 = data.matrix(fam_shan_trim_w3[ ,2:82])
rownames(fam_shan_trim_w3) = names_fam_shan_trim_w3
# Getting the cvmm object
CVCox_taxon_fam_shan_w3 = CVMSpecificCoxPh(Fold=3,
                                           Survival = surv_fam_shan_w3$Survival,
                                           Micro.mat = fam_shan_trim_w3,
                                           Censor = surv_fam_shan_w3$Censor,
                                           Reduce=TRUE,
                                           Select=5,
                                           Prognostic=prog_fam_shan_w3,
                                           Mean = TRUE,
                                           Ncv=10)

# Using the function
CVSITaxa_fam_shan_w3 = CVSITaxa(Object = CVCox_taxon_fam_shan_w3,
                                Top = seq(1, 6, by = 2),
                                Survival = surv_fam_shan_w3$Survival,
                                Censor = surv_fam_shan_w3$Censor,
                                Prognostic = prog_fam_shan_w3)

# Get the class of the object
class(CVSITaxa_fam_shan_w3)     # A "cvsit" class

[Package MicrobiomeSurv version 0.1.0 Index]