gene_selection_surv {GSSTDA}    R Documentation

Gene selection based on variability and the relationship to survival.

Description

Selects genes for the Mapper algorithm. Each gene is scored by the product of two terms: the standard deviation of its row in the disease component matrix plus one, and the Z score obtained by fitting a Cox proportional hazards model to the expression level of that gene. For further information see "Topology based data analysis identifies a subgroup of breast cancers with a unique mutational profile and excellent survival".
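The scoring rule can be illustrated with a minimal base R sketch. It assumes the description is read as (standard deviation + 1) multiplied by the Z score. The toy matrix, the z_score vector, and all variable names are made up for illustration and are not part of GSSTDA; in practice the Z scores would come from cox_all_genes.

```r
# Toy disease component matrix: rows are genes, columns are disease samples.
# All values are invented for illustration.
disease_component <- matrix(
  c(1, 2, 3, 4,
    5, 5, 5, 5,
    0, 4, 0, 4),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("geneA", "geneB", "geneC"), paste0("s", 1:4))
)

# Hypothetical per-gene Cox Z scores (assumed values, not real output).
z_score <- c(geneA = 2.0, geneB = -3.0, geneC = 0.5)

# Per-gene variability: standard deviation of each row.
gene_sd <- apply(disease_component, 1, stats::sd)

# Score under the assumed reading: (row standard deviation + 1) * Z score.
gene_score <- (gene_sd + 1) * z_score[rownames(disease_component)]
gene_score
```

Adding one to the standard deviation means a gene with no variability across samples (geneB here) keeps its Z score as its score instead of being forced to zero.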

Usage

gene_selection_surv(
  case_disease_component,
  cox_all_matrix,
  gen_select_type,
  num_gen_select
)

Arguments

case_disease_component

Disease component matrix (output of the function generate_disease_component), restricted to the columns belonging to disease samples. The row names must be the gene names.

cox_all_matrix

Output of the cox_all_genes function: a data.frame with information on the relationship between each gene and survival.

gen_select_type

Selection mode. Use "Abs" to choose the genes with the highest absolute product value, or "Top_Bot" to take half of the selected genes from those with the highest product value (positive, i.e. worst survival prognosis) and the other half from those with the lowest product value (negative, i.e. best prognosis).

num_gen_select

Number of genes to be selected (those with the highest product value).

Value

Character vector with the names of the selected genes.
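
The two selection modes can be sketched in base R as follows. This is an illustrative reimplementation under stated assumptions, not the package's code: select_genes_sketch is a hypothetical helper, the scores are invented, and the handling of an odd num_gen_select (the remainder is dropped) is an assumption made for this sketch only.

```r
# Hypothetical helper, not part of GSSTDA: returns gene names chosen from a
# named numeric vector of per-gene scores.
select_genes_sketch <- function(gene_score, gen_select_type, num_gen_select) {
  if (gen_select_type == "Abs") {
    # Rank by absolute score, largest first, and keep the first num_gen_select.
    ranked <- names(sort(abs(gene_score), decreasing = TRUE))
    return(head(ranked, num_gen_select))
  }
  if (gen_select_type == "Top_Bot") {
    # Assumption: an odd num_gen_select is rounded down to an even count.
    half <- num_gen_select %/% 2
    ranked <- names(sort(gene_score, decreasing = TRUE))
    top <- head(ranked, half)     # highest scores (worst prognosis)
    bottom <- tail(ranked, half)  # lowest scores (best prognosis)
    return(c(top, bottom))
  }
  stop("gen_select_type must be \"Abs\" or \"Top_Bot\"")
}

# Invented scores for six genes.
scores <- c(g1 = 5.0, g2 = -4.0, g3 = 3.0, g4 = -0.5, g5 = 0.2, g6 = -6.0)

abs_selection <- select_genes_sketch(scores, "Abs", 2)
top_bot_selection <- select_genes_sketch(scores, "Top_Bot", 4)
```

With these scores, "Abs" picks the two genes with the largest magnitude regardless of sign, while "Top_Bot" picks two genes from each end of the ranking.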


[Package GSSTDA version 1.0.0 Index]