PrioSubtypeDrug {SubtypeDrug}

R Documentation

Prioritization of candidate cancer subtype-specific drugs (PrioSubtypeDrug)

Description

Integrates drug, gene, and subpathway data to identify drugs specific to cancer subtypes.

Usage

PrioSubtypeDrug(
  expr,
  input.cls = "",
  control.label = "",
  subpathway.list,
  spw.min.sz = 10,
  spw.max.sz = Inf,
  spw.score.method = "gsva",
  kcdf = "Gaussian",
  drug.spw.data,
  drug.spw.p.val.th = 0.05,
  drug.spw.min.sz = 10,
  drug.spw.max.sz = Inf,
  weighted.drug.score = TRUE,
  nperm = 1000,
  parallel.sz = 1,
  E_FDR = 0.05,
  S_FDR = 0.001
)

Arguments

`expr`	Matrix of gene expression values (rows are genes, columns are samples).
`input.cls`	Input file, in CLS format, giving the subtype class of each sample.
`control.label`	Label of the control samples in the CLS file given by 'input.cls'.
`subpathway.list`	A list. The subpathway list data mined from KEGG is stored in the package 'SubtypeDrugData' and can be downloaded from https://github.com/hanjunwei-lab/SubtypeDrugData. The gene identifiers used in the subpathway list data should be consistent with those in the gene expression profile. The package 'SubtypeDrugData' provides two versions, one using Entrez IDs and one using gene symbols. Users can also supply their own pathway or gene set list data.
`spw.min.sz`	Removes subpathways that contain fewer genes than 'spw.min.sz' (default: 10).
`spw.max.sz`	Removes subpathways that contain more genes than 'spw.max.sz' (default: Inf).
`spw.score.method`	Method to employ in the estimation of subpathway enrichment scores per sample. By default this is set to 'gsva' (Hänzelmann et al, 2013); the other option is 'ssgsea' (Barbie et al, 2009).
`kcdf`	Character string denoting the kernel to use during the non-parametric estimation of the cumulative distribution function of expression levels across samples when 'spw.score.method="gsva"'. By default, 'kcdf="Gaussian"' which is suitable when input expression values are continuous, such as microarray fluorescent units in logarithmic scale, RNA-seq log-CPMs, log-RPKMs or log-TPMs. When input expression values are integer counts, such as those derived from RNA-seq experiments, then this argument should be set to 'kcdf="Poisson"'.
`drug.spw.data`	A list of drug regulation data. The drug subpathway association data we constructed is stored in the package 'SubtypeDrugData' and can be downloaded from https://github.com/hanjunwei-lab/SubtypeDrugData. User-defined drug regulation data should be a list with one element per drug. Each drug element contains a 'Target_upregulation' and a 'Target_downregulation' subpathway or gene set. Every subpathway or gene set referenced in the drug regulation data should exist in the data passed to 'subpathway.list'.
`drug.spw.p.val.th`	Used only when 'drug.spw.data="DrugSpwData"'. Drug up- and down-regulated subpathways are screened using the significance threshold on the P value set by 'drug.spw.p.val.th' (default: 0.05).
`drug.spw.min.sz`	A numeric. The drug-regulated subpathways are intersected with the subpathways in the subpathway activity profile. Drugs with fewer than 'drug.spw.min.sz' (default: 10) up- or down-regulated subpathways are then removed.
`drug.spw.max.sz`	A numeric. Analogous to 'drug.spw.min.sz': drugs with more than 'drug.spw.max.sz' (default: Inf) up- or down-regulated subpathways are removed.
`weighted.drug.score`	A logical value that determines the method for calculating the normalized drug-disease reverse association score of each drug for each sample. 'weighted.drug.score=TRUE' (default): the KS random walk statistic, weighted by the individualized subpathway activity aberrance score, is used to calculate the normalized drug-disease reverse association score. 'weighted.drug.score=FALSE': similar to 'CMap' (Lamb et al., 2006), no weight is used, and the normalized drug-disease reverse association score is calculated from the rank of the individualized subpathway activity aberrance score.
`nperm`	Number of random permutations (default: 1000).
`parallel.sz`	Number of processors to use when doing the calculations in parallel (default: 1). If 'parallel.sz=0', all available cores are used; set a smaller positive number to limit the number of cores.
`E_FDR`	Significance threshold for E_FDR for drugs (default: 0.05).
`S_FDR`	Significance threshold for S_FDR for drugs (default: 0.001).
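The structure expected for user-defined 'subpathway.list' and 'drug.spw.data' can be sketched as follows. This is a minimal illustration, not the package's own data: the set names, gene names and drug names are hypothetical placeholders, and representing each 'Target_upregulation' and 'Target_downregulation' component as a character vector of set names is an assumption based on the argument descriptions above. Inspecting the bundled example objects with 'str()' is the reliable way to confirm the actual layout.

```r
# Hypothetical gene sets; names and genes are placeholders, not real pathways.
user_gs <- list(
  SetA = c("G1", "G2", "G3"),
  SetB = c("G2", "G4"),
  SetC = c("G5", "G6", "G7")
)

# One element per drug, each holding the two named components described above.
# Assumption: each component is a character vector of gene set names.
user_ds <- list(
  DrugX = list(Target_upregulation   = c("SetA"),
               Target_downregulation = c("SetB", "SetC")),
  DrugY = list(Target_upregulation   = c("SetC"),
               Target_downregulation = c("SetA"))
)

# Every set referenced by a drug should exist in the gene set list.
referenced   <- unique(unlist(user_ds, use.names = FALSE))
missing_sets <- setdiff(referenced, names(user_gs))
stopifnot(length(missing_sets) == 0)
```

Running a consistency check of this kind before calling the function makes it easier to spot drug entries that point at sets absent from the gene set list.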

Details

PrioSubtypeDrug

First, the function PrioSubtypeDrug uses the 'gsva' or 'ssgsea' method to convert the disease gene expression profile into a subpathway activity profile. The parameters 'subpathway.list', 'spw.min.sz' and 'spw.max.sz' are used to process the subpathway list data, while 'spw.score.method' and 'kcdf' control how the subpathway activity score profile is constructed. An individualized subpathway activity aberrance score is then estimated using the mean and standard deviation of the Control samples, and the subpathways of each cancer sample are ordered into a ranked list according to this score.

Next, the normalized drug-disease reverse association score is calculated by enriching the drug-regulated subpathway tags in the subpathway ranked list. Enriching all drug-regulated subpathways in each cancer sample yields a normalized drug-disease reverse association score matrix. The parameters 'drug.spw.p.val.th', 'drug.spw.min.sz' and 'drug.spw.max.sz' are used to screen the drug-regulated subpathway sets. If user-defined drug targeting data is used, the 'Target_upregulation' and 'Target_downregulation' sets of each drug should already be defined in the data. The parameter 'weighted.drug.score' controls the method of calculating the normalized drug-disease reverse association score.

Finally, an empirical sample-based permutation test procedure is used to identify significant cancer subtype-specific drugs.

For data containing only cancer and Control samples, the subpathways are ranked according to the difference in activity between cancer and Control samples. The drug up- and down-regulated subpathway sets are then enriched in this ranked list of subpathways to evaluate the normalized drug-disease reverse association score, and a subpathway-based permutation test procedure is used to calculate significance.

The subpathway list data and drug subpathway association data are stored in the package 'SubtypeDrugData' and can be obtained from https://github.com/hanjunwei-lab/SubtypeDrugData.
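The aberrance scoring and ranking step can be illustrated with a small base R sketch. The exact formula used inside the package is not stated on this page; a per-subpathway z-score against the Control samples is an assumption consistent with "using the mean and standard deviation of the Control samples". The toy matrix, sample labels and object names below are all hypothetical.

```r
set.seed(1)
# Toy activity matrix: 5 subpathways (rows) x 6 samples (columns).
act <- matrix(rnorm(30), nrow = 5,
              dimnames = list(paste0("spw", 1:5), paste0("s", 1:6)))
labels <- c("Control", "Control", "Control", "Basal", "Basal", "LumA")

# Mean and standard deviation of each subpathway across Control samples.
is_ctrl   <- labels == "Control"
ctrl_mean <- rowMeans(act[, is_ctrl, drop = FALSE])
ctrl_sd   <- apply(act[, is_ctrl, drop = FALSE], 1, sd)

# Assumed aberrance score: z-score of each non-control sample
# relative to the Control distribution (vectors recycle down the rows).
aberrance <- (act[, !is_ctrl, drop = FALSE] - ctrl_mean) / ctrl_sd

# Ranked subpathway list per sample, highest aberrance score first.
ranked <- apply(aberrance, 2, function(z) names(sort(z, decreasing = TRUE)))
```

Each column of 'ranked' is the ordered subpathway list for one cancer sample, which is the kind of list the drug-regulated subpathway sets would then be enriched against.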

Value

A list containing the result table of drug scoring and significance, a subpathway activity score matrix, a normalized drug-disease reverse association score matrix, sample information, and the parameter settings chosen by the user.

Author(s)

Xudong Han, Junwei Han, Chonghui Liu

Examples

require(GSVA)
require(parallel)
## Get simulated breast cancer gene expression profile data.
Geneexp<-get("Geneexp")
## Obtain sample subtype data and calculate breast cancer subtype-specific drugs.
Subtype<-system.file("extdata", "Subtype_labels.cls", package = "SubtypeDrug")

## The subpathway list data and drug subpathway association data
## are stored in the package `SubtypeDrugData`,
## which has been uploaded to a GitHub repository.
## If these data are needed, users can download and install the package
## with the `install_github` function, passing the repository
## "hanjunwei-lab/SubtypeDrugData".
## After installing and loading the package `SubtypeDrugData`,
## users can use the following commands to get the data.
## Get subpathway list data.
## If the gene expression profile uses gene symbols:
## data(SpwSymbolList)
## If the gene expression profile uses Entrez IDs:
## data(SpwEntrezidList)
## Get drug subpathway association data.
## data(DrugSpwData)

## Identify breast cancer subtype-specific drugs.
## Subtype_drugs<-PrioSubtypeDrug(Geneexp,Subtype,"Control",SpwSymbolList,drug.spw.data=DrugSpwData,
##                                       E_FDR=1,S_FDR=1)

## Identify breast cancer-related drugs in only two types of samples: breast cancer and control.
Cancer<-system.file("extdata", "Cancer_normal_labels.cls", package = "SubtypeDrug")
## Disease_drugs<-PrioSubtypeDrug(Geneexp,Cancer,"Control",SpwSymbolList,drug.spw.data=DrugSpwData,
##                                       E_FDR=1,S_FDR=1)

## The function PrioSubtypeDrug() can also support user-defined data.
Geneexp<-get("GeneexpT")
## User-defined drug regulation data should resemble the structure below
UserDS<-get("UserDST")
str(UserDS)
## Need to load gene set data consistent with drug regulation data.
UserGS<-get("UserGST")
str(UserGS)
Drugs<-PrioSubtypeDrug(Geneexp,Cancer,"Control",UserGS,spw.min.sz=1,
                       drug.spw.data=UserDS,drug.spw.min.sz=1,
                       nperm=10,E_FDR=1,S_FDR=1)

[Package SubtypeDrug version 0.1.9 Index]