markrank {Corbi} | R Documentation |
MarkRank
Description
MarkRank is a network-based model that identifies cooperative biomarkers for the diagnosis of heterogeneous complex diseases.
Usage
markrank(
dataset,
label,
adj_matrix,
alpha = 0.8,
lambda = 0.2,
eps = 1e-10,
E_value = NULL,
trace = TRUE,
d = Inf,
Given_NET2 = NULL
)
Arguments
dataset |
The microarray expression matrix of related disease. Each row represents a sample and each column represents a gene. |
label |
The 0-1 binary phenotype vector of the dataset samples. The length of label must equal the number of samples (rows) in dataset. |
adj_matrix |
The 0-1 binary adjacency matrix of a connected biological network. The nodes must be in the same order as the genes (columns) of the expression matrix. |
alpha |
The convex combination coefficient between the network effect and the prior information vector. |
lambda |
In the random walk-based iteration, matrix A1 reflects the structure information of the
biological network, whereas matrix A2 reflects the cooperative effect of gene combinations.
Parameter lambda is the convex combination coefficient of two network effects. The range of lambda is
in |
eps |
The stopping criterion for the iterative solution method. The default value is 1e-10. |
E_value |
A vector containing the prior information about the importance of nodes. Default is the absolute Pearson correlation coefficient (PCC). |
trace |
Logical variable indicating whether tracing information on the progress of the gene cooperation network construction is produced. |
d |
Threshold for simplifying the G_2 computation. Only the gene pairs whose shortest distances in PPI network are less than d participate in the G_2 computation. The default value is Inf. |
Given_NET2 |
An optional, previously computed cooperation network, supplied to avoid recomputation when tuning parameters. See Details for a more specific description. |
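The argument descriptions above imply a few shape constraints that are easy to check before calling markrank. The following is a minimal sketch in base R, assuming toy data: the gene and sample names, the path-shaped network, and the helper variable inputs_ok are all hypothetical and only illustrate the expected layout. The network is built as undirected (symmetric), which is an assumption of this sketch and not something the documentation states.

```r
# Hypothetical toy inputs: 6 samples, 4 genes (all names are illustrative only).
set.seed(1)
genes <- c("g1", "g2", "g3", "g4")

# Expression matrix: one row per sample, one column per gene.
dataset <- matrix(rnorm(6 * 4), nrow = 6, ncol = 4,
                  dimnames = list(paste0("s", 1:6), genes))

# 0-1 phenotype vector, one entry per sample (row) of dataset.
label <- c(0, 0, 0, 1, 1, 1)

# 0-1 adjacency matrix of a connected network (here the path g1-g2-g3-g4),
# with rows and columns in the same gene order as the columns of dataset.
# Assumption: the network is undirected, so the matrix is made symmetric.
adj_matrix <- matrix(0, 4, 4, dimnames = list(genes, genes))
adj_matrix[cbind(1:3, 2:4)] <- 1
adj_matrix <- adj_matrix + t(adj_matrix)

# Basic consistency checks derived from the argument descriptions.
inputs_ok <- length(label) == nrow(dataset) &&
  all(label %in% c(0, 1)) &&
  all(adj_matrix %in% c(0, 1)) &&
  nrow(adj_matrix) == ncol(dataset) &&
  ncol(adj_matrix) == ncol(dataset) &&
  identical(colnames(dataset), colnames(adj_matrix))
stopifnot(inputs_ok)

# With Corbi installed, the call would then follow the Usage signature:
# result <- markrank(dataset, label, adj_matrix)
```

Checking the gene order explicitly is worthwhile because a mismatch between the columns of dataset and the nodes of adj_matrix would not necessarily raise an error but would silently pair genes with the wrong network neighbours.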
Details
MarkRank is a network-based biomarker identification method that prioritizes disease genes by integrating multi-source information, including the biological network (e.g., a protein-protein interaction (PPI) network), the prior information about related diseases, and the discriminative power of cooperative gene combinations. MarkRank shows that explicit modeling of gene cooperative effects can greatly improve biomarker identification for complex diseases, especially for diseases with high heterogeneity.
The MarkRank algorithm consists of two main steps: 1) the construction of the gene cooperation network G_2, and 2) a random-walk based iteration procedure. The following notes should make markrank more convenient to use:
1) For the construction of the gene cooperation network, we suggest setting trace=TRUE to output the progress of the G_2 computation. The G_2 construction step is finished when the output number equals the number of genes in the input expression matrix. The parameter d introduces the structure information of the biological network to simplify the construction of G_2: only the gene pairs whose shortest distances in the network are less than d participate in the G_2 computation. We suggest d=Inf, the default value, to make full use of the information in the expression matrix. If the user supplies a preset d, the distance matrix of the input network, dis, will be returned.
2) MarkRank uses a random-walk based iteration procedure to score each gene. The detailed formula is:
score = alpha*[lambda*A1 + (1-lambda)*A2]*score + (1-alpha)*E_value.
Users can choose parameter settings appropriate to their practical application. Our suggested values are alpha=0.8 and lambda=0.2. The model input parameter combination and the iteration steps are returned in the output components initial_pars and steps, respectively. Because the iteration step is separate from the cooperation network construction, the user can use the parameter Given_NET2 to tune the model parameters. Specifically, the user can set Given_NET2 = result$NET2 in the markrank input to avoid repeated computation of G_2, where the object result is the value returned by a previous call to markrank.
3) The final MarkRank score for each gene is in the output component score. Users can sort this result and use the top-ranked genes for further analysis.
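The scoring formula in step 2 can be read as a fixed-point iteration. The sketch below, in base R, is an illustration under stated assumptions and not the package's actual implementation: the function names iterate_scores and col_normalize, the max_steps cap, the column normalization of A1 and A2, and the use of the summed absolute change as the convergence measure compared against eps are all hypothetical choices made here.

```r
# Fixed-point iteration for
#   score = alpha*[lambda*A1 + (1-lambda)*A2]*score + (1-alpha)*E_value
# Assumptions (not taken from the package): A1 and A2 are column-normalized,
# and convergence is measured by the summed absolute change between steps.
iterate_scores <- function(A1, A2, E_value, alpha = 0.8, lambda = 0.2,
                           eps = 1e-10, max_steps = 10000) {
  A <- lambda * A1 + (1 - lambda) * A2   # combined network effect
  score <- E_value                       # start from the prior information
  for (step in seq_len(max_steps)) {
    new_score <- as.vector(alpha * (A %*% score) + (1 - alpha) * E_value)
    delta <- sum(abs(new_score - score))
    score <- new_score
    if (delta < eps) break               # stop once the update is negligible
  }
  list(score = score, steps = step)
}

# Hypothetical helper: divide each column by its sum.
col_normalize <- function(M) sweep(M, 2, colSums(M), "/")

# Toy 3-gene example with made-up matrices and prior vector.
A1 <- col_normalize(matrix(c(0, 1, 0,  1, 0, 1,  0, 1, 0), 3, 3))
A2 <- col_normalize(matrix(c(0, 2, 1,  2, 0, 3,  1, 3, 0), 3, 3))
E_value <- c(0.5, 0.3, 0.2)

res <- iterate_scores(A1, A2, E_value)

# With Corbi installed, tuning can reuse the cooperation network as described
# in step 2 (the alpha value here is arbitrary):
# result  <- markrank(dataset, label, adj_matrix)
# result2 <- markrank(dataset, label, adj_matrix, alpha = 0.7,
#                     Given_NET2 = result$NET2)
```

Under the column-normalization assumption, each update shrinks the difference between successive score vectors by at least a factor of alpha, so the loop terminates for any alpha below 1. This also shows why the iteration is cheap to repeat with different alpha and lambda once G_2 is available.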
Value
This function will return a list with the following components:
score |
The vector of final MarkRank scores for each gene. |
steps |
The number of iteration steps taken by the random-walk based scoring procedure. |
NET2 |
The weighted adjacency matrix of the gene cooperation network. |
initial_pars |
The initial/input parameter values used in MarkRank. |
dis |
The pairwise distance matrix of input network. This variable will be |
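As a rough illustration of using the score component, the sketch below sorts a score vector and keeps the top-ranked genes. The values are made up and merely stand in for result$score. Whether the returned vector carries gene names is an assumption here; if it does not, order(score, decreasing = TRUE) would give the corresponding column positions in dataset instead.

```r
# Hypothetical scores standing in for result$score (values are illustrative).
score <- c(g1 = 0.10, g2 = 0.40, g3 = 0.15, g4 = 0.35)

# Sort in decreasing order and keep the top k genes.
k <- 2
ranked <- sort(score, decreasing = TRUE)
top_genes <- names(ranked)[seq_len(min(k, length(ranked)))]
```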
References
Duanchen Sun, Xianwen Ren, Eszter Ari, Tamas Korcsmaros, Peter Csermely, Ling-Yun Wu. Discovering cooperative biomarkers for heterogeneous complex disease diagnoses. Briefings in Bioinformatics, 20(1), 89–101, 2019.