tpr.dag {HEMDAG}    R Documentation
TPR-DAG ensemble variants
Description
Collection of the true-path-rule-based hierarchical learning ensemble algorithms and their variants.
TPR-DAG is a family of algorithms whose members differ in the bottom-up step adopted for the selection of the
positive children (or descendants) and in the top-down step adopted to assure ontology-consistent predictions.
Indeed, in their most general form the TPR-DAG algorithms adopt a two-step learning strategy:
in the first step they compute a per-level bottom-up visit from the leaves to the root to propagate positive predictions across the hierarchy;
in the second step they compute a per-level top-down visit from the root to the leaves to assure the consistency of the predictions.
It is worth noting that the levels (in both the first and the second step) are defined in terms of the maximum distance from
the root node (see graph.levels).
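For instance, the per-level partition of the DAG used by both visits can be inspected directly. A minimal sketch, assuming the data and functions of the HEMDAG package shown in the Examples below:

library(HEMDAG);
data(graph);                          ## loads the graphNEL object g
root <- root.node(g);                 ## name of the root class
lev <- graph.levels(g, root=root);    ## list of nodes grouped by max distance from the root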
Usage
tpr.dag(
S,
g,
root = "00",
positive = "children",
bottomup = "threshold.free",
topdown = "gpav",
t = 0,
w = 0,
W = NULL,
parallel = FALSE,
ncores = 1
)
Arguments
S
a named flat scores matrix with examples on rows and classes on columns.

g
a graph of class graphNEL. It represents the hierarchy of the classes.

root
name of the class that is on the top-level of the hierarchy (def. root="00").

positive
choice of the positive nodes to be considered in the bottom-up strategy. It can be one of the following values:
children (def.): for each node its positive children are considered;
descendants: for each node its positive descendants are considered.

bottomup
strategy to enhance the flat predictions by propagating the positive predictions from leaves to root. It can be one of the following values:
threshold.free (def.): positive nodes are selected on the basis of the threshold-free strategy;
threshold: positive nodes are selected on the basis of the threshold strategy;
weighted.threshold.free: positive nodes are selected on the basis of the weighted threshold-free strategy;
weighted.threshold: positive nodes are selected on the basis of the weighted threshold strategy;
tau: positive nodes are selected on the basis of the DESCENS-\tau strategy (specific to DESCENS; use it with positive="descendants").

topdown
strategy to make scores “hierarchy-aware”. It can be one of the following values:
htd: the HTD-DAG top-down strategy is applied;
gpav (def.): the GPAV top-down strategy is applied.

t
threshold for the choice of positive nodes (def. t=0). Set t only for the variants requiring a threshold, otherwise leave t=0.

w
weight to balance between the contribution of the node i and that of its positive nodes (def. w=0). Set w only for the weighted variants, otherwise leave w=0.

W
vector of weights relative to a single example. If W=NULL (def.), W is assumed to be a unitary vector of the same length as the number of columns of S. W is used only by the GPAV top-down strategy.

parallel
a boolean value:
TRUE: run the parallel version of GPAV;
FALSE (def.): run the sequential version of GPAV.
Use parallel=TRUE only if topdown="gpav", otherwise set parallel=FALSE.

ncores
number of cores to use for parallel execution (def. ncores=1). Set ncores=1 if parallel=FALSE, otherwise set ncores to the desired number of cores.
Details
The vanilla TPR-DAG adopts a per-level bottom-up traversal of the DAG to correct the flat predictions \hat{y}_i according to the following formula:

\bar{y}_i := \frac{1}{1 + |\phi_i|} (\hat{y}_i + \sum_{j \in \phi_i} \bar{y}_j)

where \phi_i is the set of positive children of i.
Different strategies to select the positive children \phi_i can be applied:

- threshold-free strategy: the positive nodes are those children that can increment the score of the node i, that is, those children achieving a score higher than that of their parent:

\phi_i := \{ j \in child(i) | \bar{y}_j > \hat{y}_i \}

- threshold strategy: the positive children are selected on the basis of a threshold, which can be chosen in two different ways:

- for each node a constant threshold \bar{t} is selected a priori:

\phi_i := \{ j \in child(i) | \bar{y}_j > \bar{t} \}

For instance, if the predictions represent probabilities it could be meaningful to a priori select \bar{t} = 0.5.

- the threshold is selected to maximize some performance metric \mathcal{M} estimated on the training data, such as the Fmax or the AUPRC. In other words, the threshold is selected to maximize some measure of accuracy of the predictions \mathcal{M}(j, t) on the training data for the class j with respect to the threshold t. The corresponding set of positives, \forall i \in V, is:

\phi_i := \{ j \in child(i) | \bar{y}_j > t_j^*, \; t_j^* = \arg\max_{t} \mathcal{M}(j, t) \}

For instance, t_j^* can be selected from a set of t \in (0, 1) through internal cross-validation techniques.
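To make the bottom-up step concrete, the following is a minimal sketch of the threshold-free per-level visit on a toy DAG; y.hat, children and node.levels are illustrative stand-ins, not part of the HEMDAG API:

y.hat <- c(root=0.9, A=0.4, B=0.3, C=0.7, D=0.8);     ## flat scores
children <- list(root=c("A","B"), A=c("C","D"), B="D",
                 C=character(0), D=character(0));     ## DAG: D has two parents
node.levels <- list(c("C","D"), c("A","B"), "root");  ## visited from leaves to root

y.bar <- y.hat;
for (lev in node.levels) {
    for (i in lev) {
        kids <- children[[i]];
        phi <- kids[y.bar[kids] > y.hat[i]];          ## positive children of i
        y.bar[i] <- (y.hat[i] + sum(y.bar[phi])) / (1 + length(phi));
    }
}
y.bar;

With these toy scores the leaves stay unchanged, A and B are pulled up by their positive children, and the root keeps its flat score because no child exceeds it.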
The weighted TPR-DAG version is designed by adding a weight w \in [0,1] to balance between the contribution of the node i and that of its positive children \phi_i, through their convex combination:

\bar{y}_i := w \hat{y}_i + \frac{1 - w}{|\phi_i|} \sum_{j \in \phi_i} \bar{y}_j

If w=1 no weight is attributed to the children and the TPR-DAG reduces to the HTD-DAG algorithm, since in this way only the prediction for node i is used in the bottom-up step of the algorithm. If w=0 only the predictors associated with the children nodes vote to predict node i. In the intermediate cases more importance is attributed either to the predictor for the node i or to its children, depending on the value of w.
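A sketch of this convex combination for a single node, where y.bar.phi stands for the already-corrected scores of the positive children (illustrative names, not HEMDAG API):

weighted.update <- function(y.hat.i, y.bar.phi, w) {
    if (length(y.bar.phi) == 0)
        return(y.hat.i);                          ## no positive children: keep the flat score
    w * y.hat.i + (1 - w) * mean(y.bar.phi);      ## (1-w)/|phi| * sum == (1-w) * mean
}
weighted.update(0.4, c(0.7, 0.8), w=1);  ## 0.40: reduces to the HTD-DAG behavior
weighted.update(0.4, c(0.7, 0.8), w=0);  ## 0.75: only the children vote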
By combining the weighted and the threshold variant, we design the weighted-threshold variant.
Since the contribution of the descendants of a given node decays exponentially with their distance from the node itself, to enhance the contribution of the most specific nodes to the overall decision of the ensemble we designed the ensemble variant DESCENS. The novelty of DESCENS consists in strongly considering the contribution of all the descendants of each node instead of only that of its children. Therefore, DESCENS predictions are more influenced by the information embedded in the leaf nodes, that is, the classes that are the most informative and meaningful from a biological and medical standpoint. For the choice of the “positive” descendants we use the same strategies adopted for the selection of the “positive” children shown above. Furthermore, we designed a variant specific to DESCENS, named DESCENS-\tau.
The DESCENS-\tau variant balances the contribution between the “positive” children of a node i and that of its “positive” descendants excluding its children, by adding a weight \tau \in [0,1]:

\bar{y}_i := \frac{\tau}{1+|\phi_i|} (\hat{y}_i + \sum_{j \in \phi_i} \bar{y}_j) + \frac{1-\tau}{1+|\delta_i|} (\hat{y}_i + \sum_{j \in \delta_i} \bar{y}_j)

where \phi_i are the “positive” children of i and \delta_i = \Delta_i \setminus \phi_i the “positive” descendants of i without its children. If \tau=1 we consider only the contribution of the “positive” children of i; if \tau=0 only the descendants that are not children contribute to the score, while for intermediate values of \tau we can balance the contribution of the \phi_i and \delta_i positive nodes.
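A sketch of the \tau-combination for a single node, where phi and delta are illustrative vectors holding the corrected scores of the positive children and of the positive non-child descendants:

descens.tau <- function(y.hat.i, phi, delta, tau) {
    tau * (y.hat.i + sum(phi)) / (1 + length(phi)) +
        (1 - tau) * (y.hat.i + sum(delta)) / (1 + length(delta));
}
descens.tau(0.4, phi=c(0.7, 0.8), delta=0.9, tau=0.5);  ## equal balance of the two terms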
Simply by replacing the HTD-DAG top-down step (htd) with the GPAV approach (gpav) we design the ISO-TPR variant. The most important feature of ISO-TPR is that it maintains the hierarchical constraints by construction and selects the solution closest (in the least-squares sense) to the bottom-up predictions that obeys the True Path Rule.
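In terms of the tpr.dag call, only the topdown argument changes between the two variants; with the objects loaded in the Examples below:

S.iso <- tpr.dag(S, g, root, positive="children", bottomup="threshold.free",
    topdown="gpav", t=0, w=0, W=NULL, parallel=FALSE, ncores=1);  ## ISO-TPR
S.htd <- tpr.dag(S, g, root, positive="children", bottomup="threshold.free",
    topdown="htd", t=0, w=0, W=NULL, parallel=FALSE, ncores=1);   ## HTD-DAG top-down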
Value
A named matrix with the scores of the classes corrected according to the chosen TPR-DAG ensemble algorithm.
See Also
graph.levels, htd, gpav, root.node
Examples
data(graph);
data(scores);
data(labels);
root <- root.node(g);
S.tpr <- tpr.dag(S, g, root, positive="children", bottomup="threshold.free",
topdown="gpav", t=0, w=0, W=NULL, parallel=FALSE, ncores=1);