example.datasets {HEMDAG} | R Documentation |
Small real example datasets
Description
Collection of real sub-datasets used in the examples of the HEMDAG package
Usage
data(graph)
data(labels)
data(scores)
data(wadj)
data(test.index)
Details
The DAG g contained in the graph data is an object of class graphNEL. The graph g has 23 nodes and 30 edges and represents the "ancestors view" of the HPO term Camptodactyly of finger ("HP:0100490").
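As a quick check of the description above, the following sketch loads the graph and inspects it. It assumes that the HEMDAG package and the Bioconductor graph package are installed, and that the node names of g are HPO identifiers (an assumption, not stated above).

```r
# Minimal sketch: load the example DAG and inspect its basic properties.
library(HEMDAG)

data(graph)                 # loads the object g into the workspace

class(g)                    # expected to be a graphNEL object
graph::numNodes(g)          # number of nodes (23 according to this page)
graph::numEdges(g)          # number of edges (30 according to this page)
head(graph::nodes(g))       # first few node names

# Assumption: nodes are named by HPO ID, so the root term of the
# "ancestors view" should be among them.
"HP:0100490" %in% graph::nodes(g)
```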
The matrix L contained in the labels data is a 100 x 23 matrix, whose rows correspond to genes (Entrez GeneID) and whose columns correspond to HPO classes. L[i,j]=1 means that gene i belongs to class j; L[i,j]=0 means that gene i does not belong to class j. The classes of the matrix L correspond to the nodes of the graph g.
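The structure of L can be inspected as sketched below. The comparison between the column names of L and the nodes of g assumes that both are labelled with the same HPO identifiers, which this page implies but does not state explicitly.

```r
# Minimal sketch: inspect the target multilabel matrix L.
library(HEMDAG)

data(graph)     # loads g
data(labels)    # loads L

dim(L)                      # rows = genes, columns = HPO classes
table(L)                    # how many 0 and 1 entries the matrix holds
colSums(L)                  # number of genes annotated to each class

# Assumption: column names of L and node names of g use the same labels.
setequal(colnames(L), graph::nodes(g))
```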
The matrix S contained in the scores data is a named 100 x 23 flat scores matrix, representing the likelihood that a given gene belongs to a given class: the higher the value, the higher the likelihood. The classes of the matrix S correspond to the nodes of the graph g.
The matrix W contained in the wadj data is a named 100 x 100 symmetric weighted adjacency matrix, whose rows and columns correspond to genes. The gene names (Entrez GeneID) of the adjacency matrix W correspond to the gene names of the flat scores matrix S and to the gene names of the target multilabel matrix L.
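The correspondences described above can be verified with a few base R checks. This is only an illustrative sketch: it assumes that the gene names are stored as row names (and, for W, also as column names) of the matrices.

```r
# Minimal sketch: check that W, S and L refer to the same genes.
library(HEMDAG)

data(labels)    # loads L
data(scores)    # loads S
data(wadj)      # loads W

dim(W)                                  # expected to be square
isSymmetric(unname(W))                  # symmetry of the weights, ignoring names

# Assumption: gene names are the row names of each matrix.
setequal(rownames(W), colnames(W))      # same genes on rows and columns of W
setequal(rownames(W), rownames(S))      # W and S cover the same genes
setequal(rownames(S), rownames(L))      # S and L cover the same genes
```

Using setequal rather than identical keeps the check independent of the order in which the genes are stored.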
The vector of integers test.index contained in the test.index data holds the indices of the examples of the scores matrix S to be used in the test set. It is useful only in holdout experiments.
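A holdout split based on test.index could look like the sketch below. It assumes that the indices refer to rows of S (and of L, since both matrices describe the same genes); the names S.test, S.train, L.test and L.train are hypothetical and used only for illustration.

```r
# Minimal sketch: split scores and labels into training and test sets.
library(HEMDAG)

data(labels)        # loads L
data(scores)        # loads S
data(test.index)    # loads the integer vector test.index

# Assumption: test.index holds row positions of the examples (genes).
S.test  <- S[test.index, , drop = FALSE]
S.train <- S[-test.index, , drop = FALSE]
L.test  <- L[test.index, , drop = FALSE]
L.train <- L[-test.index, , drop = FALSE]

dim(S.test)
dim(S.train)
```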
Note
Some examples of full data sets for the prediction of HPO terms are available at the following link.
Note that the processing of the full datasets should be done similarly to the processing of the small data examples provided directly in this package.
Please read the README at the link above for more details about the available full datasets.