plot1GS {TcGSA} | R Documentation |
Plotting a Specific Gene Set
Description
This function can plot different representations of the gene expression in a specific gene set.
Usage
plot1GS(
expr,
gmt,
Subject_ID,
TimePoint,
geneset.name,
baseline = NULL,
group.var = NULL,
Group_ID_paired = NULL,
ref = NULL,
group_of_interest = NULL,
FUNcluster = NULL,
clustering_metric = "euclidian",
clustering_method = "ward",
B = 500,
max_trends = 4,
aggreg.fun = "median",
na.rm.aggreg = TRUE,
trend.fun = "median",
methodOptiClust = "firstSEmax",
indiv = "genes",
verbose = TRUE,
clustering = TRUE,
showTrend = TRUE,
smooth = TRUE,
precluster = NULL,
time_unit = "",
title = NULL,
y.lab = NULL,
desc = TRUE,
lab.cex = 1,
axis.cex = 1,
main.cex = 1,
y.lab.angle = 90,
x.axis.angle = 45,
margins = 1,
line.size = 1,
y.lim = NULL,
x.lim = NULL,
gg.add = list(theme()),
plot = TRUE
)
Arguments
expr |
either a matrix or data frame of gene expression upon which
dynamics are to be calculated, or a list of estimated gene expression
for each gene set. In the case of a matrix or data frame, its dimensions are |
gmt |
a gmt object containing the gene sets definition. See
|
Subject_ID |
a factor of length |
TimePoint |
a numeric vector or a factor of length |
geneset.name |
a character string containing the name of the gene set to
be plotted, that must appear in the |
baseline |
a character string which is the value of |
group.var |
in the case of several treatment groups, this is a factor of
length |
Group_ID_paired |
a character vector of length |
ref |
the group which is used as reference in the case of several
treatment groups. Default is NULL. |
group_of_interest |
the group of interest, for which dynamics are to be
computed in the case of several treatment groups. Default is NULL. |
FUNcluster |
a function which accepts as first argument a matrix
|
clustering_metric |
character string specifying the metric to be used
for calculating dissimilarities between observations in the hierarchical
clustering when |
clustering_method |
character string defining the agglomerative method
to be used in the hierarchical clustering when |
B |
integer specifying the number of Monte Carlo ("bootstrap") samples
used to compute the gap statistic. Default is 500. |
max_trends |
integer specifying the maximum number of different clusters
to be tested. Default is 4. |
aggreg.fun |
a character string such as |
na.rm.aggreg |
a logical flag indicating whether |
trend.fun |
a character string such as |
methodOptiClust |
character string indicating how the "optimal" number
of clusters is computed from the gap statistics and their standard
deviations. Possible values are |
indiv |
a character string indicating by which unit observations are
aggregated (through |
verbose |
logical flag enabling verbose messages to track the computing
status of the function. Default is TRUE. |
clustering |
logical flag. If |
showTrend |
logical flag. If |
smooth |
logical flag. If |
precluster |
a vector of length |
time_unit |
the time unit to be displayed (such as |
title |
character specifying the title of the plot. If |
y.lab |
character specifying the annotation of the y axis. If |
desc |
a logical flag. If |
lab.cex |
a numerical value giving the amount by which axis label text
should be magnified relative to the default. Default is 1. |
axis.cex |
a numerical value giving the amount by which axis annotation
text should be magnified relative to the default. Default is 1. |
main.cex |
a numerical value giving the amount by which title text
should be magnified relative to the default. Default is 1. |
y.lab.angle |
a numerical value (in [0, 360]) giving the angle by
which y-label text should be turned (anti-clockwise). Default is 90. |
x.axis.angle |
a numerical value (in [0, 360]) giving the angle by
which x-axis annotation text should be turned (anti-clockwise). Default is 45. |
margins |
a numerical value giving the amount by which the margins
should be reduced or increased relative to the default. Default is 1. |
line.size |
a numerical value giving the amount by which the line sizes
should be reduced or increased relative to the default. Default is 1. |
y.lim |
a numeric vector of length 2 giving the range of the y-axis.
See |
x.lim |
if numeric, will create a continuous scale, if factor or
character, will create a discrete scale. Observations not in this range will
be dropped. See |
gg.add |
A list of instructions to add to the |
plot |
logical flag. If |
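The Examples below pass ggplot2 components to gg.add as a list. As a rough illustration of why a list is a convenient container, the sketch below shows that ggplot2 lets a whole list of components be added to a plot with a single +. The toy data frame and the names df, p, extras and p2 are hypothetical, and this is not necessarily how plot1GS applies gg.add internally.

```r
library(ggplot2)

# Hypothetical toy data: two genes measured at four time points
df <- data.frame(
  time = rep(0:3, times = 2),
  expr = c(0, 1, 2, 1, 1, 1, 0, 0),
  gene = rep(c("g1", "g2"), each = 4)
)

p <- ggplot(df, aes(x = time, y = expr, colour = gene)) +
  geom_line()

# A list of ggplot2 components, in the same form as the gg.add lists
# used in the Examples section
extras <- list(
  scale_color_manual(values = c(g1 = "steelblue", g2 = "firebrick")),
  guides(colour = guide_legend(reverse = TRUE))
)

# Adding the list applies each component in turn
p2 <- p + extras
```

Printing p2 draws the two lines with the manual colours and a reversed legend.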
Details
If expr is a matrix or a data frame, then the "original" data are
plotted. On the other hand, if expr is a list returned in the
'Estimations' element of TcGSA.LR, then it is those "estimations"
made by the TcGSA.LR function that are plotted.
If indiv is 'genes', then each line of the plot is the median of a
gene's expression over the patients. On the other hand, if indiv is
'patients', then each line of the plot is the median of a patient's
gene expression in this gene set.
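The two aggregation modes can be sketched in base R on simulated data. This is only an illustration of the idea under stated assumptions: the objects design_toy, expr_toy, by_gene and by_patient are hypothetical, and the code is not taken from plot1GS itself.

```r
set.seed(1)

# Hypothetical design: 2 patients, each measured at 3 time points
design_toy <- expand.grid(TimePoint = c(0, 1, 2),
                          Patient_ID = c("P1", "P2"))

# Hypothetical expression matrix: 3 genes in rows, 6 samples in columns
expr_toy <- matrix(rnorm(3 * nrow(design_toy)), nrow = 3,
                   dimnames = list(paste0("gene", 1:3), NULL))

# 'genes' mode: one trajectory per gene, taking the median over
# patients at each time point (result: genes x time points)
by_gene <- t(apply(expr_toy, 1, function(g) {
  tapply(g, design_toy$TimePoint, median)
}))

# 'patients' mode: one trajectory per patient, taking the median over
# the genes of the set for each sample (result: patients x time points)
sample_median <- apply(expr_toy, 2, median)
by_patient <- tapply(sample_median,
                     list(design_toy$Patient_ID, design_toy$TimePoint),
                     identity)
```

Each row of by_gene or by_patient corresponds to one line in the respective plot.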
This function uses the gap statistic to determine the optimal number of
clusters in the plotted gene set. See clusGap.
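As an illustration of gap-statistic selection, the sketch below calls clusGap and maxSE from the cluster package directly on simulated trajectories. The toy matrix traj and the helper hclust_k are hypothetical, and the choice of hclust with Ward linkage is an assumption made for the example; it is not necessarily the clustering that plot1GS runs internally.

```r
library(cluster)  # provides clusGap() and maxSE()

set.seed(42)

# Hypothetical toy trajectories: 20 genes x 5 time points, two trends
traj <- rbind(
  matrix(rnorm(50, mean = 0, sd = 0.3), nrow = 10),
  matrix(rnorm(50, mean = 3, sd = 0.3), nrow = 10)
)

# clusGap() expects a function of the data and k that returns a list
# with a `cluster` component (hypothetical helper for this sketch)
hclust_k <- function(x, k) {
  tree <- hclust(dist(x, method = "euclidean"), method = "ward.D2")
  list(cluster = cutree(tree, k = k))
}

# Gap statistic for k = 1..4, with 50 Monte Carlo reference samples
gap <- clusGap(traj, FUNcluster = hclust_k, K.max = 4, B = 50,
               verbose = FALSE)

# Choose the number of clusters with the "firstSEmax" rule
k_opt <- maxSE(gap$Tab[, "gap"], gap$Tab[, "SE.sim"],
               method = "firstSEmax")

# Cluster membership of each gene for the selected k
clusters <- hclust_k(traj, k_opt)$cluster
```

Here K.max and B play the roles that max_trends and B play in plot1GS, and the method argument of maxSE corresponds to methodOptiClust.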
Value
A list with 2 elements:
- classif: a data.frame with the 2 following variables: ProbeID, which
  contains the IDs of the probes of the plotted gene set, and Cluster,
  which indicates the cluster each probe belongs to. If clustering is
  FALSE, then Cluster is NA for all the probes.
- p: a ggplot object containing the plot
Author(s)
Boris P. Hejblum
References
Tibshirani, R., Walther, G. and Hastie, T., 2001, Estimating the number of data clusters via the Gap statistic, Journal of the Royal Statistical Society, Series B (Statistical Methodology), 63, 2: 411–423.
See Also
Examples
if(interactive()){
  data(data_simu_TcGSA)

  # Fit the model once; its estimations are reused in the next example
  tcgsa_sim_1grp <- TcGSA.LR(expr=expr_1grp, gmt=gmt_sim, design=design,
                             subject_name="Patient_ID", time_name="TimePoint",
                             time_func="linear", crossedRandom=FALSE)

  # Original data, one line per gene, without clustering
  plot1GS(expr=expr_1grp, TimePoint=design$TimePoint,
          Subject_ID=design$Patient_ID, gmt=gmt_sim,
          geneset.name="Gene set 4",
          indiv="genes", clustering=FALSE,
          time_unit="H",
          lab.cex=0.7)

  # Original data, one line per patient, without clustering
  plot1GS(expr=expr_1grp, TimePoint=design$TimePoint,
          Subject_ID=design$Patient_ID, gmt=gmt_sim,
          geneset.name="Gene set 5",
          indiv="patients", clustering=FALSE, baseline=1,
          time_unit="H",
          lab.cex=0.7)
}
if(interactive()){
  # Plot the TcGSA.LR estimations (fitted in the previous example)
  # and keep the returned list
  geneclusters <- plot1GS(expr=tcgsa_sim_1grp$Estimations,
                          TimePoint=design$TimePoint,
                          Subject_ID=design$Patient_ID, gmt=gmt_sim,
                          geneset.name="Gene set 5",
                          indiv="genes",
                          time_unit="H",
                          lab.cex=0.7)
  geneclusters
}
if(interactive()){
  library(grDevices)
  library(graphics)
  library(ggplot2)  # for scale_color_manual(), guides() and guide_legend()

  # Custom colour palette
  colval <- c(hsv(0.56, 0.9, 1),
              hsv(0, 0.27, 1),
              hsv(0.52, 1, 0.5),
              hsv(0, 0.55, 0.97),
              hsv(0.66, 0.15, 1),
              hsv(0, 0.81, 0.55),
              hsv(0.7, 1, 0.7),
              hsv(0.42, 0.33, 1))

  # Preview the palette
  n <- length(colval)
  y <- 1:n
  op <- par(mar=rep(1.5, 4))
  plot(y, axes=FALSE, frame.plot=TRUE,
       xlab="", ylab="", pch=21, cex=8,
       bg=colval, ylim=c(-1, n+1), xlim=c(-1, n+1),
       main="Color scale")
  par(op)

  # Pass the palette and a reversed legend through gg.add
  plot1GS(expr=expr_1grp, TimePoint=design$TimePoint,
          Subject_ID=design$Patient_ID, gmt=gmt_sim,
          geneset.name="Gene set 5",
          indiv="genes",
          time_unit="H",
          title="",
          gg.add=list(scale_color_manual(values=colval),
                      guides(colour=guide_legend(reverse=TRUE))),
          lab.cex=0.7)

  # Same customisation with several treatment groups
  plot1GS(expr=expr_2grp, TimePoint=design$TimePoint,
          Subject_ID=design$Patient_ID, gmt=gmt_sim,
          geneset.name="Gene set 3",
          indiv="genes",
          group.var=design$group.var,
          time_unit="H",
          gg.add=list(scale_color_manual(values=colval),
                      guides(colour=guide_legend(reverse=TRUE))),
          lab.cex=0.7)
}