DensityScatter.DDCAL {ScatterDensity} | R Documentation |
Scatter density plot [Brinkmann et al., 2023]
Description
Draws a scatter density plot in which the density is estimated with either Pareto density estimation (PDE) [Ultsch, 2005] or "SDH" [Eilers/Goeman, 2004]. The densities are clustered with DDCAL [Lux/Rinderle-Ma, 2023] to determine the colors, as proposed by [Brinkmann et al., 2023].
Usage
DensityScatter.DDCAL(X, Y, nClusters = 12, Plotter = "native",
SDHorPDE = TRUE, PDEsample = 5000,
Marginals = FALSE, na.rm=TRUE,
pch = 10, Size = 1,
xlab="x", ylab="y", main = "",lwd = 2,
xlim=NULL,ylim=NULL,Polygon,BW = TRUE,Silent = FALSE, ...)
Arguments
X |
Numeric vector [1:n], first feature (for x axis values) |
Y |
Numeric vector [1:n], second feature (for y axis values) |
nClusters |
Integer defining the number of clusters (colors) used for finding a hard color transition. |
Plotter |
(Optional) String, name of the plotting backend to use. Possible values are: " |
SDHorPDE |
(Optional) Boolean, if TRUE SDH is used to calculate density, if FALSE PDE is used |
PDEsample |
(Optional) Scalar, Sample size for PDE and/or for ggplot2 plotting. Default is 5000 |
Marginals |
(Optional) Boolean, if TRUE the marginal distributions of X and Y will be plotted together with the 2D density of X and Y. Default is FALSE |
na.rm |
(Optional) Boolean, if TRUE non-finite values will be removed |
pch |
(Optional) Scalar or character. Indicates the shape of data points, see |
Size |
(Optional) Scalar, size of data points in plot, default is 1 |
xlab |
String, title of the x axis. Default: "x", see |
ylab |
String, title of the y axis. Default: "y", see |
main |
(Optional) Character, title of the plot. [1:2] |
lwd |
(Optional) Scalar, thickness of the lines used for the marginal distributions (only needed if |
xlim |
(Optional) Numeric vector, min and max of the x values to be plotted |
ylim |
(Optional) Numeric vector, min and max of the y values to be plotted |
Polygon |
(Optional) [1:p,1:2] numeric matrix of x and y coordinates that defines a polygon, which is drawn in magenta |
BW |
(Optional) Boolean, if TRUE ggplot2 will use a white background, if FALSE the typical ggplot2 background is used. Not needed if " |
Silent |
(Optional) Boolean, if TRUE no messages will be printed, default is FALSE |
... |
Further plot arguments |
Details
The DensityScatter.DDCAL function estimates the density of the xy data and uses it as the z coordinate. The xyz data is then plotted as a contour plot. The function assumes that the cases of x and y are mapped to each other, meaning that a cbind(x,y) operation is allowed.
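The pairing assumption can be illustrated in base R. The following is a minimal sketch, not the package's internal code: the vector names x_raw and y_raw are hypothetical, and the row-wise filtering is only an assumption about how non-finite values are handled when na.rm = TRUE.

```r
# Hypothetical input vectors; the names x_raw and y_raw are illustrative only.
x_raw <- c(0.1, 1.5, NA, 2.3, Inf, 0.7)
y_raw <- c(1.0, NaN, 0.4, 2.1, 1.2, 0.9)

# Both vectors need the same length so that case i of x pairs with case i of y.
stopifnot(length(x_raw) == length(y_raw))

# cbind() makes the pairing explicit: one row per case.
xy <- cbind(x = x_raw, y = y_raw)

# Assumption: a case is dropped as soon as one of its coordinates is
# non-finite (NA, NaN, Inf or -Inf), so the pairing of the rest stays intact.
keep <- is.finite(xy[, "x"]) & is.finite(xy[, "y"])
xy_clean <- xy[keep, , drop = FALSE]

print(xy_clean)
```

Filtering rows of the bound matrix, rather than each vector separately, keeps the remaining cases aligned.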
The colors for the densities in the contour plot are calculated with DDCAL, which distributes the densities evenly into low-variance clusters.
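The idea of mapping densities to a fixed number of color classes can be sketched in base R. Note that this sketch does not implement DDCAL: it swaps in plain equal-frequency (quantile) binning as a stand-in, and the density values are simulated. All variable names are hypothetical.

```r
set.seed(1)

# Simulated density values, one per data point (stand-in for SDH or PDE output).
dens <- rexp(1000)
n_clusters <- 12

# Stand-in for DDCAL (NOT the DDCAL algorithm): equal-frequency binning,
# so that every class holds roughly the same number of points.
breaks <- quantile(dens, probs = seq(0, 1, length.out = n_clusters + 1))
cls <- cut(dens, breaks = breaks, include.lowest = TRUE, labels = FALSE)

# One color per class; all points of a class share the same color, which
# gives hard color transitions at the class boundaries.
palette_cols <- heat.colors(n_clusters)
point_cols <- palette_cols[cls]

# Number of points per class
print(table(cls))
```

DDCAL additionally aims for low variance within each cluster, which quantile binning does not take into account.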
If "native" is used as Plotter, the function returns NULL because the base R function plot() is used.
Value
If "ggplot2" is used as Plotter, the ggplot2 object (ggobj) is returned.
Note
Support for plotly will be implemented later
Author(s)
Luca Brinkmann, Michael Thrun
References
[Ultsch, 2005] Ultsch, A.: Pareto density estimation: A density estimation for knowledge discovery, In Baier, D. & Wernecke, K. D. (Eds.), Innovations in classification, data science, and information systems, (Vol. 27, pp. 91-100), Berlin, Germany, Springer, 2005.
[Eilers/Goeman, 2004] Eilers, P. H., & Goeman, J. J.: Enhancing scatterplots with smoothed densities, Bioinformatics, Vol. 20(5), pp. 623-628. 2004.
[Lux/Rinderle-Ma, 2023] Lux, M. & Rinderle-Ma, S.: DDCAL: Evenly Distributing Data into Low Variance Clusters Based on Iterative Feature Scaling, Journal of Classification vol. 40, pp. 106-144, 2023.
[Brinkmann et al., 2023] Brinkmann, L., Stier, Q., & Thrun, M. C.: Computing Sensitive Color Transitions for the Identification of Two-Dimensional Structures, Proc. Data Science, Statistics & Visualisation (DSSV) and the European Conference on Data Analysis (ECDA), p.109, Antwerp, Belgium, July 5-7, 2023.
Examples
# Create two bimodal distributions (each a mixture of two Gaussians)
x1=rnorm(n = 7500,mean = 0,sd = 1)
y1=rnorm(n = 7500,mean = 0,sd = 1)
x2=rnorm(n = 7500,mean = 2.5,sd = 1)
y2=rnorm(n = 7500,mean = 2.5,sd = 1)
x=c(x1,x2)
y=c(y1,y2)
DensityScatter.DDCAL(x, y, Marginals = TRUE)
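A further sketch shows how the returned handle might be used with the "ggplot2" backend. The calls are guarded so that the snippet also runs where the ScatterDensity or ggplot2 packages are not installed. Treating the return value as a regular ggplot object that accepts additional layers is an assumption, so it is checked before use. The plot title is illustrative.

```r
# Guard: only call the package functions if both packages are available.
have_pkgs <- requireNamespace("ScatterDensity", quietly = TRUE) &&
  requireNamespace("ggplot2", quietly = TRUE)

set.seed(42)
# Smaller sample than above to keep the sketch fast
x <- c(rnorm(2000, mean = 0, sd = 1), rnorm(2000, mean = 2.5, sd = 1))
y <- c(rnorm(2000, mean = 0, sd = 1), rnorm(2000, mean = 2.5, sd = 1))

ggobj <- NULL
if (have_pkgs) {
  # With Plotter = "ggplot2" the plot object is returned instead of NULL.
  ggobj <- ScatterDensity::DensityScatter.DDCAL(x, y, Plotter = "ggplot2",
                                                Silent = TRUE)
  # Assumption: the handle is a ggplot object, so layers can be added with `+`.
  if (inherits(ggobj, "ggplot")) {
    ggobj <- ggobj + ggplot2::ggtitle("Two Gaussian modes")
    print(ggobj)
  }
}
```

With Plotter = "native" the same call would draw directly and return NULL, so there would be no handle to modify afterwards.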