DensityScatter.DDCAL {ScatterDensity}    R Documentation

Scatter density plot [Brinkmann et al., 2023]

Description

Draws a scatter density plot in which the density is estimated with Pareto density estimation (PDE) [Ultsch, 2005] or with "SDH" [Eilers/Goeman, 2004], and the density values are clustered with DDCAL [Lux/Rinderle-Ma, 2023], as proposed by [Brinkmann et al., 2023].

Usage

DensityScatter.DDCAL(X, Y, nClusters = 12, Plotter = "native",
  SDHorPDE = TRUE, PDEsample = 5000,
  Marginals = FALSE, na.rm = TRUE,
  pch = 10, Size = 1,
  xlab = "x", ylab = "y", main = "", lwd = 2,
  xlim = NULL, ylim = NULL, Polygon, BW = TRUE, Silent = FALSE, ...)

Arguments

X

Numeric vector [1:n], first feature (for x axis values)

Y

Numeric vector [1:n], second feature (for y axis values)

nClusters

Integer, number of clusters (colors) that DDCAL uses to find hard color transitions. Default is 12

Plotter

(Optional) String, name of the plotting backend to use. Possible values are "native" and "ggplot2". Default is "native"

SDHorPDE

(Optional) Boolean, if TRUE SDH is used to calculate density, if FALSE PDE is used

PDEsample

(Optional) Scalar, sample size for PDE and/or for ggplot2 plotting. Default is 5000

Marginals

(Optional) Boolean, if TRUE the marginal distributions of X and Y will be plotted together with the 2D density of X and Y. Default is FALSE

na.rm

(Optional) Boolean, if TRUE non-finite values will be removed. Default is TRUE

pch

(Optional) Scalar or character. Indicates the shape of data points, see plot() function or the shape argument in ggplot2. Default is 10

Size

(Optional) Scalar, size of data points in plot, default is 1

xlab

String, title of the x axis. Default is "x", see plot() function

ylab

String, title of the y axis. Default is "y", see plot() function

main

(Optional) Character, title of the plot. [1:2]

lwd

(Optional) Scalar, thickness of the lines used for the marginal distributions (only needed if Marginals=TRUE), see plot(). Default = 2

xlim

(Optional) Numeric vector, min and max of the x values to be plotted. Default is NULL

ylim

(Optional) Numeric vector, min and max of the y values to be plotted. Default is NULL

Polygon

(Optional) Numeric matrix [1:p,1:2] holding the x and y coordinates of a polygon, which is drawn in magenta

BW

(Optional) Boolean, if TRUE ggplot2 will use a white background, if FALSE the typical ggplot2 background is used. Not needed if Plotter is "native". Default is TRUE

Silent

(Optional) Boolean, if TRUE no messages will be printed, default is FALSE

...

Further plot arguments
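
The argument descriptions imply that X and Y are paired by index, that non-finite values are dropped when na.rm = TRUE, and that Polygon holds one vertex per row. The following base R sketch illustrates those input shapes. The helper keep_finite_pairs is hypothetical and not part of ScatterDensity; how the package filters values internally is an assumption here.

```r
# Hypothetical helper (not part of ScatterDensity): keep only the (x, y)
# pairs in which both values are finite, so the vectors stay aligned.
keep_finite_pairs <- function(x, y) {
  stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))
  ok <- is.finite(x) & is.finite(y)   # FALSE for NA, NaN, Inf and -Inf
  list(x = x[ok], y = y[ok])
}

x <- c(0.1, NA, 2.3, Inf, 1.7)
y <- c(1.0, 0.5, NaN, 2.0, 3.2)
clean <- keep_finite_pairs(x, y)      # only the first and last pair remain

# A [1:p, 1:2] polygon matrix: one row per vertex, columns are x and y.
poly <- cbind(x = c(0, 2, 2, 0), y = c(0, 0, 3, 3))
```

Filtering pairs rather than each vector separately keeps the cases of X and Y mapped to each other, which the function requires.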

Details

The DensityScatter.DDCAL function estimates the density of the xy data and treats it as a z coordinate. Afterwards, xyz is plotted as a contour plot. The function assumes that the cases of x and y are mapped to each other, meaning that a cbind(x,y) operation is allowed. The colors for the densities in the contour plot are calculated with DDCAL, which distributes the density values evenly into low-variance clusters.

If Plotter is "native", the function returns NULL because the base R function plot() is used.
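
The pipeline described above (density per point, then discrete classes, then one color per class) can be sketched in base R. This is a rough illustration only, not the package's implementation: raw 2D histogram counts stand in for SDH/PDE, and quantile breaks stand in for DDCAL. The names n_bins and n_classes are made up for this sketch.

```r
set.seed(1)
# Two overlapping Gaussian clouds, paired by index
x <- c(rnorm(500, mean = 0), rnorm(500, mean = 2.5))
y <- c(rnorm(500, mean = 0), rnorm(500, mean = 2.5))

n_bins    <- 30   # hypothetical grid resolution
n_classes <- 12   # plays the role of nClusters

# Step 1 (stand-in for SDH/PDE): count of points in each point's grid cell
bx   <- cut(x, breaks = n_bins, labels = FALSE)
by   <- cut(y, breaks = n_bins, labels = FALSE)
cell <- (by - 1L) * n_bins + bx
z    <- tabulate(cell, nbins = n_bins^2)[cell]

# Step 2 (stand-in for DDCAL): quantiles of z serve as class borders
brk <- quantile(z, probs = seq(0, 1, length.out = n_classes + 1))
cls <- findInterval(z, brk, all.inside = TRUE)

# Step 3: one discrete color per class, so class borders show as hard transitions
pal <- colorRampPalette(c("navy", "yellow", "red"))(n_classes)
plot(x, y, col = pal[cls], pch = 20, xlab = "x", ylab = "y")
```

Quantile breaks only balance the class sizes. DDCAL additionally aims for low variance within each cluster, so its class borders will generally differ from the ones produced here.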

Value

If Plotter is "ggplot2", the ggplot2 object (ggobj) is returned. If Plotter is "native", NULL is returned.
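
Because the return value depends on the plotting backend, calling code may want to handle both cases. The sketch below assumes that ScatterDensity and ggplot2 are installed and that printing the returned object draws the plot, as with ordinary ggplot2 objects. Neither is confirmed by this page beyond the statement that a ggobj is returned.

```r
set.seed(42)
x <- c(rnorm(2000, mean = 0), rnorm(2000, mean = 2.5))
y <- c(rnorm(2000, mean = 0), rnorm(2000, mean = 2.5))

res <- NULL
if (requireNamespace("ScatterDensity", quietly = TRUE) &&
    requireNamespace("ggplot2", quietly = TRUE)) {
  # Plotter = "ggplot2" is documented to return the plot object
  res <- ScatterDensity::DensityScatter.DDCAL(x, y, Plotter = "ggplot2",
                                              Silent = TRUE)
}

# With Plotter = "native" (or when the packages are missing) res stays NULL
if (!is.null(res)) {
  print(res)
}
```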

Note

Support for plotly will be implemented later

Author(s)

Luca Brinkmann, Michael Thrun

References

[Ultsch, 2005] Ultsch, A.: Pareto density estimation: A density estimation for knowledge discovery, In Baier, D. & Wernecke, K. D. (Eds.), Innovations in classification, data science, and information systems, (Vol. 27, pp. 91-100), Berlin, Germany, Springer, 2005.

[Eilers/Goeman, 2004] Eilers, P. H., & Goeman, J. J.: Enhancing scatterplots with smoothed densities, Bioinformatics, Vol. 20(5), pp. 623-628, 2004.

[Lux/Rinderle-Ma, 2023] Lux, M. & Rinderle-Ma, S.: DDCAL: Evenly Distributing Data into Low Variance Clusters Based on Iterative Feature Scaling, Journal of Classification, Vol. 40, pp. 106-144, 2023.

[Brinkmann et al., 2023] Brinkmann, L., Stier, Q., & Thrun, M. C.: Computing Sensitive Color Transitions for the Identification of Two-Dimensional Structures, Proc. Data Science, Statistics & Visualisation (DSSV) and the European Conference on Data Analysis (ECDA), p.109, Antwerp, Belgium, July 5-7, 2023.

Examples




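# Create a bimodal distribution in x and in y from two Gaussian components
x1 = rnorm(n = 7500, mean = 0, sd = 1)
y1 = rnorm(n = 7500, mean = 0, sd = 1)
x2 = rnorm(n = 7500, mean = 2.5, sd = 1)
y2 = rnorm(n = 7500, mean = 2.5, sd = 1)
# Concatenate the components; x and y stay paired by index
x = c(x1, x2)
y = c(y1, y2)

# Scatter density plot with the marginal distributions of x and y
DensityScatter.DDCAL(x, y, Marginals = TRUE)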


[Package ScatterDensity version 0.0.4 Index]