DensityScatter {DataVisualizations} | R Documentation |
Scatter plot with densities
Description
Density estimation is performed by Pareto density estimation ("PDE") [Ultsch, 2005] or by smoothed densities histogram ("SDH") [Eilers/Goeman, 2004] and visualized in a density scatter plot [Brinkmann et al., 2023], in which the points are colored by their density.
Usage
DensityScatter(X, Y, DensityEstimation = "SDH",
  Type = "DDCAL", Plotter = "native", Marginals = FALSE,
  SampleSize, na.rm = FALSE, xlab, ylab,
  main = "DensityScatter", AddString2lab = "",
  xlim, ylim, NoBinsOrPareto = NULL, ...)
Arguments
X |
Numeric vector [1:n], first feature (for x axis values) |
Y |
Numeric vector [1:n], second feature (for y axis values) |
DensityEstimation |
(Optional), |
Type |
(Optional), |
Plotter |
in case of |
Marginals |
(Optional) Boolean, if TRUE the marginal distributions of X and Y will be plotted together with the 2D density of X and Y. Default is FALSE |
SampleSize |
(Optional), Numeric, positive scalar, maximum size of the sample used for calculation. High values increase runtime significantly. By default, no sample is drawn |
na.rm |
(Optional), Boolean. The function may not work with non-finite values. Set to TRUE to remove such cases automatically. Default is FALSE |
xlab |
(Optional), String, title of the x axis. Default: "X", see |
ylab |
(Optional), String, title of the y axis. Default: "Y", see |
main |
(Optional), string, the same as "main" in |
AddString2lab |
(Optional), adds the same string of information to the x and y axis labels, e.g. useful for adding SI units |
xlim |
(Optional), in case of |
ylim |
in case of |
NoBinsOrPareto |
(Optional), in case of |
... |
(Optional), further arguments either to ScatterDensity::DensityScatter.DDCAL or to plot() |
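The SampleSize argument caps the number of points used for the calculation. How the function draws its sample internally is not stated here, so the following is only a minimal base R sketch of the idea, assuming simple random sampling without replacement. The names sample_size, idx, xs and ys are illustrative and not part of the package.

```r
# Sketch only: simple random subsampling of paired data (assumed strategy).
set.seed(42)
n <- 10000
x <- rnorm(n)
y <- rnorm(n)

sample_size <- 2000  # hypothetical cap, plays the role of SampleSize

# Draw row indices once so that x and y stay paired.
idx <- if (n > sample_size) sample.int(n, sample_size) else seq_len(n)
xs <- x[idx]
ys <- y[idx]

length(xs)
```

Sampling indices rather than sampling x and y separately keeps each x value matched to its y value, which is consistent with the returned vectors having length m<=n.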
Details
The DensityScatter function computes the density of the xy data as a z coordinate. The xy points are then plotted as a scatter plot in which the z values define the coloring of the points. It assumes that the cases of x and y are mapped to each other, meaning that a cbind(x,y) operation is allowed.
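The pairing requirement can be checked before calling the function. The sketch below uses base R only and also drops cases with non-finite values, which is roughly what one would expect from na.rm=TRUE. The package's internal handling may differ, and the names xy, keep and xy_clean are illustrative.

```r
# Sketch only: verify that x and y are paired and keep finite cases.
x <- c(1.2, 3.4, NA, 5.6, Inf)
y <- c(10, 20, 30, NA, 50)

# cbind(x, y) is only meaningful if both vectors have the same length.
stopifnot(length(x) == length(y))
xy <- cbind(x, y)

# A case is kept only if both of its coordinates are finite.
keep <- is.finite(xy[, 1]) & is.finite(xy[, 2])
xy_clean <- xy[keep, , drop = FALSE]

xy_clean
```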
This function plots the density on top of a scatter plot. The variances of x and y should not differ by orders of magnitude; otherwise, transform both features to percentiles first.
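One possible way to bring two features with very different variances onto a common scale is a rank-based percentile transformation. This is a minimal sketch in base R. The helper to_percentiles is hypothetical and not part of the package, and other percentile definitions would work as well.

```r
# Hypothetical helper (not part of DataVisualizations):
# maps each value to its percentile rank in (0, 100].
to_percentiles <- function(v) {
  100 * rank(v, ties.method = "average") / length(v)
}

set.seed(1)
x <- rnorm(500, sd = 1)      # small variance
y <- rnorm(500, sd = 10000)  # very large variance

px <- to_percentiles(x)
py <- to_percentiles(y)

# Both features now share the same scale.
range(px)
range(py)
```

The transformed vectors px and py could then be passed as X and Y, with axis labels adjusted to say that percentiles are shown.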
Value
List of:
X |
Numeric vector [1:m],m<=n, first feature used in the plot or the kernels used |
Y |
Numeric vector [1:m],m<=n, second feature used in the plot or the kernels used |
Densities |
Number of points within the ParetoRadius of each point, i.e. density information |
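To illustrate what "number of points within a radius of each point" means, the following naive base R sketch counts neighbors using a full distance matrix. The radius here is an arbitrary assumption; the actual ParetoRadius is determined by the package following [Ultsch, 2005]. Whether the package counts the point itself is not stated, and the helper count_within_radius is hypothetical.

```r
# Hypothetical helper: for each point, count the points (including itself)
# whose Euclidean distance is at most `radius`.
count_within_radius <- function(x, y, radius) {
  d <- as.matrix(dist(cbind(x, y)))  # pairwise Euclidean distances
  rowSums(d <= radius)
}

x <- c(0, 0.1, 0.2, 5)
y <- c(0, 0.1, 0.0, 5)

# radius = 0.5 is an arbitrary value chosen for illustration only.
dens <- count_within_radius(x, y, radius = 0.5)
dens
```

The three clustered points each receive a higher count than the isolated point, so they would be colored as denser. The full distance matrix needs memory quadratic in the number of points, so this sketch is only suitable for small data.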
Note
MT contributed several adjustments.
Author(s)
Felix Pape
References
[Thrun, 2018] Thrun, M. C.: Projection Based Clustering through Self-Organization and Swarm Intelligence, (Ultsch, A. & Huellermeier, E. Eds., 10.1007/978-3-658-20540-9), Doctoral dissertation, Heidelberg, Springer, ISBN: 978-3658205393, 2018.
[Thrun/Ultsch, 2018] Thrun, M. C., & Ultsch, A.: Effects of the payout system of income taxes to municipalities in Germany, in Papiez, M. & Smiech, S. (eds.), Proc. 12th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, pp. 533-542, Cracow: Foundation of the Cracow University of Economics, Cracow, Poland, 2018.
[Ultsch, 2005] Ultsch, A.: Pareto density estimation: A density estimation for knowledge discovery, In Baier, D. & Wernecke, K. D. (Eds.), Innovations in classification, data science, and information systems, (Vol. 27, pp. 91-100), Berlin, Germany, Springer, 2005.
[Eilers/Goeman, 2004] Eilers, P. H., & Goeman, J. J.: Enhancing scatterplots with smoothed densities, Bioinformatics, Vol. 20(5), pp. 623-628, 2004.
[Lux/Rinderle-Ma, 2023] Lux, M. & Rinderle-Ma, S.: DDCAL: Evenly Distributing Data into Low Variance Clusters Based on Iterative Feature Scaling, Journal of Classification, Vol. 40, pp. 106-144, 2023.
[Brinkmann et al., 2023] Brinkmann, L., Stier, Q., & Thrun, M. C.: Computing Sensitive Color Transitions for the Identification of Two-Dimensional Structures, Proc. Data Science, Statistics & Visualisation (DSSV) and the European Conference on Data Analysis (ECDA), p.109, Antwerp, Belgium, July 5-7, 2023.
Examples
# taken from [Thrun/Ultsch, 2018]
data("ITS")
data("MTY")
# keep only the cases with ITS < 900 and MTY < 8000
Inds = which(ITS < 900 & MTY < 8000)
# plain scatter plot for comparison
plot(ITS[Inds], MTY[Inds], main = 'Bimodality is not visible in normal scatter plot')
# density scatter plot with SDH density estimation
DensityScatter(ITS[Inds], MTY[Inds], DensityEstimation = "SDH", xlab = 'ITS in EUR',
ylab = 'MTY in EUR', main = 'Smoothed Densities histogram indicates Bimodality')
# density scatter plot with PDE density estimation
DensityScatter(ITS[Inds], MTY[Inds], DensityEstimation = "PDE", xlab = 'ITS in EUR',
ylab = 'MTY in EUR', main = 'PDE indicates Bimodality')