| DataVisualizations-package {DataVisualizations} | R Documentation |
Visualizations of High-Dimensional Data
Description
Gives access to data visualization methods that are relevant from the data scientist's point of view. The flagship idea of 'DataVisualizations' is the mirrored density plot (MD-plot) for either classified or non-classified multivariate data, published in Thrun, M.C. et al.: "Analyzing the Fine Structure of Distributions" (2020), PLoS ONE, <DOI:10.1371/journal.pone.0238835>. According to that publication, the MD-plot outperforms the box-and-whisker diagram (box plot), the violin plot, the bean plot and the geom_violin plot of ggplot2.
Furthermore, a collection of visualization methods for univariate data is provided. For exploratory data analysis, 'DataVisualizations' makes it possible to visually inspect the distribution of each feature of a dataset through a combination of four methods, one of which is the Pareto density estimation (PDE) of the probability density function (pdf). Additionally, the package provides visualizations of the distribution of distances using PDE, the scatter-density plot using PDE for two variables, the Shepard density plot and the Bland-Altman plot.
For classified high-dimensional data, several visualizations are available, for example the heat map and the silhouette plot. A political map of the world or of Germany can be visualized with additional information defined by a classification of countries or regions. Extending the political map further, a simple function for a choropleth map is provided, which is useful for measurements across a geographic area. For categorical features, pie charts, slope charts and fan plots, improved by the ABC analysis, are available. More detailed explanations are found in the book by Thrun, M.C.: "Projection-Based Clustering through Self-Organization and Swarm Intelligence" (2018) <DOI:10.1007/978-3-658-20540-9>.
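To make the idea of a mirrored density concrete, the following is a minimal base-R sketch for a single feature. It is an illustration under stated assumptions, not the package's implementation: the Gaussian kernel estimate from stats::density() is substituted for the Pareto density estimation, and the sample x and the names half_width and outline are hypothetical.

```r
# Sketch only: mirrored outline of a density estimate for one feature.
# Assumption: stats::density() (Gaussian kernel) stands in for PDE,
# so this is not the MD-plot itself.
set.seed(1)
x <- c(rnorm(300, mean = 0), rnorm(200, mean = 4))  # hypothetical bimodal feature

d <- density(x)                    # kernel density estimate on a regular grid
half_width <- d$y / max(d$y) / 2   # scale so the widest point spans [-0.5, 0.5]

# Closed outline: right half going up the value axis, left half coming back down.
outline <- data.frame(
  width = c(half_width, -rev(half_width)),
  value = c(d$x, rev(d$x))
)

plot(outline$width, outline$value, type = "n",
     xlab = "mirrored density (scaled)", ylab = "value")
polygon(outline$width, outline$value, col = "darkblue", border = "black")
```

Because the outline is symmetric around zero, the two modes of the sample appear as two bulges along the value axis, which a box plot of the same data would not show.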
Details
For a brief introduction to DataVisualizations, please see the vignette "A Quick Tour in Data Visualizations".
Please see also https://www.deepbionics.org/. Depending on the context, please cite either [Thrun, 2018] for visualizations in the context of clustering or [Thrun/Ultsch, 2018] for other visualizations.
For the mirrored density plot (MD-plot), please cite [Thrun et al., 2020] and see the extensive vignette at https://md-plot.readthedocs.io/en/latest/index.html. The MD-plot is also available in Python: https://pypi.org/project/md-plot/
Index of help topics:
ABCbarplot Barplot with Sorted Data Colored by ABCanalysis
AccountingInformation_PrimeStandard_Q3_2019 Accounting Information in the Prime Standard in Q3 in 2019 (AI_PS_Q3_2019)
BimodalityAmplitude Bimodality Amplitude
CCDFplot Plot of the Complementary Cumulative Distribution Function (CCDF) in Log/Log, uses ecdf, CCDF(x) = 1 - cdf(x)
ChoroplethPostalCodesAndAGS_Germany Postal Codes and AGS of Germany for a Choropleth Map
Choroplethmap Plots the Choropleth Map
ClassBoxplot Creates a Boxplot for all classes
ClassErrorbar ClassErrorbar
ClassMDplot Class MDplot for Data w.r.t. all classes
ClassPDEplot PDE Plot for all classes
ClassPDEplotMaxLikeli Create PDE plot for all classes with maximum likelihood
Classplot Classplot
CombineCols Combine vectors of various lengths
Crosstable Crosstable plot
DataVisualizations-package Visualizations of High-Dimensional Data
DefaultColorSequence Default color sequence for plots
DensityContour Contour plot of densities
DensityScatter Scatter plot with densities
DualaxisClassplot Dualaxis Classplot
DualaxisLinechart DualaxisLinechart
Fanplot The fan plot
FundamentalData_Q1_2018 Fundamental Data of the 1st Quarter in 2018
GoogleMapsCoordinates Google Maps with marked coordinates
Heatmap Heatmap for Clustering
HeatmapColors Default color sequence for plots
ITS Income Tax Share
InspectBoxplots Inspect Boxplots
InspectCorrelation Inspect the Correlation
InspectDistances Inspection of Distance-Distribution
InspectScatterplots Pairwise scatterplots and optimal histograms
InspectStandardization QQplot of Data versus Normalized Data
InspectVariable Visualization of Distribution of one variable
JitterUniqueValues Jitters Unique Values
Lsun3D Lsun3D inspired by FCPS [Thrun/Ultsch, 2020] introduced in [Thrun, 2018]
MAplot Minus versus Add plot
MDplot Mirrored Density plot (MD-plot)
MDplot4multiplevectors Mirrored Density plot (MD-plot) for Multiple Vectors
MTY Municipal Income Tax Yield
Multiplot Plot multiple ggplot objects in one panel
OptimalNoBins Optimal Number Of Bins
PDEplot PDE plot
ParetoDensityEstimation Pareto Density Estimation V3
ParetoRadius ParetoRadius for distributions
Piechart The pie chart
Pixelmatrix Plot of a Pixel Matrix
Plot3D 3D plot of points
PlotGraph2D PlotGraph2D
PlotMissingvalues Plot of the Amount Of Missing Values
PlotProductratio Product-Ratio Plot
PmatrixColormap P-Matrix colors
QQplot QQplot with a Linear Fit
ROC ROC plot
RobustNorm_BackTrafo Transforms the Robust Normalization back
RobustNormalization RobustNormalization
ShepardDensityScatter Shepard PDE scatter
Sheparddiagram Draws a Shepard Diagram
SignedLog Signed Log
Silhouetteplot Silhouette plot of classified data.
Slopechart Slope Chart
StatPDEdensity Pareto Density Estimation
Worldmap plots a world map by country codes
categoricalVariable A categorical Feature.
estimateDensity2D estimateDensity2D
stat_pde_density Calculate Pareto density estimation for ggplot2 plots
world_country_polygons world_country_polygons
zplot Plotting for 3 dimensional data
Author(s)
Michael Thrun, Felix Pape, Onno Hansen-Goos, Alfred Ultsch
Maintainer: Michael Thrun <m.thrun@gmx.net>
References
[Thrun, 2018] Thrun, M. C.: Projection-Based Clustering through Self-Organization and Swarm Intelligence, doctoral dissertation 2017, Springer, Heidelberg, ISBN: 978-3-658-20539-3, doi:10.1007/978-3-658-20540-9, 2018.
[Thrun/Ultsch, 2018] Thrun, M. C., & Ultsch, A.: Effects of the payout system of income taxes to municipalities in Germany, in Papiez, M. & Smiech, S. (eds.), Proc. 12th Professor Aleksander Zelias International Conference on Modelling and Forecasting of Socio-Economic Phenomena, pp. 533-542, Cracow: Foundation of the Cracow University of Economics, Cracow, Poland, 2018.
[Thrun et al., 2020] Thrun, M. C., Gehlert, T. & Ultsch, A.: Analyzing the Fine Structure of Distributions, PLoS ONE, Vol. 15(10), pp. 1-66, doi:10.1371/journal.pone.0238835, 2020.
Examples
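library(DataVisualizations)
# Load the Lsun3D example data: feature matrix and class labels
data("Lsun3D")
Data=Lsun3D$Data
Cls=Lsun3D$Cls
# Pixel matrix of the raw data
Pixelmatrix(Data)
# Distribution of the pairwise Euclidean distances
InspectDistances(as.matrix(dist(Data)))
# Minus versus Add plot of the ITS and MTY data
MAlist=MAplot(ITS,MTY)
# The scatter plot of the first two features shows a clear cluster structure
plot(Data[,1:2],col=Cls)
# However, the silhouette plot does not indicate a very good clustering in clusters 1 and 2
Silhouetteplot(Data,Cls = Cls)
# Heat map of the distance matrix together with the classification
Heatmap(as.matrix(dist(Data)),Cls = Cls)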
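The index entry for CCDFplot states the relation CCDF(x) = 1 - cdf(x) and mentions ecdf. The following base-R sketch works through that relation on simulated data. It is not the code of CCDFplot, whose arguments and internals are not shown on this page; the sample x and the names Fn, xs, ccdf and keep are hypothetical.

```r
# Sketch only: empirical CCDF on log/log axes using base R.
set.seed(42)
x <- rexp(1000, rate = 1)   # hypothetical positive-valued sample

Fn <- ecdf(x)               # empirical cdf as a step function
xs <- sort(unique(x))       # evaluation points: the observed values
ccdf <- 1 - Fn(xs)          # CCDF(x) = 1 - cdf(x)

# The largest observation has CCDF 0, which cannot be drawn on a log axis.
keep <- ccdf > 0
plot(xs[keep], ccdf[keep], log = "xy", type = "s",
     xlab = "x", ylab = "CCDF(x) = 1 - cdf(x)")
```

Dropping the last point is a choice made for this sketch so that the logarithmic y axis is well defined.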