mvhist {mvhist}        R Documentation

Multivariate Histograms

Description

Tabulate and plot histograms for multivariate data, including directional histograms. This package used to be part of the mvmesh package, which works with multivariate meshes and grids. To simplify that package and make these functions more visible, they were extracted into a standalone package in September 2023. The functions provided can tally the number of data points in each of a list of regions in dimension two or higher. All regions/bins have flat sides.

Several different plots are available to show 2- and 3-dimensional data, and data of dimension greater than 3 can also be handled. Plots in 3d can be rotated and zoomed in/out with the mouse, as well as resized.

Usage

histRectangular( x, breaks=10, plot.type="default", freq=TRUE, report="summary", ... )
histSimplex( x, S, plot.type="default", freq=TRUE, report="summary", ... )

Arguments

x

data in an (n x d) matrix; rows are d-dimensional data vectors

freq

TRUE for a frequency histogram, FALSE for a relative frequency histogram. See note about normalize.by.area

breaks

specifies the subdivision of the region; see 'breaks' in SolidRectangle in package mvmesh.

plot.type

type of plot, see details below

S

(vps x d x nS) array of simplices in the V-representation, see V2Hrep in package mvmesh. The vector S[i,,j] gives the coordinates of the i-th vertex of the j-th simplex.
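The layout of S can be sketched in base R. This is only an illustration with made-up coordinates: it assumes, as in the examples on this page, that each slice S[,,j] is a matrix with one vertex per row.

```r
# Two triangles in the plane, one vertex per row (hypothetical coordinates)
T1 <- matrix( c(0,0,  1,0,  0,1), ncol=2, byrow=TRUE )
T2 <- matrix( c(1,0,  1,1,  0,1), ncol=2, byrow=TRUE )

# Stack them into a (vps x d x nS) = (3 x 2 x 2) array
S <- array( c(T1,T2), dim=c(3,2,2) )

dim(S)       # vps, d, nS
S[ , ,1]     # first simplex, identical to T1
S[2, ,1]     # second vertex of the first simplex
```

Because array() fills in column-major order, concatenating the vertex matrices with c() keeps each matrix intact as one slice.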

report

level of warning messages; one of "summary", "all", "none".

...

Optional arguments to plot, e.g. color="red", etc.

Details

Calculate and plot multivariate histograms. histRectangular plots a histogram based on a rectangular grid, while histSimplex plots a histogram based on the simplices specified in S.
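As a rough picture of what a rectangular tally computes, the following base R sketch bins 2d data on a 5 x 5 grid. It does not call histRectangular; the equally spaced breaks over the data range and the variable names are assumptions made for the illustration, and relative frequency is taken here simply as count divided by the total.

```r
set.seed(1)
x <- matrix( rnorm(2000), ncol=2 )   # 1000 points in the plane
nb <- 5                              # number of bins per coordinate

# equally spaced break points along each coordinate (assumed convention)
br1 <- seq( min(x[,1]), max(x[,1]), length.out=nb+1 )
br2 <- seq( min(x[,2]), max(x[,2]), length.out=nb+1 )

# assign each coordinate to an interval, then cross-tabulate
i1 <- cut( x[,1], br1, include.lowest=TRUE )
i2 <- cut( x[,2], br2, include.lowest=TRUE )
counts <- table( i1, i2 )            # frequency histogram as an nb x nb table

relfreq <- counts / sum(counts)      # relative frequency version
```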

These routines use the functions and conventions of the package mvmesh. In particular, shapes can be described in two formats: vertex representation or half-space representation, respectively called the V-representation or H-representation. In all cases, the bins are converted to the H-representation and tallied by TallyHrep.
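The idea behind tallying with a half-space representation can be sketched in base R. This is not the implementation of TallyHrep: the helper inHrep is hypothetical, and the region is written in the simplified form A x <= b, which may differ from the H-representation format used by mvmesh and rcdd.

```r
# Triangle with vertices (0,0), (1,0), (0,1) written as A %*% x <= b
A <- rbind( c(-1, 0),    # -x1      <= 0
            c( 0,-1),    #      -x2 <= 0
            c( 1, 1) )   #  x1 + x2 <= 1
b <- c(0, 0, 1)

# hypothetical helper: TRUE for each row of x that satisfies every inequality
inHrep <- function( x, A, b ) {
  slack <- A %*% t(x) - b          # (constraints x points) matrix
  apply( slack <= 0, 2, all )
}

x <- rbind( c( 0.2, 0.2),    # inside
            c( 0.9, 0.9),    # violates x1 + x2 <= 1
            c(-0.1, 0.5) )   # violates -x1 <= 0
inside <- inHrep( x, A, b )
sum(inside)                  # number of points tallied in this bin
```

A point belongs to the bin exactly when all of its slacks are non-positive, so counting points per bin reduces to a few matrix products.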

'plot.type' values depend on the type of plot being used. Possible values are:

Value

A plot is drawn (unless plot.type="none"). Note that the plots may be underneath/behind other windows; if you don't see a plot, search your desktop and/or the plot tab. A list is returned invisibly, with the following fields:

While counting data points in the different bins, two issues can arise: (a) a data point is on the boundary of a bin, or (b) a data point is not in any of the specified bins. If report="none", no report is given about these issues. If report="summary", a count is given of the number of ties and the number of rejects. If report="all", the counts of ties and rejects are given, along with the indices (rows of the data matrix) of the rejected points.
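The two issues can be illustrated with a small base R sketch. It uses two adjacent rectangular bins and hypothetical data, and it simplifies a tie to mean a point that lies in more than one closed bin; how the package itself detects and resolves ties is not shown here.

```r
# two adjacent rectangular bins given by lower and upper corners
bins <- list( list( lo=c(0,0), hi=c(1,1) ),
              list( lo=c(1,0), hi=c(2,1) ) )

x <- rbind( c(0.5, 0.5 ),    # interior of bin 1
            c(1.0, 0.5 ),    # on the boundary shared by bins 1 and 2
            c(1.5, 0.25),    # interior of bin 2
            c(3.0, 3.0 ) )   # outside every bin

# membership[i,j] is TRUE when point i lies in closed bin j
membership <- sapply( bins, function(bin)
  apply( x, 1, function(p) all( p >= bin$lo & p <= bin$hi ) ) )

nbins.hit <- rowSums( membership )
nties     <- sum( nbins.hit > 1 )      # points claimed by more than one bin
rejects   <- which( nbins.hit == 0 )   # row indices of points in no bin
nrejects  <- length( rejects )
```

In this sketch a report at the "summary" level would correspond to nties and nrejects, while the "all" level would also list rejects.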

Warning

These functions use double precision numbers by default, and most real numbers cannot be expressed exactly as doubles. So testing for being on a boundary is subject to the usual issues with floating point numbers. This is why the message "If you want correct answers, use rational arithmetic." is given when the package rcdd is loaded.
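A short base R illustration of the issue, independent of this package: a point that is mathematically on a boundary can land just off it once its coordinates are stored as doubles.

```r
# 0.1 + 0.2 is not exactly 0.3 in double precision
s <- 0.1 + 0.2
exact <- ( s == 0.3 )                    # FALSE
close <- isTRUE( all.equal( s, 0.3 ) )   # TRUE: equal up to a tolerance

# the point (0.1, 0.2) is mathematically on the boundary x1 + x2 = 0.3,
# but the computed sum is slightly larger than 0.3
p <- c(0.1, 0.2)
in.region <- ( p[1] + p[2] <= 0.3 )      # FALSE
```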

It is possible, but takes some work, to specify regions using only rational numbers as coordinates, and if the data is rational, you will be able to exactly specify regions and possibly boundaries. See the help for packages mvmesh and rcdd. Using rational coordinates has not been tested in this package.

Examples

Figure (mvhistRectangle.png): histRectangular example.

Figure (mvhistCircle.png): histSimplex example with plot.type="counts".

Figure (mvhistCircle2.png): histSimplex example with plot.type="pillars".

Examples

#  isotropic data in 2 and 3 dimensions
x2d <- matrix( rnorm(8000), ncol=2 )
x3d <- matrix( rnorm(9000), ncol=3 )

# 3d plots are in separate windows opened by the rgl package; you may have 
#    to search on the desktop to find those windows
if( interactive() ){

# save graphical parameters; restore to original value at end of examples
oldpar <- par( no.readonly=TRUE )

# simple histogram of 2d data
histRectangular( x2d, breaks=5); title3d("2d, default plot.type" )

# simple histogram of 3d data: slices of data are stacked
histRectangular( x3d, breaks=4 ); title3d("3d, default plot.type" )

histRectangular( x2d, breaks=5, col='blue', plot.type="pillars" )
histRectangular( x2d, breaks=5, plot.type="counts" )
histRectangular( x2d, breaks=5, plot.type="index" )

# count number of data points in a triangle, using mvmesh function to define the partition
S1 <- 4*SolidSimplex( n=2, k=3 )$S
histSimplex( x2d, S1, plot.type="counts" )  # note many rejects
histSimplex( x2d, S1, col="green", lwd=3 ) # default plot.type="pillars"

# partition a ball
S2 <- 4*UnitBall( n=2, k=2 )$S
histSimplex( x2d, S2, plot.type="counts", col="purple" )
histSimplex( x2d, S2, col="red" )

# Specify simplices explicitly to get specific region, e.g. restrict to x[1] >= 0
S1 <- matrix( c(0,0,  10,0,  0,10, 10,10), ncol=2, byrow=TRUE )  # first quadrant (bounded)
S2 <- matrix( c(0,0,  10,0,  0,-10,  10,-10), ncol=2, byrow=TRUE ) # fourth quadrant (bounded)
S <- array( c(S1,S2), dim=c(4,2,2) )
simp <- histSimplex( x2d, S, plot.type="counts" )
text(2,9, paste("nrejects=",simp$nrejects), col='red' )

# check behavior with rejects and ties
r <- histSimplex( x2d, S, plot.type="counts" )
str(r)  # see list of returned values
sum(c(r$counts,r$nrejects))

# restore user's graphical parameters
par(oldpar)
}

[Package mvhist version 1.1 Index]