histo {dvmisc}    R Documentation

Histogram with Added Options

Description

Similar to the base R function hist, but with two added features: (1) it can overlay one or more fitted probability density/mass functions (PDFs/PMFs) for any univariate distribution supported in R (see Distributions); and (2) it can generate a barplot-style histogram in which each possible value gets its own bin centered over that value (useful for discrete variables with few possible values).

Usage

histo(x, dis = "none", dis_shift = NULL, integer_breaks = NULL,
  colors = rep("black", length(dis)), lty = 1:length(dis),
  legend_form = ifelse(length(dis) == 1, 0, 1), aic_decimals = 1,
  points_list = NULL, axis_list = NULL, legend_list = NULL, ...)

Arguments

x

Numeric vector of values.

dis

Character vector indicating which distributions should be used to add fitted PDFs/PMFs to the histogram. If not "none", choices for each element are:

"beta"

"binom" (must specify size)

"cauchy"

"chisq"

"exp"

"f"

"gamma"

"geom"

"hyper" (must specify total number of balls in urn, N, and number of balls drawn each time, k)

"lnorm"

"nbinom" (must specify size)

"norm"

"pois"

"t"

"unif"

"weibull"

dis_shift

Numeric value for shifting the fitted PDF/PMF along the x-axis of the histogram.

integer_breaks

If TRUE, integers covering the range of x are used for breaks, so there is one bin for each integer. Useful for discrete distributions that don't take on too many unique values.

colors

Character vector of colors for each PDF/PMF.

lty

Integer vector specifying line types for each curve.

legend_form

Integer value controlling what type of legend to include. Choices are 0 for no legend, 1 for legend naming each distribution, and 2 for legend naming each distribution and the corresponding AIC.

aic_decimals

Integer value for number of decimals for AIC.

points_list

Optional list of inputs to pass to the points function, which is used to add the fitted PDF/PMF.

axis_list

Optional list of inputs to pass to axis.

legend_list

Optional list of inputs to pass to legend.

...

May include arguments to pass to hist and/or parameter values needed for certain distributions (size if dis = "binom" or dis = "nbinom", N and k if dis = "hyper").

Details

When x takes on whole numbers, you typically want to set dis_shift = -0.5 if right = TRUE (hist's default) and dis_shift = 0.5 if right = FALSE. The function will do this internally by default.

To illustrate, suppose a particular bin represents (7, 10]. Its midpoint will be at x = 8.5 on the graph. But if input values are whole numbers, this bin really only includes values of 8, 9, and 10, which have a mean of 9. So you really want f(9) to appear at x = 8.5. This requires shifting the curve to the left 0.5 units, i.e. setting dis_shift = -0.5.
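
The arithmetic in this illustration can be checked directly. The following is a minimal sketch in base R; the variable names are illustrative and are not part of histo.

```r
# Bin with endpoints 7 and 10, assuming x holds whole numbers only.
lower <- 7
upper <- 10
bin_mid <- (lower + upper) / 2            # where the bar is centered on the graph

# Whole numbers that fall in (7, 10] (right = TRUE) and in [7, 10) (right = FALSE)
ints_right_closed <- (lower + 1):upper    # 8, 9, 10
ints_left_closed  <- lower:(upper - 1)    # 7, 8, 9

# Shift needed so that f(mean of the integers) is drawn at the bin midpoint
shift_right_true  <- bin_mid - mean(ints_right_closed)
shift_right_false <- bin_mid - mean(ints_left_closed)

print(c(right_true = shift_right_true, right_false = shift_right_false))
```

The two shifts come out to -0.5 and 0.5, matching the values recommended for right = TRUE and right = FALSE.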

When x takes on whole numbers with not too many unique values, you may want the histogram to show one bin for each integer. You can do this by setting integer_breaks = TRUE. By default, the function sets integer_breaks = TRUE if x contains whole numbers with 10 or fewer unique values.
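
As a rough sketch of what one bin per integer looks like, the following base R code applies the rule described above (whole numbers, 10 or fewer unique values) and then builds breaks at half-integers so that each bar is centered on its value. The half-integer breaks, the sample data, and the variable names are assumptions made for illustration; histo's internal choice of breaks may differ.

```r
# Small whole-number sample (made up for illustration)
x <- c(0, 1, 1, 2, 2, 2, 3, 5)

# Check mirroring the documented default rule for integer_breaks
is_whole <- all(x == round(x))
few_values <- length(unique(x)) <= 10
use_integer_bins <- is_whole && few_values

# Assumed approach: breaks at half-integers, so each bin is centered on an integer
breaks <- seq(min(x) - 0.5, max(x) + 0.5, by = 1)
h <- hist(x, breaks = breaks, plot = FALSE)

print(use_integer_bins)
print(data.frame(value = h$mids, count = h$counts))
```

Each row of the printed data frame pairs an integer value with its count, including a zero count for the value 4, which does not occur in the sample.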

Value

Histogram with fitted PDFs/PMFs if requested.
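
To make the idea of overlaying fitted PDFs and comparing them by AIC more concrete, here is a minimal base R sketch. It assumes maximum likelihood fits, using the closed-form estimates for the normal and lognormal distributions. The source does not state how histo fits distributions or computes AIC, so this is not a description of its internals, and all variable names are illustrative.

```r
set.seed(123)
x <- rlnorm(n = 10000, meanlog = 0, sdlog = 0.35)

# Closed-form maximum likelihood estimates (denominator n, not n - 1)
mu <- mean(x)
sigma <- sqrt(mean((x - mu)^2))
meanlog <- mean(log(x))
sdlog <- sqrt(mean((log(x) - meanlog)^2))

# AIC = 2 * (number of parameters) - 2 * (maximized log-likelihood)
aic_norm <- 2 * 2 - 2 * sum(dnorm(x, mean = mu, sd = sigma, log = TRUE))
aic_lnorm <- 2 * 2 - 2 * sum(dlnorm(x, meanlog = meanlog, sdlog = sdlog, log = TRUE))

# Density-scale histogram with both fitted curves and an AIC legend
hist(x, freq = FALSE, breaks = 50, main = "Fitted PDFs")
grid <- seq(min(x), max(x), length.out = 200)
lines(grid, dnorm(grid, mean = mu, sd = sigma), lty = 1)
lines(grid, dlnorm(grid, meanlog = meanlog, sdlog = sdlog), lty = 2)
legend("topright", lty = 1:2,
       legend = paste0(c("norm", "lnorm"), " (AIC = ",
                       round(c(aic_norm, aic_lnorm), 1), ")"))
```

Because the data are simulated from a lognormal distribution, the lognormal fit should show the lower AIC in the legend.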

Examples

# Sample 10,000 Poisson(2) values and compare default hist vs. histo
set.seed(123)
x <- rpois(n = 10000, lambda = 2)
par(mfrow = c(1, 2))
hist(x, main = "hist function")
histo(x, main = "histo function")

# Sample 10,000 lognormal(0, 0.35) values. Create histogram with curves
# showing fitted lognormal, normal, and gamma PDFs.
set.seed(123)
x <- rlnorm(n = 10000, meanlog = 0, sdlog = 0.35)
par(mfrow = c(1, 1))
histo(x, c("lnorm", "norm", "gamma"), main = "X ~ Lognormal(0, 0.35)")

# Generate 10,000 Binomial(5, 0.25) values. Create histogram, specifying
# size = 5, with blue line/points showing fitted PMF.
set.seed(123)
x <- rbinom(n = 10000, size = 5, prob = 0.25)
par(mfrow = c(1, 1))
histo(x, dis = "binom", size = 5, colors = "blue", 
      points_list = list(type = "b"))


[Package dvmisc version 1.1.4 Index]