plot_sim_cdf {SimMultiCorrData}    R Documentation

Plot Simulated (Empirical) Cumulative Distribution Function for Continuous, Ordinal, or Count Variables

Description

This function plots the cumulative distribution function of simulated continuous, ordinal, or count data using the empirical cdf Fn (see stat_ecdf). Fn is a step function with jumps of i/n at the observed values, where i is the number of tied observations at that value. Missing values are ignored. For observations y = (y1, y2, ..., yn), Fn is the fraction of observations less than or equal to t, i.e., Fn(t) = sum[yi <= t]/n.

If calc_cprob = TRUE and the variable is continuous, the cumulative probability up to y = delta is calculated (see sim_cdf_prob), the corresponding region on the plot is filled, and a dashed horizontal line is drawn at Fn(delta). The cumulative probability is stated above the line. This fill option does not work for ordinal or count variables.

The function returns a ggplot2-package object so the user can modify it as necessary. The graph parameters (i.e., title, color, fill, hline) are ggplot2-package parameters. The function works for valid or invalid power method pdfs.
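The definition of Fn above can be checked by hand in base R. The following is a minimal sketch, not code from the package: the vector y and the helper Fn_manual are illustrative assumptions, and stats::ecdf is used only as a reference implementation of the same step function.

```r
# Illustrative data (not from the package): a tie at 3 and one missing value
y <- c(2, 3, 3, 5, 8, NA)
y_obs <- y[!is.na(y)]          # missing values are ignored
n <- length(y_obs)

# Hypothetical helper: Fn(t) = (number of observations <= t) / n
Fn_manual <- function(t) sum(y_obs <= t) / n

# Base R's ecdf() builds the same step function
Fn <- ecdf(y_obs)

Fn_manual(3)                   # fraction of observations <= 3
Fn(3)                          # same value from ecdf()

# The jump at the tied value 3 is i/n with i = 2 tied observations
jump_at_3 <- Fn(3) - Fn(2)
jump_at_3
```

With calc_cprob = TRUE, the quantity reported on the plot is the same kind of value, Fn(delta), evaluated on the simulated data.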

Usage

plot_sim_cdf(sim_y, title = "Empirical Cumulative Distribution Function",
  ylower = NULL, yupper = NULL, calc_cprob = FALSE, delta = 5,
  color = "dark blue", fill = "blue", hline = "dark green",
  text.size = 11, title.text.size = 15, axis.text.size = 10,
  axis.title.size = 13)

Arguments

sim_y

a vector of simulated data

title

the title for the graph (default = "Empirical Cumulative Distribution Function")

ylower

the lower y value to use in the plot (default = NULL, uses minimum simulated y value)

yupper

the upper y value (default = NULL, uses maximum simulated y value)

calc_cprob

if TRUE (default = FALSE) and sim_y is continuous, sim_cdf_prob is used to find the empirical cumulative probability up to y = delta; the corresponding region on the plot is filled and a dashed horizontal line is drawn at Fn(delta)

delta

the value y at which to evaluate the cumulative probability (default = 5)

color

the line color for the cdf (default = "dark blue")

fill

the fill color if calc_cprob = TRUE (default = "blue")

hline

the color of the dashed horizontal line drawn at Fn(delta) if calc_cprob = TRUE (default = "dark green")

text.size

the size of the text displaying the cumulative probability up to delta if calc_cprob = TRUE (default = 11)

title.text.size

the size of the plot title (default = 15)

axis.text.size

the size of the axes text (tick labels) (default = 10)

axis.title.size

the size of the axes titles (default = 13)

Value

A ggplot2-package object.
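Because the return value is a ggplot2 object, further layers and theme settings can be added with the usual + syntax. The sketch below is only an approximation under stated assumptions: it does not call plot_sim_cdf, but builds a stand-in plot directly with stat_ecdf on simulated logistic data, and the axis labels, the cutoff of 1, and the object names p and p2 are hypothetical choices for illustration.

```r
library(ggplot2)

set.seed(1234)
y <- rlogis(1000)              # stand-in data, not output from the package

# Stand-in for the returned object: an empirical cdf drawn with stat_ecdf
p <- ggplot(data.frame(y = y), aes(x = y)) +
  stat_ecdf(geom = "step", colour = "darkblue") +
  labs(title = "Empirical Cumulative Distribution Function",
       x = "y", y = "Cumulative probability")

# Modify the plot afterwards: add a dashed line at Fn(1) and change the theme
p2 <- p +
  geom_hline(yintercept = mean(y <= 1), linetype = "dashed",
             colour = "darkgreen") +
  theme_bw() +
  theme(plot.title = element_text(size = 15))

p2
```

The same pattern should apply to the object returned by plot_sim_cdf, for example adding theme_bw() or overriding the title with labs().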

References

Please see the references for plot_cdf.

Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.

See Also

ecdf, sim_cdf_prob, ggplot2-package, stat_ecdf, geom_abline, geom_ribbon

Examples

## Not run: 
# Logistic Distribution: mean = 0, variance = 1
seed = 1234

# Find standardized cumulants
stcum <- calc_theory(Dist = "Logistic", params = c(0, 1))

# Simulate without the sixth cumulant correction
# (invalid power method pdf)
Logvar1 <- nonnormvar1(method = "Polynomial", means = 0, vars = 1,
                      skews = stcum[3], skurts = stcum[4],
                      fifths = stcum[5], sixths = stcum[6], seed = seed)

# Plot cdf with cumulative probability calculated up to delta = 5
plot_sim_cdf(sim_y = Logvar1$continuous_variable,
             title = "Invalid Logistic Empirical CDF",
             calc_cprob = TRUE, delta = 5)

# Simulate with the sixth cumulant correction
# (valid power method pdf)
Logvar2 <- nonnormvar1(method = "Polynomial", means = 0, vars = 1,
                      skews = stcum[3], skurts = stcum[4],
                      fifths = stcum[5], sixths = stcum[6],
                      Six = seq(1.5, 2, 0.05), seed = seed)

# Plot cdf with cumulative probability calculated up to delta = 5
plot_sim_cdf(sim_y = Logvar2$continuous_variable,
             title = "Valid Logistic Empirical CDF",
             calc_cprob = TRUE, delta = 5)

# Simulate one binary and one ordinal variable (4 categories) with
# correlation 0.3
Ordvars = rcorrvar(k_cat = 2, marginal = list(0.4, c(0.2, 0.5, 0.7)),
                   rho = matrix(c(1, 0.3, 0.3, 1), 2, 2), seed = seed)

# Plot cdf of 2nd variable
plot_sim_cdf(Ordvars$ordinal_variables[, 2])


## End(Not run)


[Package SimMultiCorrData version 0.2.2 Index]