plot_sim_theory {SimMultiCorrData} | R Documentation |
Plot Simulated Data and Target Distribution Data by Name or Function for Continuous or Count Variables
Description
This function plots simulated continuous or count data and, if overlay = TRUE, overlays data generated from the target distribution. The target is specified either by name (plus up to 4 parameters) or by a pdf function fx (plus support bounds). Because evaluating the cdf from fx requires integration, only a continuous fx may be supplied. Both data sets are plotted as histograms.
If a continuous target distribution is specified (cont_var = TRUE), the simulated data y is scaled and then transformed (i.e. y = sigma * scale(y) + mu) so that it has the same mean (mu) and variance (sigma^2) as the target distribution. If the variable is Negative Binomial, the parameters must be size and success probability (not mu).
The function returns a ggplot2-package object so the user can modify it as necessary. The graph parameters (i.e. title, power_color, target_color) are ggplot2-package parameters. The function works for valid or invalid power method pdfs.
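The rescaling step described above can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's internal code: `mu` and `sigma` stand for the target mean and standard deviation, their values are chosen arbitrarily, and the exponential sample is only a stand-in for a simulated variable.

```r
# Minimal sketch of the rescaling y = sigma * scale(y) + mu (base R only).
set.seed(1234)
y <- rexp(1000, rate = 2)  # arbitrary stand-in for a simulated variable

mu <- 2      # assumed target mean (illustrative value)
sigma <- 3   # assumed target standard deviation (illustrative value)

# scale() centers y at 0 and divides by its sample standard deviation;
# as.numeric() drops the one-column matrix structure that scale() returns.
y_rescaled <- sigma * as.numeric(scale(y)) + mu

mean(y_rescaled)  # equals mu up to floating-point error
var(y_rescaled)   # equals sigma^2 up to floating-point error
```

Because the transformation is linear, it changes only the location and spread of the simulated data; the shape of its histogram is unaffected.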
Usage
plot_sim_theory(sim_y, title = "Simulated Data Values", ylower = NULL,
  yupper = NULL, power_color = "dark blue", overlay = TRUE,
  cont_var = TRUE, target_color = "dark green", nbins = 100,
  Dist = c("Benini", "Beta", "Beta-Normal", "Birnbaum-Saunders", "Chisq",
  "Dagum", "Exponential", "Exp-Geometric", "Exp-Logarithmic", "Exp-Poisson",
  "F", "Fisk", "Frechet", "Gamma", "Gaussian", "Gompertz", "Gumbel",
  "Kumaraswamy", "Laplace", "Lindley", "Logistic", "Loggamma", "Lognormal",
  "Lomax", "Makeham", "Maxwell", "Nakagami", "Paralogistic", "Pareto", "Perks",
  "Rayleigh", "Rice", "Singh-Maddala", "Skewnormal", "t", "Topp-Leone",
  "Triangular", "Uniform", "Weibull", "Poisson", "Negative_Binomial"),
  params = NULL, fx = NULL, lower = NULL, upper = NULL, seed = 1234,
  sub = 1000, legend.position = c(0.975, 0.9),
  legend.justification = c(1, 1), legend.text.size = 10,
  title.text.size = 15, axis.text.size = 10, axis.title.size = 13)
Arguments
sim_y |
a vector of simulated data |
title |
the title for the graph (default = "Simulated Data Values") |
ylower |
the lower y value to use in the plot (default = NULL, uses minimum simulated y value) |
yupper |
the upper y value (default = NULL, uses maximum simulated y value) |
power_color |
the histogram fill color for the simulated variable (default = "dark blue") |
overlay |
if TRUE (default), the target distribution is also plotted given either a distribution name (and parameters) or pdf function fx (with support bounds = lower, upper) |
cont_var |
TRUE (default) for continuous variables, FALSE for count variables |
target_color |
the histogram fill color for the target distribution (default = "dark green") |
nbins |
the number of bins to use when creating the histograms (default = 100) |
Dist |
name of the distribution. The possible values are: "Benini", "Beta", "Beta-Normal", "Birnbaum-Saunders", "Chisq", "Dagum",
"Exponential", "Exp-Geometric", "Exp-Logarithmic", "Exp-Poisson", "F", "Fisk", "Frechet", "Gamma", "Gaussian", "Gompertz",
"Gumbel", "Kumaraswamy", "Laplace", "Lindley", "Logistic", "Loggamma", "Lognormal", "Lomax", "Makeham", "Maxwell",
"Nakagami", "Paralogistic", "Pareto", "Perks", "Rayleigh", "Rice", "Singh-Maddala", "Skewnormal", "t", "Topp-Leone", "Triangular",
"Uniform", "Weibull", "Poisson", and "Negative_Binomial".
Please refer to the documentation for each package (either |
params |
a vector of parameters (up to 4) for the desired distribution (keep NULL if |
fx |
a pdf input as a function of x only, e.g. fx <- function(x) 0.5*(x-1)^2; must return a scalar
(keep NULL if |
lower |
the lower support bound for a supplied fx, else keep NULL (note: if an error is thrown from |
upper |
the upper support bound for a supplied fx, else keep NULL (note: if an error is thrown from |
seed |
the seed value for random number generation (default = 1234) |
sub |
the number of subdivisions to use in the integration to calculate the cdf from fx; if no result, try increasing sub (requires longer computation time; default = 1000) |
legend.position |
the position of the legend (default = c(0.975, 0.9)) |
legend.justification |
the justification of the legend (default = c(1, 1)) |
legend.text.size |
the size of the legend labels (default = 10) |
title.text.size |
the size of the plot title (default = 15) |
axis.text.size |
the size of the axes text, i.e. the tick labels (default = 10) |
axis.title.size |
the size of the axes titles (default = 13) |
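The roles of fx, lower, upper, and sub can be illustrated with a short base R sketch. It is an assumption about the general technique (numerical integration of the pdf, then inversion of the cdf), not the package's actual implementation. The helper `cdf_fx` is hypothetical, and the support bounds are chosen here so that the example pdf integrates to 1.

```r
# Sketch: cdf from a user-supplied pdf via numerical integration (base R).
fx <- function(x) 0.5 * (x - 1)^2

# Assumed support bounds: the antiderivative of fx is (x - 1)^3 / 6,
# so these limits make fx integrate to 1.
lower <- 1 - 3^(1 / 3)
upper <- 1 + 3^(1 / 3)
sub <- 1000  # number of subdivisions passed to integrate()

# Hypothetical helper: cdf evaluated at a single point q
cdf_fx <- function(q) {
  integrate(fx, lower = lower, upper = q, subdivisions = sub)$value
}

cdf_fx(upper)  # total probability, approximately 1
cdf_fx(1)      # approximately 0.5, by symmetry about x = 1

# Inverse-cdf sampling: solve cdf_fx(q) = p for uniform draws p
set.seed(1234)
u <- runif(200)
draws <- vapply(u, function(p) {
  uniroot(function(q) cdf_fx(q) - p, lower = lower, upper = upper)$root
}, numeric(1))
range(draws)  # all draws lie within [lower, upper]
```

If the bounds do not enclose the full support, the cdf never reaches 1 and the root search can fail, which is one reason the support bounds must be supplied together with fx.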
Value
A ggplot2-package object.
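Because the return value is a ggplot2 object, further layers and theme settings can be added with `+`. The sketch below assumes the ggplot2 package is installed and uses a plain histogram, `p`, as a hypothetical stand-in for the object that plot_sim_theory would return; the labels and theme values are illustrative only.

```r
library(ggplot2)

# Hypothetical stand-in for the object returned by plot_sim_theory()
set.seed(1234)
df <- data.frame(y = rnorm(500))
p <- ggplot(df, aes(x = y)) +
  geom_histogram(bins = 30, fill = "darkblue")

# Modify the returned plot: new labels and a centered, resized title
p2 <- p +
  labs(x = "y", y = "Count", title = "Customized title") +
  theme(plot.title = element_text(size = 15, hjust = 0.5))

inherits(p2, "ggplot")  # the modified plot is still a ggplot object
```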
References
Please see the references for plot_cdf.
Wickham H. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2009.
See Also
calc_theory, ggplot2-package, geom_histogram
Examples
## Not run: 
library(SimMultiCorrData)

# Logistic Distribution: mean = 0, variance = 1
seed <- 1234

# Find standardized cumulants
stcum <- calc_theory(Dist = "Logistic", params = c(0, 1))

# Simulate without the sixth cumulant correction
# (invalid power method pdf)
Logvar1 <- nonnormvar1(method = "Polynomial", means = 0, vars = 1,
                       skews = stcum[3], skurts = stcum[4],
                       fifths = stcum[5], sixths = stcum[6],
                       n = 10000, seed = seed)

# Plot simulated variable (invalid) and data from theoretical distribution
plot_sim_theory(sim_y = Logvar1$continuous_variable,
                title = "Invalid Logistic Simulated Data Values",
                overlay = TRUE, Dist = "Logistic", params = c(0, 1),
                seed = seed)

# Simulate with the sixth cumulant correction
# (valid power method pdf)
Logvar2 <- nonnormvar1(method = "Polynomial", means = 0, vars = 1,
                       skews = stcum[3], skurts = stcum[4],
                       fifths = stcum[5], sixths = stcum[6],
                       Six = seq(1.5, 2, 0.05), n = 10000, seed = seed)

# Plot simulated variable (valid) and data from theoretical distribution
plot_sim_theory(sim_y = Logvar2$continuous_variable,
                title = "Valid Logistic Simulated Data Values",
                overlay = TRUE, Dist = "Logistic", params = c(0, 1),
                seed = seed)

# Simulate 2 Negative Binomial distributions and correlation 0.3
# using Method 1
NBvars <- rcorrvar(k_nb = 2, size = c(10, 15), prob = c(0.4, 0.3),
                   rho = matrix(c(1, 0.3, 0.3, 1), 2, 2), seed = seed)

# Plot pdfs of 1st simulated variable and theoretical distribution
plot_sim_theory(sim_y = NBvars$Neg_Bin_variable[, 1], overlay = TRUE,
                cont_var = FALSE, Dist = "Negative_Binomial",
                params = c(10, 0.4))

## End(Not run)
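The Description notes that Negative Binomial parameters must be given as size and success probability rather than mu. When only a mean is known, the conversion can be sketched in base R as follows; the values of `size` and `mu` are assumptions chosen to reproduce the params = c(10, 0.4) used in the last example.

```r
# Convert a mean-based Negative Binomial specification to (size, prob).
size <- 10
mu <- 15  # assumed mean, for illustration only

# Success probability implied by size and mu
prob <- size / (size + mu)
prob  # 0.4

# Both parameterizations of stats::dnbinom describe the same distribution
x <- 0:50
all.equal(dnbinom(x, size = size, prob = prob),
          dnbinom(x, size = size, mu = mu))

# Vector in the (size, prob) form expected by the params argument
params <- c(size, prob)
```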