IASD {IASD} | R Documentation |
Model Selection for Index of Asymmetry Distribution
Description
Calculates the AIC and AICc of unimodal models (one normal distribution) and bimodal models (a mixture of two normal distributions) fitted to the distribution of indices of asymmetry (IAS), and plots their densities, to help determine whether the IAS distribution is unimodal or bimodal.
Usage
IASD(df, dfCols = NA, fixSignApproximation = FALSE,
plotGraph = TRUE, plotToScreen = FALSE, filePrefix = NA,
xlimMin = NA, xlimMax = NA, ylimMin = 0, ylimMax = NA,
dHist = NA, dFunc = NA, meanStartSymmetric = NA,
sdStartSymmetric = NA, meanStartAsymmetric = NA,
sdStartAsymmetric = NA, positiveRatioStartAsymmetric = NA,
plotSelect = rep(TRUE, 4), showLegend = TRUE,
modelName = c("FA", "DA", "AS", "Skewed AS"), xlab = NA,
ylab = NA, main = NA, freqAxis = FALSE, lineColor = "black",
nsmall = 2, fileType = "TEXT", generateFiles = TRUE, ...)
Arguments
df |
data frame containing the data to be investigated. |
dfCols |
Columns in df to be processed. If NA, the second through last columns are used for a multi-column data frame, and the sole column for a single-column data frame. |
fixSignApproximation |
If TRUE, parameters of normal distributions are determined only by using absolute values, rather than MLE. Each column can be separately controlled by using vector value. |
plotGraph |
If TRUE, histograms and density plots are plotted and saved to PDF; if FALSE, they are not plotted. |
plotToScreen |
If TRUE, plotted graphs are also shown on the screen. |
filePrefix |
File names of the saved plots and of the AIC and AICc table files start with this value. |
xlimMin |
Minimum of plot range. If NA, it is determined from the data. Each column can be separately controlled by using vector value. |
xlimMax |
Maximum of plot range. If NA, it is determined from the data. Each column can be separately controlled by using vector value. |
dHist |
Width of histogram bars. If NA, it is one 20th of the plot range. Each column can be separately controlled by using vector value. |
dFunc |
Broken line step for the plot of density functions. If NA, it is one 200th of the plot range. Each column can be separately controlled by using vector value. |
ylimMin |
Minimum of vertical axis of plots. If NA, it is determined by "hist()" function of R. Each column can be separately controlled by using vector value. |
ylimMax |
Maximum of vertical axis of plots. If NA, it is determined by "hist()" function of R. Each column can be separately controlled by using vector value. |
meanStartSymmetric |
Start value of mean for mle() in bimodal symmetric model. If NA, it is calculated by using absolute values of the data. Each column can be separately controlled by using vector value. |
sdStartSymmetric |
Start value of sd for mle() in bimodal symmetric model. If NA, it is calculated by using absolute values of the data. Each column can be separately controlled by using vector value. |
meanStartAsymmetric |
Start value of mean for mle() in bimodal asymmetric model. If NA, it is calculated by using absolute values of the data. Each column can be separately controlled by using vector value. |
sdStartAsymmetric |
Start value of sd for mle() in bimodal asymmetric model. If NA, it is calculated by using absolute values of the data. Each column can be separately controlled by using vector value. |
positiveRatioStartAsymmetric |
Start value of positiveRatio for mle() in bimodal asymmetric model. If NA, it is the ratio of positive data. Each column can be separately controlled by using vector value. |
plotSelect |
Indicate which model's density graph is plotted. |
showLegend |
If TRUE, legend of the graph is drawn. |
modelName |
Names of the four models. |
xlab |
Label of x axis. If NA, name of column is used. Each column can be separately controlled by using vector value. |
ylab |
Label of y axis. If NA, "Density" is used. Each column can be separately controlled by using vector value. |
main |
Title of graph. If NA, "Histogram of (column name)" is used. Each column can be separately controlled by using vector value. |
freqAxis |
If TRUE, an axis for frequency is drawn on the right. Each column can be separately controlled by using vector value. |
lineColor |
Color of density graphs. Four density graphs can be separately controlled by using vector value. If the first two colors are the same, the line pattern changes for each density graph. |
nsmall |
The number of digits to the right of decimal points for AIC and AICc. |
fileType |
Type of output files for calculation results of AIC and AICc. If "TEXT", the output files are tab-separated text files. If "CSV", they are CSV files. |
generateFiles |
Do not use this option. If generateFiles is FALSE, no files are generated. This option exists only to pass the strict checks of CRAN. |
... |
Other parameters are passed to hist() function. |
Details
Calculate AIC and AICc for the following four models and plot their densities.
unimodal symmetric distribution (normal distribution with mean = 0)
N(0, sd^2)
unimodal asymmetric distribution (normal distribution)
N(mean, sd^2)
bimodal symmetric distribution (mixture of two normal distributions with opposite sign of mean but same absolute values and weights)
0.5*N(mean, sd^2) + 0.5*N(- mean, sd^2)
bimodal asymmetric distribution (weighted mixture of two normal distributions with opposite sign of mean and the same absolute values)
positiveRatio*N(mean, sd^2) + (1 - positiveRatio)*N(- mean, sd^2)
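The four model densities listed above can be written directly with dnorm(). This is a minimal sketch for illustration only: the helper names are hypothetical and are not part of the IASD package, and the package's internal implementation may differ.

```r
# Hypothetical helper names; not exported by the IASD package.
dUnimodalSymmetric <- function(x, sd) {
  dnorm(x, mean = 0, sd = sd)
}
dUnimodalAsymmetric <- function(x, mean, sd) {
  dnorm(x, mean = mean, sd = sd)
}
dBimodalSymmetric <- function(x, mean, sd) {
  0.5 * dnorm(x, mean, sd) + 0.5 * dnorm(x, -mean, sd)
}
dBimodalAsymmetric <- function(x, mean, sd, positiveRatio) {
  positiveRatio * dnorm(x, mean, sd) +
    (1 - positiveRatio) * dnorm(x, -mean, sd)
}

# Sanity check: a mixture density integrates to (approximately) 1.
# The range -30..30 covers both peaks for these illustrative parameters.
total <- integrate(dBimodalAsymmetric, -30, 30,
                   mean = 10, sd = 2, positiveRatio = 0.4)$value

# With positiveRatio = 0.5 the asymmetric mixture reduces to the symmetric one.
xs <- seq(-15, 15, by = 0.5)
sameAsSymmetric <- all.equal(dBimodalAsymmetric(xs, 10, 2, 0.5),
                             dBimodalSymmetric(xs, 10, 2))
```

The last check shows why the bimodal symmetric model is the special case of the bimodal asymmetric model with one fewer free parameter.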
Tables of AIC and AICc are saved as tab-separated text files or CSV files, depending on the fileType argument. A plot of the histogram and model densities is saved for each column.
If the start values for mle() (meanStartSymmetric, sdStartSymmetric, meanStartAsymmetric, sdStartAsymmetric, positiveRatioStartAsymmetric) are inappropriate, mle() does not work properly. If they are not assigned (NA), the start values of mean and sd are the mean and sd of the absolute values of the data, and the start value of positiveRatio is the ratio of positive data.
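As an illustration of the default start values described above, the following sketch computes them for one column and fits the bimodal symmetric model with stats4::mle(). The negative log-likelihood function nll and the variable names are assumptions made for this example; they are not taken from the package source, and IASD may call mle() with different settings.

```r
library(stats4)

# IAS column from the Examples section
x <- c(8.3, 12.7, -12.7, -7.3, -8.1)

# Default start values as described above
meanStart <- mean(abs(x))          # mean of absolute values
sdStart <- sd(abs(x))              # sd of absolute values
positiveRatioStart <- mean(x > 0)  # ratio of positive data

# Negative log-likelihood of 0.5*N(mean, sd^2) + 0.5*N(-mean, sd^2)
# (hypothetical helper, written for this sketch)
nll <- function(mean, sd) {
  -sum(log(0.5 * dnorm(x, mean, sd) + 0.5 * dnorm(x, -mean, sd)))
}

fit <- mle(nll, start = list(mean = meanStart, sd = sdStart),
           nobs = length(x))
est <- coef(fit)
```

If the optimizer fails to converge for a column, passing explicit values through meanStartSymmetric and sdStartSymmetric is the documented way to supply a better starting point.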
Value
AIC |
AIC (Akaike's information criterion) |
AICc |
AICc (AIC with a correction for finite sample sizes) |
modelName[1] |
list for the unimodal symmetric model |
modelName[2] |
list for the unimodal asymmetric model |
modelName[3] |
list for the bimodal symmetric model |
modelName[4] |
list for the bimodal asymmetric model |
mean |
estimated value of mean |
sd |
estimated value of sd |
positiveRatio |
estimated value of positiveRatio |
f |
density function |
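For reference, the standard definitions are AIC = 2k - 2 log L and AICc = AIC + 2k(k + 1)/(n - k - 1), where k is the number of estimated parameters and n is the sample size. The sketch below applies them to the unimodal symmetric model, whose maximum likelihood estimate of sd has a closed form. The function name aicc is hypothetical, and whether IASD computes its values in exactly this way is an assumption.

```r
# Hypothetical helper: AIC and AICc from a maximized log-likelihood
aicc <- function(logLik, k, n) {
  aic <- 2 * k - 2 * logLik
  c(AIC = aic, AICc = aic + 2 * k * (k + 1) / (n - k - 1))
}

# IAS column from the Examples section
x <- c(8.3, 12.7, -12.7, -7.3, -8.1)
n <- length(x)

# Unimodal symmetric model N(0, sd^2): one free parameter (sd),
# with MLE sd = sqrt(mean(x^2))
sdHat <- sqrt(mean(x^2))
ll <- sum(dnorm(x, mean = 0, sd = sdHat, log = TRUE))
res <- aicc(ll, k = 1, n = n)
```

Under these definitions the unimodal asymmetric and bimodal symmetric models have k = 2 and the bimodal asymmetric model has k = 3, so the AICc correction grows quickly for small samples such as n = 5.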
Author(s)
Satoshi Takahashi
Examples
df = data.frame(ID = c(1:5), IAS = c(8.3, 12.7, -12.7, -7.3, -8.1),
IAS2 = c(14.2, 8.8, -12.7, -8.6, -10.5),
IAS3 = c(1.04, 1.28, -0.78, -0.84, -0.85))
# Do not use the option 'generateFiles = FALSE' in your own IASD calls;
# it appears in the following commands only to pass CRAN checks.
result = IASD(df, generateFiles = FALSE) # calculate AIC's and AICc's
result = IASD(df, dfCols = c(2,4), plotGraph = FALSE, generateFiles = FALSE)
# use data in the second and fourth columns, do not plot graphs
result = IASD(df, filePrefix="P.microlepis", xlimMin = -15,
xlimMax = 15, dHist = c(1, 1, 0.1), generateFiles = FALSE)
# file name of each plot starts with "P.microlepis"; the plot range
# and the width of the histogram bars are changed