find_best_fits {forceR} | R Documentation |
Find Best Polynomial Fits for Curves
Description
Calculates the best model fits for all curves based on the Akaike Information Criterion (AIC). The function fits polynomial functions with 1 to 20 coefficients and uses the AIC to evaluate the goodness of each fit.
A model is considered a good fit when the percentage change in AIC from that model to the next (e.g. from a model with 6 coefficients to a model with 7 coefficients) is below the threshold, e.g. < 5% when threshold = 5.
The first four models meeting this criterion are plotted as colored graphs, and the AICs of these models are visualized in a second plot for each curve. The first four coefficients per curve that fulfill the criterion are stored, and at the end a histogram showing how often each coefficient was a good fit is plotted as well.
The function returns the numerical value of the coefficient that fulfilled the good-fit criterion in the most curves.
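The selection rule described above can be sketched for a single curve in base R. This is only an illustration under stated assumptions, not the forceR source code: the synthetic curve, the variable names, and the exact way the percentage change is computed (absolute change relative to the previous model's AIC) are all assumptions.

```r
# Hypothetical illustration of the good-fit criterion; not the forceR implementation.
set.seed(1)
x <- seq(0, 1, length.out = 100)
y <- sin(pi * x) + rnorm(100, sd = 0.02)  # synthetic single-peak curve (assumption)

degrees   <- 1:10  # polynomial degrees to test
threshold <- 5     # maximum percentage change in AIC

# AIC of one polynomial fit per tested degree
aics <- sapply(degrees, function(d) AIC(lm(y ~ poly(x, d))))

# Percentage change in AIC from each model to the next
# (assumed definition: absolute change relative to the previous AIC)
pct_change <- abs(diff(aics) / aics[-length(aics)]) * 100

# Degrees whose next-higher model changes the AIC by less than the threshold
good <- degrees[-length(degrees)][pct_change < threshold]

# Keep only the first four degrees that meet the criterion
first_four <- head(good, 4)
first_four
```

The last tested degree can never qualify in this sketch, because there is no following model to compare it with.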
Usage
find_best_fits(
df,
degrees = 1:20,
threshold = 5,
zero_threshold = NULL,
plot.to.screen = FALSE,
path.data = NULL,
path.plots = NULL,
show.progress = FALSE
)
Arguments
df |
The resulting tibble of the function |
degrees |
Numerical vector of polynomial degrees to test. Cannot be arbitrarily high - if too high, throws error: |
threshold |
Percentage of AIC change compared to the previous degree required to meet the good-fit criterion (see Description). Default: 5 |
zero_threshold |
Either numerical or |
plot.to.screen |
A logical value indicating if results should be plotted in the current R plot device. Default: FALSE |
path.data |
A string character defining where to save the results. If |
path.plots |
A string character defining where to save the plots. If |
show.progress |
A logical value indicating if progress should be printed to the console. Default: FALSE |
Details
This function expects a tibble with three columns as df: species, containing the species names; index, a numerical column, e.g. time (but can be any arbitrary continuous unit), for each species; and force.norm.100, containing the averaged and rescaled curve of each species.
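As a rough illustration of this input layout, the following base R sketch builds a small table with the three expected column names. The species names, curve shapes, and the helper `bell()` are made up for illustration. Whether find_best_fits accepts a plain data frame instead of a tibble is not stated here, so converting with tibble::as_tibble() may be needed.

```r
# Hypothetical input data; column names follow the Details section,
# everything else (species names, curve shapes) is invented for illustration.
index <- 1:100

# Helper producing a smooth single-peak curve (hypothetical)
bell <- function(i, centre, width) exp(-((i - centre) / width)^2)

df_sketch <- rbind(
  data.frame(species = "species_A", index = index,
             force.norm.100 = bell(index, 40, 20)),
  data.frame(species = "species_B", index = index,
             force.norm.100 = bell(index, 60, 15))
)

# Inspect the structure: one row per species and index value
str(df_sketch)
```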
Value
Returns a numerical value representing the number of coefficients that most often occurred among the first 4 models that were followed by an AIC change <= 5% in the next model. Additionally, plots showing the model fits and a histogram of the coefficients that met the 5% criterion can be plotted to the plot device or saved as PDFs in path.plots.
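The step from per-curve results to a single return value can be pictured as a frequency count. The sketch below uses invented per-curve results and assumes that ties are resolved in favor of the lowest value; how forceR handles ties is not documented here.

```r
# Hypothetical per-curve results: the first four coefficients meeting the criterion
good_fits <- list(
  curve_a = c(3, 4, 5, 6),
  curve_b = c(4, 5, 6, 7),
  curve_c = c(2, 4, 7, 8)
)

# Count how often each coefficient was a good fit across all curves
counts <- table(unlist(good_fits))

# Pick the most frequent coefficient (which.max returns the first maximum on ties)
best <- as.numeric(names(counts)[which.max(counts)])
best  # 4 in this made-up example

# The same counts could be shown as a bar chart, similar in spirit to the histogram
# barplot(counts, xlab = "coefficient", ylab = "number of curves")
```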
Examples
# Using the forceR::peaks.df.100.avg dataset:
# find smallest polynomial degree that best describes all curves
best.fit.poly <- find_best_fits(df = forceR::peaks.df.100.avg)
best.fit.poly