seawaveQPlots2 {seawaveQ} | R Documentation |
Internal function that generates plots of data and model results.
Description
seawaveQPlots2 is usually called from within fitMod but can be invoked
directly. It generates plots of data and model results, as well as
diagnostic plots, and returns the observed and predicted concentrations
so that users may plot the concentrations using their own functions.
This is the version for models that use restricted cubic splines.
Usage
seawaveQPlots2(
stpars,
cmaxt,
tseas,
tseaspr,
tndrcs,
tndrcspr,
cdatsub,
cavdat,
cavmat,
clog,
centmp,
yrstart,
yrend,
tyr,
tyrpr,
pnames,
tanm,
mclass = 2,
numk,
plotfile = FALSE
)
Arguments
stpars |
is a matrix of information about the best seawaveQ model
for the concentration data, see |
cmaxt |
is the decimal season of maximum chemical concentration. |
tseas |
is the decimal season of each concentration value in cdatsub. |
tseaspr |
is the decimal season date used to model concentration using the continuous data set cavdat. |
tndrcs |
is the decimal time centered on the midpoint of the trend for the sample data, cdatsub, then converted to a linear tail-restricted cubic spline with a particular number of knots (Harrell, 2010, 2018). |
tndrcspr |
is the decimal time centered on the midpoint of the trend for the continuous data, cavdat, then converted to a linear tail-restricted cubic spline using the knots from tndrcs. |
cdatsub |
is the concentration data. |
cavdat |
is the continuous (daily) ancillary data. |
cavmat |
is a matrix containing the continuous ancillary variables. |
clog |
is a vector of the base-10 logarithms of the concentration data. |
centmp |
is a logical vector indicating which concentration values are censored. |
yrstart |
is the starting year of the analysis (treated as January 1 of that year). |
yrend |
is the ending year of the analysis (treated as December 31 of that year). |
tyr |
is a vector of decimal dates for the concentration data. |
tyrpr |
is a vector of decimal dates for the continuous ancillary variables. |
pnames |
is the parameter or parameters (water-quality constituents) to analyze. If using USGS parameter codes, omit the leading 'P', for example "00945" for sulfate. |
tanm |
is a character identifier that names the trend analysis run. It is used to label output files. |
mclass |
indicates the class of model to use. A class 1 model is the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model. |
numk |
is the number of knots in the restricted cubic spline model. The default is 4, and the recommended number is 3–7. |
plotfile |
is FALSE by default. If TRUE, PDF files of the plots are written to the user's file system. |
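As a rough illustration of how some of the simpler arguments relate to one another, the sketch below derives a decimal season, a centered decimal time, and log-transformed concentrations from made-up sample values. It assumes that the decimal season is the fractional part of the decimal date and that the trend midpoint is (yrstart + yrend + 1) / 2, consistent with treating yrstart as January 1 and yrend as December 31. The names conc, tmid, and tcent are hypothetical and are not arguments of seawaveQPlots2. The conversion of the centered time to a restricted cubic spline basis is not shown.

```r
# Made-up illustration values; not data from the package.
yrstart <- 1995
yrend   <- 2003
tyr     <- c(1995.25, 1998.50, 2003.75)  # decimal dates of the samples
conc    <- c(0.01, 0.10, 1.00)           # hypothetical concentrations

# Assumption: decimal season is the fractional part of the decimal date
tseas <- tyr - floor(tyr)

# Assumption: the trend midpoint is halfway between January 1 of yrstart
# and the end of December 31 of yrend
tmid  <- (yrstart + yrend + 1) / 2
tcent <- tyr - tmid                      # decimal time centered on the midpoint

# Base-10 logarithms of the concentrations, as described for clog
clog <- log10(conc)
```

In normal use these quantities are prepared by the calling function, so a sketch like this is mainly helpful for checking inputs when seawaveQPlots2 is invoked directly.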
Value
A PDF file (if plotfile is TRUE) containing plots of the data and modeled concentrations, together with regression diagnostic plots, and a list containing the observed concentrations (censored and uncensored) and the predicted concentrations used for the plots.
Note
The plotting position used for representing censored values in the
plots produced by seawaveQPlots2 is an important consideration for
interpreting model fit. Plotting values obtained by using the censoring
limit, or something smaller such as one-half of the censoring limit,
produce plots that are difficult to interpret if there are a large
number of censored values. Therefore, to make the plots more
representative of diagnostic plots used for standard (non-censored)
regression, a method for substituting randomized residuals in place of
censored residuals was used. If a log-transformed concentration is
censored at a particular limit, logC < L, then the residual for that
concentration is censored as well,
logC - fitted(logC) < L - fitted(logC) = rescen. In that case, a
randomized residual was generated from a conditional normal
distribution, resran <- scl * qnorm(runif(1) * pnorm(rescen / scl)),
where scl is the scale parameter from the survival regression model,
pnorm is the R function for computing cumulative normal probabilities,
runif is the R function for generating a random variable from the
uniform distribution, and qnorm is the R function for computing
quantiles of the normal distribution. Under the assumption that the
model residuals are uncorrelated, normally distributed random variables
with mean zero and standard deviation scl, the randomized residuals
generated in this manner are an unbiased sample of the true (but
unknown) residuals for the censored data. This is an application of the
probability integral transform (Mood and others, 1974) to generate
random variables from continuous distributions. The plotting position
used for a censored concentration is fitted(logC) + resran. Note that
each time a new model fit is performed, a new set of randomized
residuals is generated and thus the plotting positions for censored
values can change.
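
The randomized-residual substitution described in this note can be sketched in a few lines of base R. This is a minimal illustration rather than the package's internal code: the values of scl, fitted, and L are made up, the names fitted, L, u, and plotpos are hypothetical, and the draw is vectorized over several censored values instead of calling runif(1) once per value.

```r
# Made-up illustration values; not output from a fitted model.
set.seed(1)                      # only so the illustration is repeatable

scl    <- 0.4                    # assumed scale parameter of the survival regression
fitted <- c(-1.2, -0.8, -1.5)    # assumed fitted log10 concentrations
L      <- c(-1.0, -1.0, -1.3)    # assumed log10 censoring limits (logC < L)

# Upper bound on each censored residual
rescen <- L - fitted

# Probability integral transform: scale a uniform draw into the interval
# (0, pnorm(rescen / scl)) and map it back through the normal quantile
# function, which samples a normal residual conditional on being
# below rescen
u      <- runif(length(rescen))
resran <- scl * qnorm(u * pnorm(rescen / scl))

# Plotting position for each censored concentration
plotpos <- fitted + resran
```

Because u * pnorm(rescen / scl) is always smaller than pnorm(rescen / scl), every randomized residual falls below its bound rescen, so every plotting position falls below the corresponding censoring limit. Removing or changing the set.seed call gives different plotting positions, which mirrors the behavior described for repeated model fits.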
Author(s)
Aldo V. Vecchia and Karen R. Ryberg
References
Harrell, F.E., Jr., 2010, Regression modeling strategies—With applications to linear models, logistic regression, and survival analysis: New York, Springer-Verlag, 568 p.
Harrell, F.E., Jr., 2018, rms—Regression modeling strategies: R package version 5.1-2, https://CRAN.R-project.org/package=rms.
Mood, A.M., Graybill, F.A., and Boes, D.C., 1974, Introduction to the theory of statistics (3d ed.): New York, McGraw-Hill, Inc., 564 p.