seawaveQPlots2 {seawaveQ}

R Documentation

Internal function that generates plots of data and model results.

Description

seawaveQPlots2 is usually called from within fitMod but can be invoked directly. It generates plots of data and model results, as well as diagnostic plots, and returns the observed and predicted concentrations so that users may plot the concentrations using their own functions. This is the version for models that use restricted cubic splines.

Usage

seawaveQPlots2(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndrcs,
  tndrcspr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 2,
  numk,
  plotfile = FALSE
)

Arguments

`stpars`	is a matrix of information about the best seawaveQ model for the concentration data; see `examplestpars`.
`cmaxt`	is the decimal season of maximum chemical concentration.
`tseas`	is the decimal season of each concentration value in cdatsub.
`tseaspr`	is the decimal season date used to model concentration using the continuous data set cavdat.
`tndrcs`	is the decimal time centered on the midpoint of the trend for the sample data, cdatsub, then converted to a linear tail-restricted cubic spline with a particular number of knots (Harrell, 2010, 2018).
`tndrcspr`	is the decimal time centered on the midpoint of the trend for the continuous data, cavdat, then converted to a linear tail-restricted cubic spline using the knots from tndrcs.
`cdatsub`	is the concentration data.
`cavdat`	is the continuous (daily) ancillary data.
`cavmat`	is a matrix containing the continuous ancillary variables.
`clog`	is a vector of the base-10 logarithms of the concentration data.
`centmp`	is a logical vector indicating which concentration values are censored.
`yrstart`	is the starting year of the analysis (treated as January 1 of that year).
`yrend`	is the ending year of the analysis (treated as December 31 of that year).
`tyr`	is a vector of decimal dates for the concentration data.
`tyrpr`	is a vector of decimal dates for the continuous ancillary variables.
`pnames`	is the parameter (water-quality constituent) to analyze (if using USGS parameters, omit the starting 'P', such as "00945" for sulfate).
`tanm`	is a character identifier that names the trend analysis run. It is used to label output files.
`mclass`	indicates the class of model to use. A class 1 model is the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model.
`numk`	is the number of knots in the restricted cubic spline model. The default is 4, and the recommended number is 3–7.
`plotfile`	is FALSE by default. TRUE writes PDF files of the plots to the user's file system.
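
To make the time-related arguments more concrete, the following base R sketch shows one way that a decimal season, a centered decimal time, and a linear tail-restricted cubic spline basis could be built from decimal dates. It is an illustration only: the helper rcs_basis, the example dates, the midpoint convention, and the knot quantiles are assumptions made for this sketch and are not taken from the seawaveQ source. The basis has the general truncated-power form described by Harrell (2010), which is linear beyond the outer knots.

```r
# Hypothetical helper (not part of seawaveQ): restricted cubic spline basis
# that is linear below the first knot and above the last knot.
rcs_basis <- function(x, knots) {
  k <- length(knots)
  stopifnot(k >= 3, !is.unsorted(knots, strictly = TRUE))
  pos3 <- function(u) pmax(u, 0)^3      # truncated cubic, (u)+^3
  tk  <- knots[k]                       # last knot
  tk1 <- knots[k - 1]                   # next-to-last knot
  nonlin <- sapply(knots[seq_len(k - 2)], function(tj) {
    pos3(x - tj) -
      pos3(x - tk1) * (tk - tj) / (tk - tk1) +
      pos3(x - tk)  * (tk1 - tj) / (tk - tk1)
  })
  # Scale the nonlinear terms by the squared knot range (a common
  # normalization) and prepend the linear term.
  cbind(x, nonlin / (tk - knots[1])^2)
}

# Made-up monthly decimal dates for a 1995-2009 analysis period.
yrstart <- 1995
yrend   <- 2009
tyr <- seq(yrstart, yrend + 11 / 12, by = 1 / 12)

# Assumption: the decimal season is the fractional part of the decimal date.
tseas_demo <- tyr - floor(tyr)

# Assumption: the trend midpoint lies halfway between January 1 of yrstart
# and December 31 of yrend.
tmid   <- (yrstart + yrend + 1) / 2
tndlin <- tyr - tmid

# Assumption: four knots placed at fixed quantiles of the centered time.
numk  <- 4
knots <- quantile(tndlin, c(0.05, 0.35, 0.65, 0.95), names = FALSE)

# A basis with numk knots has numk - 1 columns (1 linear, numk - 2 nonlinear).
tndrcs_demo <- rcs_basis(tndlin, knots)
dim(tndrcs_demo)
```

A spline for a second time vector, such as the one for the continuous data, would reuse the same knots so that both bases describe the same trend curve.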

Value

A PDF file (if plotfile is TRUE) containing plots of the data and modeled concentrations, along with regression diagnostic plots, and a list containing the observed concentrations (censored and uncensored) and the predicted concentrations used for the plot.

Note

The plotting position used for representing censored values in the plots produced by seawaveQPlots2 is an important consideration for interpreting model fit. Plotting censored values at the censoring limit, or at something smaller such as one-half of the censoring limit, produces plots that are difficult to interpret when there are many censored values. Therefore, to make the plots more representative of diagnostic plots used for standard (non-censored) regression, a method for substituting randomized residuals in place of censored residuals was used. If a log-transformed concentration is censored at a particular limit, logC < L, then the residual for that concentration is censored as well, logC - fitted(logC) < L - fitted(logC) = rescen. In that case, a randomized residual was generated from a conditional normal distribution

resran <- scl * qnorm(runif(1) * pnorm(rescen / scl)),

where scl is the scale parameter from the survival regression model, pnorm is the R function for computing cumulative normal probabilities, runif is the R function for generating a random variable from the uniform distribution, and qnorm is the R function for computing quantiles of the normal distribution. Under the assumption that the model residuals are uncorrelated, normally distributed random variables with mean zero and standard deviation scl, the randomized residuals generated in this manner are an unbiased sample of the true (but unknown) residuals for the censored data. This is an application of the probability integral transform (Mood and others, 1974) to generate random variables from continuous distributions. The plotting position used for a censored concentration is fitted(logC) + resran. Note that each time a new model fit is performed, a new set of randomized residuals is generated and thus the plotting positions for censored values can change.
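
The randomized-residual calculation described above can be tried with a few lines of base R. The values of scl, the fitted values, and the censoring limits below are made up for illustration, and the expression is vectorized with runif(length(rescen)) rather than runif(1); this is a sketch of the formula, not the code used inside seawaveQPlots2.

```r
set.seed(1)                                  # only to make the sketch reproducible

scl         <- 0.4                           # hypothetical scale parameter
fitted_logc <- c(-1.2, -0.8, -1.5, -0.9)     # hypothetical fitted log concentrations
loglimit    <- rep(-1.0, 4)                  # hypothetical censoring limits, L

# Censored residual: the true residual is known only to be below this value.
rescen <- loglimit - fitted_logc

# Draw from a normal distribution with mean 0 and standard deviation scl,
# conditional on the draw being below rescen.
resran <- scl * qnorm(runif(length(rescen)) * pnorm(rescen / scl))

# Plotting position for each censored concentration.
plotpos <- fitted_logc + resran

# Every randomized residual stays below its censoring bound, so every
# plotting position stays below the censoring limit.
all(resran < rescen)
all(plotpos < loglimit)
```

Because runif() returns values strictly between 0 and 1, the argument passed to qnorm() is always smaller than pnorm(rescen / scl), which is why each randomized residual falls below rescen. Removing set.seed() shows how the plotting positions change from one run to the next.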

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Harrell, F.E., Jr., 2010, Regression modeling strategies—With applications to linear models, logistic regression, and survival analysis: New York, Springer-Verlag, 568 p.

Harrell, F.E., Jr., 2018, rms—Regression modeling strategies: R package version 5.1-2, https://CRAN.R-project.org/package=rms.

Mood, A.M., Graybill, F.A., and Boes, D.C., 1974, Introduction to the theory of statistics (3d ed.): New York, McGraw-Hill, Inc., 564 p.

[Package seawaveQ version 2.0.2 Index]