seawaveQPlots2 {seawaveQ}    R Documentation

Internal function that generates plots of data and model results.

Description

seawaveQPlots2 is usually called from within fitMod but can be invoked directly. It generates plots of data and model results, as well as diagnostic plots, and returns the observed and predicted concentrations so that users may plot the concentrations using their own functions. This is the version for models that use restricted cubic splines.

Usage

seawaveQPlots2(
  stpars,
  cmaxt,
  tseas,
  tseaspr,
  tndrcs,
  tndrcspr,
  cdatsub,
  cavdat,
  cavmat,
  clog,
  centmp,
  yrstart,
  yrend,
  tyr,
  tyrpr,
  pnames,
  tanm,
  mclass = 2,
  numk,
  plotfile = FALSE
)

Arguments

stpars

is a matrix of information about the best seawaveQ model for the concentration data, see examplestpars.

cmaxt

is the decimal season of maximum chemical concentration.

tseas

is the decimal season of each concentration value in cdatsub.
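
As a hedged illustration of what a decimal season looks like, the sketch below assumes it is the fractional part of a decimal date, so that values run from 0 at the start of a year toward 1 at its end. The vector tyr_example and its values are hypothetical, and the package may compute tseas differently internally.

```r
# Hypothetical decimal dates (year plus the fraction of the year elapsed).
tyr_example <- c(1995.25, 1995.75, 1996.5, 2001.0)

# Assumed convention: the decimal season is the fractional part of the
# decimal date, so 0 is the start of the year and values approach 1 at its end.
tseas_example <- tyr_example - floor(tyr_example)

print(tseas_example)
```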

tseaspr

is the decimal season of each date in the continuous data set cavdat, used to model concentration.

tndrcs

is the decimal time centered on the midpoint of the trend for the sample data, cdatsub, then converted to a linear tail-restricted cubic spline with a particular number of knots (Harrell, 2010, 2018).

tndrcspr

is the decimal time centered on the midpoint of the trend for the continuous data, cavdat, then converted to a linear tail-restricted cubic spline using the knots from tndrcs.
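
The conversion described for tndrcs and tndrcspr can be sketched in base R as follows. This is a minimal illustration of the truncated power form of a linear tail-restricted cubic spline (Harrell, 2010), not the package's own code, which may rely on the rms package instead. The helper rcs_basis, the example dates, and the knot placement at the 5th, 35th, 65th, and 95th percentiles are all assumptions made for this example.

```r
# Hypothetical helper: linear tail-restricted cubic spline basis in the
# truncated power form described by Harrell (2010).
rcs_basis <- function(x, knots) {
  k <- length(knots)
  stopifnot(k >= 3)
  pos_cube <- function(u) pmax(u, 0)^3
  # Scale factor that keeps the nonlinear terms on the scale of x.
  scale <- (knots[k] - knots[1])^2
  gap <- knots[k] - knots[k - 1]
  nonlinear <- do.call(cbind, lapply(seq_len(k - 2), function(j) {
    (pos_cube(x - knots[j]) -
       pos_cube(x - knots[k - 1]) * (knots[k] - knots[j]) / gap +
       pos_cube(x - knots[k]) * (knots[k - 1] - knots[j]) / gap) / scale
  }))
  # First column is the linear term; the other k - 2 columns are nonlinear.
  cbind(x, nonlinear)
}

# Hypothetical decimal dates, centered on the midpoint of the trend period.
tyr_example <- seq(1995, 2015, by = 0.1)
tmid <- (min(tyr_example) + max(tyr_example)) / 2
tcent <- tyr_example - tmid

# Assumed knot placement for 4 knots (an assumption of this sketch).
knots_example <- quantile(tcent, probs = c(0.05, 0.35, 0.65, 0.95),
                          names = FALSE)

# Basis for the sample dates.
tndrcs_example <- rcs_basis(tcent, knots_example)

# Reuse the same knots on a denser grid, mirroring how tndrcspr reuses
# the knots from tndrcs.
tcent_pr <- seq(min(tcent), max(tcent), by = 0.01)
tndrcspr_example <- rcs_basis(tcent_pr, knots_example)

print(dim(tndrcs_example))
```

With k knots the basis has k - 1 columns. Every nonlinear column is zero below the first knot and linear above the last knot, which is what "linear tail-restricted" refers to.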

cdatsub

is the concentration data.

cavdat

is the continuous (daily) ancillary data.

cavmat

is a matrix containing the continuous ancillary variables.

clog

is a vector of the base-10 logarithms of the concentration data.

centmp

is a logical vector indicating which concentration values are censored.

yrstart

is the starting year of the analysis (treated as January 1 of that year).

yrend

is the ending year of the analysis (treated as December 31 of that year).

tyr

is a vector of decimal dates for the concentration data.

tyrpr

is a vector of decimal dates for the continuous ancillary variables.

pnames

is the parameter (water-quality constituent) to analyze. If using USGS parameter codes, omit the leading 'P'; for example, use "00945" for sulfate.

tanm

is a character identifier that names the trend analysis run. It is used to label output files.

mclass

indicates the class of model to use. A class 1 model is the traditional SEAWAVE-Q model that has a linear time trend. A class 2 model is a newer option for longer trend periods that uses a set of restricted cubic splines on the time variable to provide a more flexible model.

numk

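is the number of knots in the restricted cubic spline model. The recommended number is 3–7. The default of 4 is supplied by fitMod; the usage above defines no default for numk, so a value must be given when seawaveQPlots2 is invoked directly.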

plotfile

is FALSE by default. If TRUE, PDF files of the plots are written to the user's file system.

Value

A list containing the observed concentrations (censored and uncensored) and the predicted concentrations used for the plots. If plotfile is TRUE, a PDF file is also written that contains plots of the data and modeled concentrations, along with regression diagnostic plots.

Note

The plotting position used to represent censored values in the plots produced by seawaveQPlots2 is an important consideration when interpreting model fit. Plotting censored values at the censoring limit, or at something smaller such as one-half of the censoring limit, produces plots that are difficult to interpret when there are many censored values. Therefore, to make the plots more representative of diagnostic plots used for standard (non-censored) regression, randomized residuals are substituted for censored residuals. If a log-transformed concentration is censored at a particular limit, logC < L, then the residual for that concentration is censored as well, logC - fitted(logC) < L - fitted(logC) = rescen. In that case, a randomized residual is generated from a conditional normal distribution

resran <- scl * qnorm(runif(1) * pnorm(rescen / scl)),

where scl is the scale parameter from the survival regression model, pnorm is the R function for computing cumulative normal probabilities, runif is the R function for generating a random variable from the uniform distribution, and qnorm is the R function for computing quantiles of the normal distribution. Under the assumption that the model residuals are uncorrelated, normally distributed random variables with mean zero and standard deviation scl, the randomized residuals generated in this manner are an unbiased sample of the true (but unknown) residuals for the censored data. This is an application of the probability integral transform (Mood and others, 1974) to generate random variables from continuous distributions. The plotting position used for a censored concentration is fitted(logC) + resran. Note that each time a new model fit is performed, a new set of randomized residuals is generated and thus the plotting positions for censored values can change.
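
The randomized-residual calculation described in this note can be tried on made-up numbers, as in the sketch below. The values of scl, fitted_logc, and cens_limit are hypothetical, and the formula is vectorized with runif(length(rescen)) rather than runif(1) so that each censored value receives its own draw. It illustrates the formula only and is not the package's plotting code.

```r
set.seed(1)  # only so that this illustration is reproducible

# Hypothetical values for illustration (log10 concentration units).
scl <- 0.5                          # scale parameter of the fitted model
fitted_logc <- c(-1.2, -0.8, -1.0)  # fitted values for three censored samples
cens_limit <- c(-1.0, -1.0, -1.3)   # log10 censoring limits L (logC < L)

# Upper bound on each censored residual.
rescen <- cens_limit - fitted_logc

# Draw from a normal(0, scl) distribution truncated above at rescen:
# a uniform draw on (0, pnorm(rescen / scl)) is mapped back through qnorm.
resran <- scl * qnorm(runif(length(rescen)) * pnorm(rescen / scl))

# Plotting position for each censored concentration.
plot_pos <- fitted_logc + resran

print(cbind(rescen, resran, plot_pos, cens_limit))
```

Because the uniform draw is scaled by pnorm(rescen / scl), each randomized residual falls below its bound rescen, so each plotting position falls below the corresponding censoring limit.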

Author(s)

Aldo V. Vecchia and Karen R. Ryberg

References

Harrell, F.E., Jr., 2010, Regression modeling strategies—With applications to linear models, logistic regression, and survival analysis: New York, Springer-Verlag, 568 p.

Harrell, F.E., Jr., 2018, rms—Regression modeling strategies: R package version 5.1-2, https://CRAN.R-project.org/package=rms.

Mood, A.M., Graybill, F.A., and Boes, D.C., 1974, Introduction to the theory of statistics (3d ed.): New York, McGraw-Hill, Inc., 564 p.


[Package seawaveQ version 2.0.2 Index]