plot Stage.1 {NormData}    R Documentation

Check the model assumptions for a fitted Stage 1 model graphically.

Description

This function provides several plots that are useful to evaluate model assumptions. When the plot() function is applied to a fitted Stage.1 object, three panels are generated. These panels show plots that can be used (i) to evaluate the homoscedasticity assumption, (ii) to evaluate the normality assumption, and (iii) to evaluate the presence of outliers.

Usage

## S3 method for class 'Stage.1'
plot(x, Homoscedasticity=TRUE, Normality=TRUE, 
Outliers=TRUE, Assume.Homoscedasticity, Add.Jitter=0, Seed=123, 
Confidence.QQ.Normality=.99, Plots.Together=TRUE, 
Y.Lim.ResVarFunction, Group.Spec.Densities.Delta=FALSE, Main.Homosced.1,
Main.Homosced.2, Main.Norm.1, Main.Norm.2, Main.Norm.3, Main.Outliers, 
cex.axis.homo=1, cex.main.homo=1, cex.lab.homo=1,  
cex.axis.norm=1.6, cex.main.norm=1.5, cex.lab.norm=1.5,  
cex.axis.outl=1, cex.main.outl=1, cex.lab.outl=1,  
Color="red", Loess.Span=0.75, verbose=TRUE, ...)

Arguments

x

A fitted object of class Stage.1.

Homoscedasticity

Logical. Should plots to evaluate homoscedasticity be shown?
Default Homoscedasticity=TRUE.

Normality

Logical. Should plots to evaluate the normality assumption be shown? The normality plots are based on the standardized residuals in the normative dataset, which are computed as explained in the Assume.Homoscedasticity= argument documentation below. Default Normality=TRUE.

Outliers

Logical. Should plots to evaluate outliers be shown? The outlier plot is based on the standardized residuals in the normative dataset, which are computed as explained in the Assume.Homoscedasticity= argument documentation below. Default Outliers=TRUE.

Assume.Homoscedasticity

By default, the standardized residuals \widehat{\delta}_i that are shown in the normality and outlier plots are computed based on the overall residual standard error when the homoscedasticity assumption is valid (i.e., as \widehat{\delta}_i = \frac{\widehat{\varepsilon}_i}{\widehat{\sigma}_{\varepsilon}}, with \widehat{\sigma}_{\varepsilon} the overall residual standard error), or based on prediction-specific residual standard errors when the homoscedasticity assumption is invalid (i.e., as \widehat{\delta}_i = \frac{\widehat{\varepsilon}_i}{\widehat{\sigma}_{\varepsilon_i}}, with \widehat{\sigma}_{\varepsilon_i} the square root of the variance predicted by, e.g., a cubic polynomial variance prediction function \widehat{\sigma}^2_{\varepsilon_i} = \widehat{\gamma}_0 + \widehat{\gamma}_1 \: \widehat{Y}_i + \widehat{\gamma}_2 \: \widehat{Y}_i^2 + \widehat{\gamma}_3 \: \widehat{Y}_i^3 when the mean structure of the model contains quantitative independent variables). The default behaviour of the plot() function can be overruled using the Assume.Homoscedasticity argument. For example, when adding the argument Assume.Homoscedasticity=TRUE to the function call, the standardized residuals that are plotted will be computed based on the overall residual standard error (irrespective of the result of the Levene or Breusch-Pagan test).
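The two ways of standardizing the residuals can be sketched in base R as follows. This is an illustration only and not the NormData implementation: the simulated data, all object names (fit, var_fit, delta_homo, delta_hetero), and the choice to estimate the variance function by regressing the squared residuals on the predicted values with a linear polynomial are assumptions made for the example.

```r
# Minimal sketch (base R only); not the NormData implementation.
set.seed(1)
n <- 500
x <- runif(n, 20, 80)                            # hypothetical quantitative predictor
y <- 10 + 0.5 * x + rnorm(n, sd = 4 + 0.05 * x)  # residual spread grows with x

fit   <- lm(y ~ x)
e_hat <- resid(fit)    # estimated residuals
y_hat <- fitted(fit)   # model-predicted means

# (a) Homoscedasticity assumed: divide by the overall residual standard error
sigma_overall <- summary(fit)$sigma
delta_homo    <- e_hat / sigma_overall

# (b) Homoscedasticity not assumed: predict the residual variance from the
#     predicted values (linear polynomial here, for simplicity) and divide
#     each residual by the square root of its predicted variance
var_fit      <- lm(I(e_hat^2) ~ y_hat)
var_pred     <- pmax(fitted(var_fit), .Machine$double.eps)  # guard against non-positive predictions
delta_hetero <- e_hat / sqrt(var_pred)

# Both sets of standardized residuals should have a spread close to 1
c(sd_homo = sd(delta_homo), sd_hetero = sd(delta_hetero))
```

In both cases the residual is divided by a standard deviation (not a variance), so that the standardized residuals are on a unit-variance scale and can be compared with the standard normal distribution.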

Add.Jitter

The amount of jitter (random noise) that should be added to the predicted values shown on the X-axis of the homoscedasticity plots. Adding a bit of jitter avoids overlapping data points and so shows the data more clearly, especially when there are only a few unique predicted values (e.g., when a binary or non-binary qualitative independent variable is considered in the mean structure of the model). The value specified for Add.Jitter= determines the range of the jitter that is added. For example, when Add.Jitter=0.1, a random value between -0.1 and 0.1 (sampled from a uniform distribution) is added to each predicted value in the homoscedasticity plots. Default Add.Jitter=0, i.e., no jitter is added.
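The effect of the jitter can be sketched in base R as follows. The predicted values are hypothetical, and the use of runif() after set.seed() is an assumption about how such jitter could be generated, not a description of the package internals.

```r
# Minimal sketch of uniform jitter on predicted values (base R only).
predicted  <- rep(c(4.2, 5.1), times = c(6, 4))  # hypothetical predicted means for two groups
Add.Jitter <- 0.1
Seed       <- 123

set.seed(Seed)  # fixing the seed makes the jittered positions reproducible
jittered <- predicted +
  runif(length(predicted), min = -Add.Jitter, max = Add.Jitter)

# Every jittered value stays within Add.Jitter of its original value
range(jittered - predicted)
```

Only the plotted X-axis positions are affected by jitter of this kind; the residuals on the Y-axis are unchanged.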

Seed

The seed that is used when adding jitter. Default Seed=123.

Confidence.QQ.Normality

Specifies the desired confidence level for the confidence band around the line of perfect agreement/normality in the QQ-plot that is used to evaluate normality. Default Confidence.QQ.Normality=0.99 (see Usage). Use Confidence.QQ.Normality=FALSE if no confidence band is needed.
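To illustrate what such a confidence band represents, the following base-R sketch draws a pointwise band around the line of perfect agreement using the usual large-sample standard error of normal order statistics. The simulated residuals, the object names, the confidence level of 0.90, and this particular way of constructing the band are assumptions for illustration; the band drawn by the plot() method may be computed differently.

```r
# Minimal sketch of a pointwise confidence band in a normal QQ-plot (base R only).
set.seed(42)
z <- sort(rnorm(100))   # hypothetical standardized residuals, sorted
n <- length(z)
p <- ppoints(n)         # plotting positions
q <- qnorm(p)           # theoretical standard normal quantiles

conf <- 0.90                        # hypothetical confidence level
crit <- qnorm(1 - (1 - conf) / 2)   # two-sided critical value

# Approximate standard error of the i-th order statistic under normality
se    <- sqrt(p * (1 - p) / n) / dnorm(q)
lower <- q - crit * se
upper <- q + crit * se

plot(q, z, xlab = "Theoretical quantiles", ylab = "Standardized residuals")
abline(0, 1)               # line of perfect agreement
lines(q, lower, lty = 2)   # lower limit of the band
lines(q, upper, lty = 2)   # upper limit of the band
```

A higher confidence level gives a wider band, so fewer points fall outside it.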

Plots.Together

The different homoscedasticity and normality plots are grouped together in a panel by default. For example, the three normality plots are shown together in one panel. If it is preferred to have the different plots in separate panels (rather than grouped together), the argument Plots.Together=FALSE can be used. Default Plots.Together=TRUE.

Y.Lim.ResVarFunction

The minimum and maximum limits of the Y-axis that should be used for the variance function plot. By default, the Y-axis ranges from 0 to twice the maximum of the estimated variances. This can be changed using the Y.Lim.ResVarFunction argument. For example, adding the argument Y.Lim.ResVarFunction=c(0, 500) sets the range of the Y-axis of the variance function plot from 0 to 500.

Group.Spec.Densities.Delta

Logical. Should a plot with the group-specific densities of the standardized residuals be shown? Default Group.Spec.Densities.Delta=FALSE.

Main.Homosced.1

The title of the first panel of the homoscedasticity plot (i.e., the scatterplot of the residuals against the predicted scores).

Main.Homosced.2

The title of the second panel of the homoscedasticity plot (i.e., the variance function plot).

Main.Norm.1

The title of the first panel of the normality plot (i.e., the histogram of the standardized residuals).

Main.Norm.2

The title of the second panel of the normality plot (i.e., the density of the standardized residuals and standard normal distribution).

Main.Norm.3

The title of the third panel of the normality plot (i.e., the QQ-plot).

Main.Outliers

The title of the outlier plot.

cex.axis.homo

The magnification to be used for axis annotation of the homoscedasticity plots.

cex.main.homo

The magnification to be used for the main label of the homoscedasticity plots.

cex.lab.homo

The magnification to be used for the X- and Y-axis labels of the homoscedasticity plots.

cex.axis.norm

The magnification to be used for axis annotation of the normality plots.

cex.main.norm

The magnification to be used for the main label of the normality plots.

cex.lab.norm

The magnification to be used for the X- and Y-axis labels of the normality plots.

cex.axis.outl

The magnification to be used for axis annotation of the outlier plot.

cex.main.outl

The magnification to be used for the main label of the outlier plot.

cex.lab.outl

The magnification to be used for X- and Y-axis labels of the outlier plot.

Color

The color to be used for the Empirical Variance Function (EVF) in the variance function plot, and for the standard normal distribution in the normality plot that shows the density of the standardized residuals together with the normal density. Default Color="red".

Loess.Span

The parameter \alpha that determines the degree of smoothing of the EVF that is shown in the variance function plot. Default Loess.Span=0.75.
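The role of the span can be illustrated with the base-R loess() function. In this sketch, the simulated data, the object names, and the choice to smooth the squared residuals against the predicted values are assumptions for illustration; they are not taken from the package source.

```r
# Minimal sketch of how the loess span controls smoothness (base R only).
set.seed(7)
n     <- 300
y_hat <- runif(n, 10, 50)                 # hypothetical predicted values
e_hat <- rnorm(n, sd = 1 + 0.05 * y_hat)  # hypothetical residuals
dat   <- data.frame(y_hat = y_hat, sq_res = e_hat^2)

# Evaluate the smoothed curve on a grid inside the observed range
grid <- data.frame(y_hat = seq(12, 48, length.out = 50))

evf_default <- predict(loess(sq_res ~ y_hat, data = dat, span = 0.75),
                       newdata = grid)
evf_wiggly  <- predict(loess(sq_res ~ y_hat, data = dat, span = 0.25),
                       newdata = grid)

# Roughness measured as the mean absolute second difference of the curve
roughness <- function(v) mean(abs(diff(v, differences = 2)))
c(span_0.75 = roughness(evf_default), span_0.25 = roughness(evf_wiggly))
```

A larger span uses a larger share of the data for each local fit and therefore gives a smoother curve; a smaller span follows local fluctuations more closely.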

verbose

A logical value indicating whether verbose output should be generated. Default verbose=TRUE.

...

Other arguments to be passed.

Value

No return value, called for side effects.

Author(s)

Wim Van der Elst

References

Van der Elst, W. (2024). Regression-based normative data for psychological assessment: A hands-on approach using R. Springer Nature.

Examples

# Replicate the Stage 1 results that were obtained in 
# Case study 1 of Chapter 4 in Van der Elst (2024)
# ---------------------------------------------------
library(NormData)   # load the NormData package
data(GCSE)          # load the GCSE dataset

# Conduct the Stage 1 analysis
Model.1.GCSE <- Stage.1(Dataset=GCSE, 
  Model=Science.Exam~Gender)

summary(Model.1.GCSE)
plot(Model.1.GCSE, Add.Jitter = .2)

# Use blue color for EVF and density normal distribution
plot(Model.1.GCSE, Add.Jitter = .2, Color="blue")

# Change the title of the variance function plot into
# "Variance function plot, residuals Science exam"
plot(Model.1.GCSE, Add.Jitter = .2, 
  Main.Homosced.2 = "Variance function plot, residuals Science exam")

# Use a 95 percent CI around the line of perfect agreement in the
# QQ plot of normality
plot(Model.1.GCSE, Add.Jitter = .2, 
     Confidence.QQ.Normality = .95)


# Replicate the Stage 1 results that were obtained in 
# Case study 1 of Chapter 7 in Van der Elst (2024)
# ---------------------------------------------------
library(NormData)   # load the NormData package
data(Substitution)  # load the Substitution dataset

# Add the variable Age.C (= Age centered) to the Substitution dataset
Substitution$Age.C <- Substitution$Age - 50

# Fit the final Stage 1 model
Substitution.Model.9 <- Stage.1(Dataset=Substitution, 
   Alpha=0.005, Model=LDST~Age.C+LE,
   Order.Poly.Var=1) # Order.Poly.Var=1 specifies a linear polynomial
                     # for the variance prediction function

# Final Stage 1 model
summary(Substitution.Model.9)
plot(Substitution.Model.9) 

# Request a variance function plot that assumes that 
# the homoscedasticity assumption is valid
plot(Substitution.Model.9, Assume.Homoscedasticity = TRUE) 

[Package NormData version 1.1 Index]