strucplot {stripless} | R Documentation |
Structured Trellis Plots
Description
Structured Trellis Plots Without Strip Labels
Usage
strucplot(obj, ...)
## S3 method for class 'formula'
strucplot(obj, data = list(), groups = NULL,
xyLayout = list(), spacings = list(x = 0:9, y = 0:9), center = FALSE,
...)
## Default S3 method:
strucplot(...)
## S3 method for class 'data.frame'
strucplot(obj, col = "darkblue", groups = NULL, ...)
## S3 method for class 'list'
strucplot(obj, ...)
## S3 method for class 'matrix'
strucplot(obj, ...)
## S3 method for class 'lm'
strucplot(obj, newdata = model.frame(obj)[-1],
ylab = "Predicted Response", predictArgs = NULL, ...)
Arguments
obj |
Argument determining method dispatch. For the formula method, a
Note 1: Extended formulas and 3-d formulas (see
Note 2: For the lm method, the model object should contain a model
component. See It is preferable that this be the first argument of the call. |
data |
For the formula method, a data frame (or more precisely, anything that is a valid envir argument in eval, e.g., a list or an environment) containing values for any variables in the formula, as well as groups and subset if applicable. If not found in data, or if data is unspecified, the variables are looked for in the environment of the formula. |
groups |
The |
xyLayout |
In its most general form, a list with named "x" and "y"
components, either or both of which can be missing (or NULL). If there are
n conditioning factors and both x and y are given, then
the combined components (e.g. via The integers in x specify the indices of the conditioning factors and their levels in the x direction hierarchy. Correspondingly for y. For a fuller explanation of how this controls the plot layout, see the Overview section and the examples. For convenience, the xyLayout argument can be specified in several other ways. The basic idea is that the full list will be filled in "appropriately" if possible. Specifically, this means:
As a result, all the following would produce exactly the same layout for n = 5 conditioning factors when used as the xyLayout argument:
Note that arbitrary integer vectors can be used, not just (ordered) sequences; e.g. the following are correct and equivalent when there are 6 conditioning factors:
But be careful: |
spacings |
A list with x and y components that are nondecreasing
sequences of nonnegative values. The ith value in the sequence gives the spacing
in character heights between the sets of panels at each level for the ith
conditioning variable in the component's direction. If either a single
vector or a list with a single component is given, it is replicated. See the
Overview for further explanation. Default is list(x = 0:9, y = 0:9). |
center |
Logical, default FALSE. If the conditioning factors constitute a 2-level design with a center point and center is TRUE, a compact trellis plot that omits panels for the missing settings of the factors is drawn. If FALSE or the design is not of this form, all panels are shown, which may be informative in any case to better visualize the design sparsity. |
col |
Fill color for the data frame method. |
newdata |
For the |
ylab |
For the |
predictArgs |
For the |
... |
Further arguments to panel functions, methods, or
|
Details
A structured display is a trellis display without strip labels. The factors and levels are encoded by plot layout and spacing and decoded by a separate legend. See the Overview below for a detailed explanation.
Value
The formula method returns an object of class c("structured", "trellis"), or c("doe", "structured", "trellis") for 2-level designs with a center point. This is a trellis object with additional structure and formula attributes. The lm method returns an object of class c("modelFit", "structured", "trellis"). At present, this extension is ignored, but future print/plot methods may take advantage of it. The other methods return the same object as the formula method.
The structure attribute provides xyLayout information. It is a list with x and y components, each of which in turn is a list, one of which may be empty (i.e. of length 0). The components of the x list have names of the conditioning variables in the horizontal direction of the layout with values the variable values; analogously for the y list.
The formula attribute gives the actual formula used in the trellis call, i.e. with the actual conditioning variable names substituted for ".".
Methods (by class)
- formula: Formula method
- default: Default method; prints an error message
- data.frame: Data frame method with no response, to show design structure
- list: Converts to a data frame and calls the data.frame method
- matrix: Converts to a data frame and calls the data.frame method
- lm: Plots predicted values for fitted models inheriting from class "lm"
Overview
The trellis display paradigm breaks down when there are more than 2 or 3 conditioning variables, because the plotting area becomes cluttered with multiple layers of strip labels identifying the panel settings. This is especially a problem if one wants to show the structure or results of studies that are (fractions of) factorial designs with the design factors as conditioning variables. For example, in industrial type experiments, it is not uncommon to have 5 or more experimental factors.
It is also often the case that there are multiple responses – e.g. several different characteristics of a product or a functional response like an IR spectrum, MRI scan, or surface plot. It can be desirable to display such results so that direct visual comparison of the complex responses can be made.
The strucplot function enables such functionality by omitting the strip labels and laying out the panels in a regular array, an 'xyLayout' in which the position of the panels rather than their strip labels identifies the variable settings.
Because the lattice package already has these capabilities, strucplot is implemented as a wrapper to lattice's xyplot function. This provides a convenient interface, simplifies the code and, most importantly, allows the user to access all of lattice's (and the underlying grid graphics') functionality in the usual way. Only two extra arguments – 'xyLayout' and 'spacings' – are used to do this, although the default 'spacings' argument normally need not be changed. (There is also a third 'center' argument for 2 level factorial designs, which are commonly used in industrial experiments, that is explained below.)
How does strucplot() work?
Suppose that the data consist of a numeric vector response y for 4 conditioning factors, f1, f2, f3, and f4, where f1 and f2 each have 2 levels, f3 has 3 levels, and f4 has 4 levels. (Because the conditioning variables are coerced to factors by factor, if level orderings other than those given by these coercions are wanted, the user should set them explicitly before calling strucplot.)
Then the call strucplot(~y|f1*f2*f3*f4) would produce a trellis plot without strip labels with 12 (= 3 x 4) rows and 4 (= 2 x 2) columns of panels, some of which may be empty if the corresponding factor settings are missing. The default xyLayout argument that produces this is xyLayout = list(x = 1:2, y = 3:4). It splits the conditioning variables as evenly as possible into 2 groups with the x component getting the first 2 variables, f1 and f2, and the y component getting the second 2 variables, f3 and f4. (If there is an odd number of variables, the x component gets one more variable than the y.) This means the levels of the x variables, f1 and f2, vary from left to right across each row. For the y variables, if as.table = TRUE, the default, the levels of the y variables, f3 and f4, vary from top to bottom down each column; otherwise, if as.table = FALSE, from bottom to top up each column. Since there are 4 combinations of levels for the x variables and 12 for y, this gives a 12 row by 4 column display.
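The default split described above can be sketched in base R. The helper name default_xy_layout is hypothetical and is not part of the stripless package; it only mirrors the stated rule (x gets the first half of the conditioning variables, plus the extra one when the count is odd), and the level counts are those of the f1 to f4 example.

```r
# Hypothetical helper, not part of stripless: mirrors the documented default split.
default_xy_layout <- function(n) {
  n_x <- as.integer(ceiling(n / 2))   # x gets the extra variable when n is odd
  list(x = seq_len(n_x), y = seq_len(n - n_x) + n_x)
}

nlev <- c(f1 = 2, f2 = 2, f3 = 3, f4 = 4)   # number of levels per conditioning factor
lay <- default_xy_layout(length(nlev))      # x = 1:2, y = 3:4
n_cols <- prod(nlev[lay$x])                 # combinations of the x variables
n_rows <- prod(nlev[lay$y])                 # combinations of the y variables
```

With these level counts, n_rows is 12 and n_cols is 4, matching the 12 row by 4 column display.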
The panels are displayed in each direction in reverse lexicographic order, where the 'alphabets' are the factor levels. This means that the first variable changes the fastest, the second the next fastest, and so on. Using (i,j) to denote the setting in which the first factor is at the ith level and the second is at the jth, this translates to (for as.table = TRUE):
- Row ordering (f1, f2): (1,1), (2,1), (1,2), (2,2)
- Column ordering from top down (f3, f4): (1,1), (2,1), (3,1), (1,2), (2,2), ... , (1,4), (2,4), (3,4)
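This is the same ordering that base R's expand.grid() produces, since it varies its first argument fastest, so the sequences can be reproduced as a quick check. The snippet only illustrates the ordering; it is not a claim about how strucplot computes it internally.

```r
# expand.grid() varies its first argument fastest.
x_order <- expand.grid(f1 = 1:2, f2 = 1:2)   # 4 settings across each row
y_order <- expand.grid(f3 = 1:3, f4 = 1:4)   # 12 settings down each column

# Settings written as "(i,j)", in display order
x_labels <- paste0("(", x_order$f1, ",", x_order$f2, ")")
y_labels <- paste0("(", y_order$f3, ",", y_order$f4, ")")
```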
If one component is missing, if it is x, there will only be 1 column; if it is y, only 1 row. The nonmissing component must still be correctly specified to provide the panel ordering.
Panel spacing
Variable spacing between the panels hierarchically groups them to identify their settings. The default spacing for both x and y directions is 0:9. This means that panels corresponding to the first, fastest changing, variable are separated by 0 units (= character heights); groups of panels at each fixed level of the second, next fastest changing, variable are separated by 1 unit; groups of groups of panels at fixed levels of the third are separated by 2 units; and so forth.
For the example, this means that the spacing along each row would look like ('X' indicates a panel): XX XX. And for columns, going down, it would be: XXX XXX XXX XXX. The spacings can be different for x and y, but this is usually unnecessary.
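As a rough sketch of this rule, the gap between two adjacent panels in one direction can be taken as the spacing value for the slowest-changing variable whose level differs between them. The function panel_gaps below is hypothetical (it is not a stripless function) and assumes the reverse lexicographic panel order described in the Overview.

```r
# Hypothetical sketch, not part of stripless.
# nlev:    number of levels of the variables in one direction, fastest first
# spacing: spacing sequence for that direction (the default is 0:9)
panel_gaps <- function(nlev, spacing = 0:9) {
  n_panels <- prod(nlev)
  # panels per group at each level: 1 for the first variable, nlev[1] for the second, ...
  group_size <- c(1, cumprod(nlev))[seq_along(nlev)]
  vapply(seq_len(n_panels - 1), function(k) {
    level <- max(which(k %% group_size == 0))   # slowest variable that changes after panel k
    spacing[level]
  }, numeric(1))
}

row_gaps <- panel_gaps(c(2, 2))   # f1, f2 across each row
col_gaps <- panel_gaps(c(3, 4))   # f3, f4 down each column
```

Here row_gaps is 0, 1, 0 (the XX XX pattern) and col_gaps repeats 0, 0, 1 between the four groups of three panels.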
Effective xyLayout specification
The default layout is often enhanced by changing the order of the factors and the xyLayout; for example, ordering the factors from those with the least change among levels to the most, or vice-versa; or setting the "most important" factors along rows to facilitate visual comparison.
The order of the variables – and hence which vary in the x or y direction – is given both by the left to right order of the conditioning in the formula and the xyLayout argument. Thus, in the example, conditioning with ~|f3*f1*f2*f4 and setting the layout with xyLayout = list(x=4, y=3:1) is equivalent to ~|f4*f2*f1*f3 and xyLayout = list(x=1,y=2:4) and produces a display with 12 rows and 4 columns in which the row panels now correspond to the 4 f4 levels and the column panels to the levels of f2, f1, and f3. This redundancy is deliberate: it allows changing layouts via the xyLayout argument to avoid rewriting a formula with long names in a different order.
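The equivalence can be checked by mapping the xyLayout indices onto the conditioning variables in formula order. The helper layout_vars is hypothetical and is used only for this illustration.

```r
# Hypothetical helper: look up the conditioning variables selected by each component.
layout_vars <- function(cond, xyLayout) lapply(xyLayout, function(i) cond[i])

a <- layout_vars(c("f3", "f1", "f2", "f4"), list(x = 4, y = 3:1))
b <- layout_vars(c("f4", "f2", "f1", "f3"), list(x = 1, y = 2:4))
same_layout <- identical(a, b)
```

Both specifications put f4 in the x direction and f2, f1, f3 (in that order) in the y direction, so same_layout is TRUE.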
2 level designs with a center point – the 'center' argument
Finally, the 'center' argument, a logical with default = FALSE, controls the display when the conditioning factors are arranged as a 2 level factorial design with a single "pseudo"-center point. "pseudo" here means that the settings of the factors at the center need not be exactly in the middle for numeric factors. If the design is not of this form, the 'center' argument is ignored.
For such designs, when center = TRUE, a more compact display will be drawn in which a center panel corresponding to the center point is shown as the single panel in its row and column, but all other empty panels, corresponding to settings where some of the conditioning variables are at their mid levels and some are not, are omitted. Examples are given below.
To show all the empty panels, which can be useful to informatively represent the actual design sparsity, leave center = FALSE (the default).
Note
Because 'xyLayout' and the number of levels in the conditioning variables determine the plot structure, a 'layout' argument in the call will be ignored. Other xyplot arguments that are ignored are 'skip', 'between', 'drop.unused.levels', 'strip', 'perm.cond', and 'index.cond'. All other xyplot arguments should work as expected.
See Also
Examples
# Compare standard xyplot vs. strucplot of quakes data in datasets package.
# Cut depth into 6 groups and magnitude into 5.
require(datasets)
# Note that as.table = TRUE is used to make depths increase
# down the page. For strucplot(), this is the default.
xyplot(lat ~ long|cut(mag,5)*cut(depth,6), data = quakes, col="blue",
as.table = TRUE,type = c("g","p"))
# Compare to:
strucplot(lat ~ long|cut(mag,5)*cut(depth,6), data = quakes, col="blue",
type = c("g", "p"))
# Visualizing designs:
# A half fraction of a 2^5 (a 2^(5-1)) design
# Build the design matrix
ff <- do.call(expand.grid,rep(list(c(-1,1)),4))
ff[[5]] <- do.call(mapply,c(FUN = prod,ff))
names(ff) <- LETTERS[1:5]
# Show the design
strucplot(ff)
# Plotting a 2 level design with a center point
# Add a center point to ff and plot
ffCenter <- rbind(ff,rep(0,5))
strucplot(ffCenter)
# Use center = TRUE for a more compact display and show legend below.
print(strucplot(ffCenter, center = TRUE),legendLoc = "bottom")
# The "npk" data. See help("npk") for details.
# Visualize design with blocks as the vertical factor and the rest horizontal
strucplot(npk[,-5], xyLay = list(x = 2:4, y =1))
# Plot the yield
strucplot(~yield |., xyLay = list(x=2:4, y=1),data = npk, col = "darkblue",
panel = function(...){
panel.grid(h = -1, v = 0)
panel.xyplot(...)}
)
# It may be more informative to plot bars instead of points.
# See help(panel.bars) for details.
#
# Note also "shortcut" ways to specify the xyLayout
strucplot(~yield |., xyLay = list(x=2:4),data = npk,
panel = panel.bars)
# Include a conditioning variable in the formula to reduce the
# dimensionality of conditioning. Show legend on right of plot.
print(strucplot(yield ~ N|., xyLay = 2:3, data = npk,
panel = panel.bars), legendLoc = "right")
# Use the horizontal = TRUE argument of panel.bars to plot the bars
# horizontally. The left and right hand sides of the formula must also
# be switched for 2-sided formulas (not for 1-sided).
strucplot( N ~ yield |., xyLay = list(y=1), data = npk,
panel = panel.bars, horizontal = TRUE)
# Fit a linear model with all main effects and 2 factor interactions in N,P,K
# and plot the fits, using the "newdata" argument to plot predictions at
# non-design points.
require("stats")
npk.aov <- aov(yield ~ block + (N+P+K)^2, data = npk)
full <- do.call(expand.grid,lapply(npk[,-5],levels))
plot(strucplot(npk.aov, xyLay = list(x = 2:4),panel = panel.bars,
newdata = full),legendLoc = "bottom")
# Compare to a grouped plot:
ypred <- predict(npk.aov, newdata = full)
plot(
strucplot(ypred ~ N|K*block, groups = full$P, data = full,
panel= function(x,y, groups, subscripts,cex=1.25,...){
panel.grid(h=-1, v=0)
panel.superpose(x,y,cex= cex, type = c("p","l"),...,
panel.groups = panel.xyplot,
groups=groups, subscripts =subscripts)},
auto.key = list(points=FALSE,lines=TRUE, columns = 2,
title = "P",cex.title=1), ylab = "Predicted Response" ),
legendLoc = "right")
## Cleanup
rm(full, npk.aov, ypred,ff,ffCenter)