R: Takes a formula and constructs a 'pstructure.object' that...

pstructure.formula {dae}

R Documentation

Takes a formula and constructs a `pstructure.object` that includes the orthogonalized projectors for the terms in a formula

Description

Constructs a pstructure.object that includes a set of mutually orthogonal projectors, one for each term in the formula. These are used to specify a structure, or an orthogonal decomposition of the data space. There are three methods available for orthogonalizing the projectors corresponding to the terms in the formula: differencing, eigenmethods or the default hybrid method.

It is possible to use this function to find out what sources are associated with the terms in a model and to determine the marginality between terms in the model. The marginality matrix can be saved.

Usage

## S3 method for class 'formula'
pstructure(formula, keep.order = TRUE, grandMean = FALSE, 
           orthogonalize = "hybrid", labels = "sources", 
           marginality = NULL, check.marginality = TRUE, 
           omit.projectors = FALSE, 
           which.criteria = c("aefficiency","eefficiency","order"), 
           aliasing.print = TRUE, data = NULL, ...)

Arguments

`formula`	An object of class `formula` from which the terms will be obtained.
`keep.order`	A `logical` indicating whether the terms should keep their position in the expanded `formula`, or be reordered so that main effects precede two-factor interactions, which precede three-factor interactions and so on.
`grandMean`	A `logical` indicating whether the projector for the grand mean is to be included in the set produced.
`orthogonalize`	A `character` vector indicating the method for orthogonalizing a projector to those for terms that occurred previously in the formula. Three options are available: `hybrid`; `differencing`; `eigenmethods`. The `hybrid` option is the most general and uses the relationships between the projection operators for the terms in the `formula` to decide which `projector`s to subtract and which to orthogonalize using eigenmethods. The `differencing` option subtracts, from the current `projector`, those previously orthogonalized `projector`s for terms whose factors are a subset of the current `projector`'s factors. The `eigenmethods` option recursively orthogonalizes the `projector`s using an eigenanalysis of each `projector` with previously orthogonalized `projector`s.
`labels`	A `character` nominating the type of labels to be used in labelling the projectors, and which will also be used in the output tables, such as the tables of the aliasing in the structure. The two alternatives are `terms` and `sources`. Terms have all their factors/variables separated by colons (`:`). Sources have the factors/variables that represent interactions separated by hashes (`#`); if some factors are nested within others, the nesting factors are surrounded by square brackets (`[` and `]`) and separated by colons (`:`). If some generalized, or combined, factors have no marginal terms, the constituent factors are separated by colons (`:`) and, if they interact with other factors in the source, they will be parenthesized.
`marginality`	A square `matrix` that can be used to supply the marginality `matrix` when it is desired to overwrite the calculated marginality `matrix` or when it is not being calculated. It should consist of zeroes and ones that give the marginalities of the terms in the formula. It must have the row and column names set to the terms from the expanded `formula`, in the same order as these terms. The entry in the ith row and jth column will be one if the ith term is marginal to the jth term, i.e. the column space of the ith term is a subspace of that for the jth term, and so the source for the jth term is to be made orthogonal to that for the ith term. Otherwise, the entries are zero. A row and column should not be included for the grand mean even if `grandMean` is `TRUE`.
`check.marginality`	A `logical` indicating whether the marginality matrix, when it is supplied, is to be checked against that computed by `pstructure.formula`. It is ignored when `orthogonalize` is set to `eigenmethods`.
`omit.projectors`	A `logical`, which, if `TRUE`, results in the `projector`s in the `Q` of the `pstructure.object` being replaced by their degrees of freedom. These will be the degrees of freedom of the sources. This option is included as a device for saving storage when the `projector`s are not required for further analysis.
`which.criteria`	A character `vector` nominating the efficiency criteria to be included in the summary of aliasing between terms. It can be `none`, `all` or some combination of `aefficiency`, `mefficiency`, `sefficiency`, `eefficiency`, `xefficiency`, `order` and `dforthog` – for details see `efficiency.criteria`. If `none`, no summary is printed.
`aliasing.print`	A `logical` indicating whether the aliasing between sources within the structure is to be printed.
`data`	A data frame that contains the values of the factors and variables that occur in `formula`.
`...`	Further arguments passed to `terms`.
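
To make the layout expected for the `marginality` argument concrete, the following base R sketch builds such a matrix for the structure `~ Block/Unit`, whose expanded terms are `Block` and `Block:Unit`. The factor names are taken from the Examples section. Setting the diagonal entries to one is an assumption that follows from the subspace definition given above (a column space is a subspace of itself); it is not confirmed by this page.

```r
# Terms of the expanded formula, in formula order
term.labels <- attr(terms(~ Block/Unit), "term.labels")

# Entry [i, j] is 1 when the i-th term is marginal to the j-th term.
# Block is marginal to Block:Unit, but not the reverse.
# Diagonal entries are set to 1 (an assumption, see the text above).
marg <- matrix(c(1, 1,
                 0, 1),
               nrow = 2, byrow = TRUE,
               dimnames = list(term.labels, term.labels))
marg
```

A matrix of this form would then be passed through the `marginality` argument, with no row or column for the grand mean.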

Details

Firstly, the primary projector \mathbf{X}(\mathbf{X}'\mathbf{X})^{-}\mathbf{X}', where \mathbf{X} is the design matrix for the term, is calculated for each term. Then each projector is made orthogonal to the terms aliased with it using porthogonalize.list, by differencing, by eigenmethods or by the default hybrid method.
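
As a rough illustration of this first step, the following base R sketch builds the primary projector for a single factor. The factor Block and its layout (6 blocks of 4 units) are assumptions made for the example, and the code uses only base R rather than the dae functions. Because the indicator matrix has full column rank here, solve() stands in for the generalized inverse.

```r
# Assumed layout: 6 blocks of 4 units, in standard order
Block <- factor(rep(1:6, each = 4))

# Indicator (design) matrix for the term Block: one column per level
X <- model.matrix(~ 0 + Block)

# Primary projector X (X'X)^- X'; X has full column rank,
# so the ordinary inverse serves as the generalized inverse
P.B <- X %*% solve(crossprod(X)) %*% t(X)

# A projector is symmetric and idempotent; its trace equals its rank
all.equal(P.B, t(P.B))
all.equal(P.B %*% P.B, P.B)
sum(diag(P.B))
```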

Differencing relies on comparing the factors involved in two terms, one previous to the other, to identify whether to subtract the orthogonalized projector for the previous term from the primary projector of the other. It does so if the factors/variables for the previous term are a subset of the factors/variables for the other term. This relies on ensuring that all projectors whose factors/variables are a subset of those of the current projector occur before it in the expanded formula. The set of matrices is checked for mutual orthogonality and, if they are not mutually orthogonal, a warning is given. It may happen that differencing does not produce a projector, in which case eigenmethods must be used.
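
The differencing idea can be sketched in base R for the nested structure Block/Unit. This is a minimal illustration under assumed factor names and layout (6 blocks of 4 units), with a hypothetical helper meanop() that is not part of dae; it mirrors the manual construction in the Examples section rather than the internal code of pstructure.formula.

```r
n <- 24
Block <- factor(rep(1:6, each = 4))
Unit  <- factor(rep(1:4, times = 6))

# Hypothetical helper: primary projector (averaging operator) for a factor
meanop <- function(f) {
  X <- model.matrix(~ 0 + f)
  X %*% solve(crossprod(X)) %*% t(X)
}

M.G  <- matrix(1 / n, n, n)               # grand mean
M.B  <- meanop(Block)                     # primary projector for Block
M.BU <- meanop(interaction(Block, Unit))  # primary projector for Block:Unit

# Subtract the orthogonalized projectors of earlier terms whose
# factors are a subset of those of the current term
Q.B  <- M.B - M.G
Q.BU <- M.BU - Q.B - M.G

# The results are mutually orthogonal projectors
max(abs(Q.B %*% Q.BU))
all.equal(Q.BU %*% Q.BU, Q.BU)
c(Block = sum(diag(Q.B)), "Unit[Block]" = sum(diag(Q.BU)))  # degrees of freedom
```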

Eigenmethods forces each projector to be orthogonal to all terms previous to it in the expanded formula. It uses equation 4.10 of James and Wilkinson (1971), which involves calculating the canonical efficiency factors for pairs of primary projectors. It produces a table of efficiency criteria for partially aliased terms. Again, the order of terms is crucial. This method has the disadvantage that the marginality of terms is not determined and so source names are set to be the same as the term names, unless a marginality matrix is supplied.
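
To show what canonical efficiency factors for a pair of projectors look like, the following base R sketch computes them as the nonzero eigenvalues of the product \mathbf{Q}_T\mathbf{Q}_{BU}\mathbf{Q}_T, using the treatment allocation from the Examples section in its systematic (unrandomized) order. It illustrates only the efficiency factors, not equation 4.10 itself, and the helper meanop() and the zero threshold of 1e-8 are assumptions made for the example.

```r
n <- 24
Block <- factor(rep(1:6, each = 4))
trt <- factor(c(1,4,2,5, 2,5,3,6, 3,6,1,4, 4,1,5,2, 5,2,6,3, 6,3,4,1))

# Hypothetical helper: primary projector (averaging operator) for a factor
meanop <- function(f) {
  X <- model.matrix(~ 0 + f)
  X %*% solve(crossprod(X)) %*% t(X)
}

M.G  <- matrix(1 / n, n, n)
Q.BU <- diag(n) - meanop(Block)  # units within blocks
Q.T  <- meanop(trt) - M.G        # treatments, adjusted for the grand mean

# Canonical efficiency factors: nonzero eigenvalues of Q.T Q.BU Q.T
ev  <- eigen(Q.T %*% Q.BU %*% Q.T, symmetric = TRUE,
             only.values = TRUE)$values
eff <- ev[ev > 1e-8]             # drop numerically zero eigenvalues
eff
```

An efficiency factor of one indicates a treatment contrast that is orthogonal to blocks, while a value strictly between zero and one indicates a contrast that is partially aliased with blocks.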

The hybrid method is the most general and uses the relationships between the projection operators for the terms in the formula to decide which projectors to subtract and which to orthogonalize using eigenmethods. If \mathbf{Q}_i and \mathbf{Q}_j are two projectors for two different terms, with i < j, then

if \mathbf{Q}_j\mathbf{Q}_i \neq \mathbf{0} then \mathbf{Q}_j has to be orthogonalized to \mathbf{Q}_i.
if \mathbf{Q}_j\mathbf{Q}_i = \mathbf{Q}_j then, if \mathbf{Q}_i = \mathbf{Q}_j, they are equal and \mathbf{Q}_j will be removed from the list of terms; otherwise they are marginal and \mathbf{Q}_i is subtracted from \mathbf{Q}_j.
if \mathbf{Q}_j has to be orthogonalized and \mathbf{Q}_j\mathbf{Q}_i = \mathbf{Q}_i then \mathbf{Q}_j is aliased with previous terms and will be removed from the list of terms; otherwise \mathbf{Q}_i is partially aliased with \mathbf{Q}_j and \mathbf{Q}_j is orthogonalized to \mathbf{Q}_i using eigenmethods.

The order of terms is crucial in this process.

Of the three methods, eigenmethods is least likely to fail, but it does not establish the marginality between the terms. It is often needed when there is nonorthogonality between terms, such as when there are several linear covariates. It can also be more efficient in these circumstances.

The process can be computationally expensive, particularly for a large data set (500 or more observations) and/or when many terms are to be orthogonalized.

If the error "Matrix is not idempotent" occurs then, especially if there are many terms, one might try using set.daeTolerance to reduce the tolerance used in determining if values are either the same or are zero; it may be necessary to lower the tolerance to as low as 0.001. Also, setting orthogonalize to eigenmethods is worth a try.

Value

A pstructure.object.

Author(s)

Chris Brien

References

James, A. T. and Wilkinson, G. N. (1971) Factorization of the residual operator and canonical decomposition of nonorthogonal factors in the analysis of variance. Biometrika, 58, 279-294.

Examples

## PBIBD(2) from p. 379 of Cochran and Cox (1957) Experimental Designs. 
## 2nd edn Wiley, New York
PBIBD2.unit <- list(Block = 6, Unit = 4)
PBIBD2.nest <- list(Unit = "Block")
trt <- factor(c(1,4,2,5, 2,5,3,6, 3,6,1,4, 4,1,5,2, 5,2,6,3, 6,3,4,1))
PBIBD2.lay <- designRandomize(allocated = trt, 
                              recipient = PBIBD2.unit, 
                              nested.recipients = PBIBD2.nest)
## manually obtain projectors for units
Q.G <- projector(matrix(1, nrow=24, ncol=24)/24)                         
Q.B <- projector(fac.meanop(PBIBD2.lay$Block) - Q.G)
Q.BP <- projector(diag(1, nrow=24) - Q.B - Q.G)

## manually obtain projector for trt
Q.T <- projector(fac.meanop(PBIBD2.lay$trt) - Q.G)

## compute intrablock efficiency criteria
effic <- proj2.efficiency(Q.BP, Q.T)
effic
efficiency.criteria(effic)

## obtain projectors using pstructure.formula
unit.struct <- pstructure(~ Block/Unit, data = PBIBD2.lay)
trt.struct <- pstructure(~ trt, data = PBIBD2.lay)

## obtain combined decomposition and summarize
unit.trt.p2canon <- projs.2canon(unit.struct$Q, trt.struct$Q)
summary(unit.trt.p2canon, which = c("aeff","eeff","order"))

[Package dae version 3.2.28 Index]
