predictPresent.asreml {asremlPlus}

R Documentation

Forms the predictions for each of one or more terms and presents them in tables and/or graphs.

Description

This function forms the predictions for each term in terms using a supplied asreml object and predictPlus.asreml. Tables are produced using predictPlus.asreml, in conjunction with allDifferences.data.frame, with the argument tables specifying which tables are printed. The argument plots, along with transform.power, controls which plots are produced. The plots are produced using plotPredictions.data.frame: line plots are produced when x.num or x.fac is involved in the classify for the predictions, and bar charts otherwise. To get the correct predictions, you may need to supply additional arguments to predict.asreml through ..., e.g. present, parallel, levels.

The order in which the levels of one of the factors indexing the predictions are plotted can be modified; this is achieved using sort.alldiffs.
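
As a minimal sketch of how an ordering of factor levels can be constructed, the following base R code applies the idiom `levels(f)[order(x)]` that is given under the `sortOrder` argument. The factor `f` and the values `x` are hypothetical, illustrative data; no asreml fit is involved and nothing here calls sort.alldiffs itself.

```r
# Hypothetical factor and one value per level (illustrative data only)
f <- factor(c("A", "B", "C", "D"))
x <- c(2.3, 1.1, 3.7, 0.4)   # assumed to be in the same order as levels(f)

# Levels arranged by increasing value of x
levs <- levels(f)[order(x)]

# Levels arranged by decreasing value of x
levs_dec <- levels(f)[order(x, decreasing = TRUE)]

print(levs)
print(levs_dec)
```

A vector such as `levs` has one entry per level of the factor, which is the form that the `sortOrder` argument is described as expecting.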

Usage

## S3 method for class 'asreml'
predictPresent(asreml.obj, terms, inestimable.rm = TRUE, 
               linear.transformation = NULL, EGLS.linTransform, 
               error.intervals = "Confidence", alpha = 0.05, 
               wald.tab = NULL, dDF.na = "residual", dDF.values = NULL, 
               pairwise = TRUE, Vmatrix = FALSE, 
               avsed.tolerance = 0.25, accuracy.threshold = NA, 
               LSDtype = "overall", LSDsupplied = NULL, LSDby = NULL, 
               LSDstatistic = "mean", LSDaccuracy = "maxAbsDeviation", 
               x.num = NULL, x.fac = NULL, nonx.fac.order = NULL,  
               x.pred.values = NULL, x.plot.values = NULL, 
               plots = "predictions", panels = "multiple", 
               graphics.device = NULL, interval.annotate = TRUE,
               titles = NULL, colour.scheme = "colour", save.plots = FALSE, 
               transform.power = 1, offset = 0, scale = 1, 
               transform.function = "identity", 
               tables = "all", level.length = NA, 
               sortFactor = NULL, sortParallelToCombo = NULL, 
               sortNestingFactor = NULL, sortOrder = NULL, 
               decreasing = FALSE, 
               trace = FALSE, ggplotFuncs = NULL, ...)

Arguments

`asreml.obj`	`asreml` object for a fitted model.
`terms`	A `character vector` giving the terms for which predictions are required.
`inestimable.rm`	A `logical` indicating whether rows for predictions that are not estimable are to be removed from the components of the `alldiffs.object`.
`linear.transformation`	A `formula` or a `matrix`. If a `formula` is given then it is taken to be a submodel of a model term corresponding to the `classify`. The projection matrix that transforms the `predictions` so that they conform to the submodel is obtained; the submodel does not have to involve variables in the `classify`, but the variables must be columns in the `predictions` component of `alldiffs.obj` and the space for the submodel must be a subspace of the space for the term specified by the `classify`. For example, for `classify` set to `"A:B"`, the submodel `~ A + B` will result in the `predictions` for the combinations of `A` and `B` being made additive for the `factors` `A` and `B`. The submodel space corresponding to `A + B` is a subspace of the space `A:B`. In this case both the submodel and the classify involve only the factors A and B. To fit an intercept-only submodel, specify `linear.transformation` to be the formula `~1`. If a `matrix` is provided then it will be used to apply the linear transformation to the `predictions`. It might be a contrast `matrix` or a `matrix` of weights for a factor used to obtain the weighted average over that factor. The number of rows in the `matrix` should equal the number of linear combinations of the `predictions` desired and the number of columns should equal the number of `predictions`. In either case, as well as the values of the linear combinations, their standard errors, pairwise differences and associated statistics are returned in the `alldiffs.object`.
`EGLS.linTransform`	A `logical` indicating whether or not the `linear.transformation` of the predictions stored in an `alldiffs.object` by fitting a submodel supplied in a `formula` is to take into account the variance of the predictions using an Estimated Generalized Least Squares (EGLS) approach. This is likely to be appropriate when the variance matrix of the predictions is not compound symmetric i.e. when not all the variances are equal or not all the covariances are equal. If the variance matrix is compound symmetric, then the setting of `EGLS.linTransform` will not affect the transformed predictions.
`error.intervals`	A `character` string indicating the type of error interval, if any, to calculate in order to indicate uncertainty in the results. Possible values are `"none"`, `"StandardError"`, `"Confidence"` and `"halfLeastSignificant"`. The default is for confidence limits to be used. The `"halfLeastSignificant"` option results in half the Least Significant Difference (LSD) being added to and subtracted from the predictions, the LSD being calculated using the square root of the mean of the variances of all or a subset of pairwise differences between the predictions. If the LSD is zero, as can happen when predictions are constrained to be equal, then the limits of the error intervals are set to `NA`. If `LSDtype` is set to `overall`, the `avsed.tolerance` is not `NA` and the range of the SEDs divided by the average of the SEDs exceeds `avsed.tolerance` then the `error.intervals` calculations and the plotting will revert to confidence intervals.
`alpha`	A `numeric` giving the significance level for LSDs or one minus the confidence level for confidence intervals. It is stored as an attribute to the `alldiffs.object`.
`wald.tab`	A `data.frame` containing the pseudo-anova table for the fixed terms produced by a call to `wald.asreml`. The main use of it here is in determining the degrees of freedom for calculating confidence or half-LSD `error.intervals` and p-values, the latter to be stored in the `p.differences` component of the `alldiffs.object` that is created.
`dDF.na`	The method to use to obtain approximate denominator degrees of freedom when the numeric or algebraic methods produce an `NA`. Consistent with when no denDF are available, the default is `"residual"` and so the residual degrees of freedom from `asreml.obj$nedf` are used. If `dDF.na = "none"`, no substitute denominator degrees of freedom are employed; if `dDF.na = "maximum"`, the maximum of those denDF that are available, excluding that for the Intercept, is used; if all denDF are `NA`, `asreml.obj$nedf` is used. If `dDF.na = "supplied"`, a `vector` of values for the denominator degrees of freedom is to be supplied in `dDF.values`. Any other setting is ignored and a warning message produced. Generally, substituting these degrees of freedom is anticonservative in that it is likely that the degrees of freedom used will be too large.
`dDF.values`	A `vector` of values to be used when `dDF.na = "supplied"`. Its values will be used when `denDF` in a test for a fixed effect is `NA`. This vector must be the same length as the number of fixed terms, including (Intercept) whose value could be `NA`.
`pairwise`	A logical indicating whether all pairwise differences of the `predictions` and their standard errors and p-values are to be computed and stored. If `tables` is equal to `"differences"` or `"all"` or `error.intervals` is equal to `"halfLeastSignificant"`, they will be stored irrespective of the value of `pairwise`.
`Vmatrix`	A `logical` indicating whether the variance matrix of the `predictions` will be stored as a component of the `alldiffs.object` that is returned. If `linear.transformation` is set, it will be stored irrespective of the value of `Vmatrix`.
`avsed.tolerance`	A `numeric` giving the value of the SED range, the range of the SEDs divided by the square root of the mean of the variances of all or a subset of the pairwise differences, that is considered reasonable in calculating `error.intervals`. It should be a value between 0 and 1. The following rules apply: If `avsed.tolerance` is `NA` then mean LSDs of the type specified by `LSDtype` are calculated and used in `error.intervals` and plots. Irrespective of the setting of `LSDtype`, if `avsed.tolerance` is not exceeded then the mean LSDs are used in `error.intervals` and plots. If `LSDtype` is set to `overall`, `avsed.tolerance` is not `NA`, and `avsed.tolerance` is exceeded then `error.intervals` and plotting revert to confidence intervals. If `LSDtype` is set to `factor.combinations` and `avsed.tolerance` is not exceeded for any factor combination then the half LSDs are used in `error.intervals` and plots; otherwise, `error.intervals` and plotting revert to confidence intervals. If `LSDtype` is set to `per.prediction` and `avsed.tolerance` is not exceeded for any prediction then the half LSDs are used in `error.intervals` and plots; otherwise, `error.intervals` and plotting revert to confidence intervals.
`accuracy.threshold`	A `numeric` specifying the value of the LSD accuracy measure, which measure is specified by `LSDaccuracy`, as a threshold value in determining whether the `halfLeastSignificant` `error.interval` for a predicted value is a reasonable approximation; this will be the case if the LSDs across all pairwise comparisons for which the interval's LSD was computed, as specified by `LSDtype` and `LSDby`, are similar enough to the interval's LSD, as measured by `LSDaccuracy`. If it is `NA`, it will be ignored. If it is not `NA`, a column of `logicals` named `LSDwarning` will be added to the `predictions` component of the `alldiffs.object`. The value of `LSDwarning` for a `predicted.value` will be `TRUE` if the value of the `LSDaccuracy` measure computed from the LSDs for differences between this `predicted.value` and the other `predicted.values` as compared to its `assignedLSD` exceeds the value of `accuracy.threshold`. Otherwise, the value of `LSDwarning` for a `predicted.value` will be `FALSE`.
`LSDtype`	A `character` string that can be `overall`, `factor.combinations`, `per.prediction` or `supplied`. It determines whether the values stored in a row of a `LSD.frame` are the values calculated (i) `overall` from the LSD values for all pairwise comparisons, (ii) the values calculated from the pairwise LSDs for the levels of each `factor.combination`, unless there is only one prediction for a level of the `factor.combination`, when a notional LSD is calculated, (iii) `per.prediction`, being based, for each prediction, on all pairwise differences involving that prediction, or (iv) as `supplied` values of the LSD, specified with the `LSDsupplied` argument; these supplied values are to be placed in the `assignedLSD` column of the `LSD.frame` stored in an `alldiffs.object` so that they can be used in LSD calculations. See `LSD.frame` for further information on the values in a row of this `data.frame` and how they are calculated.
`LSDsupplied`	A `data.frame` or a named `numeric` containing a set of `LSD` values that correspond to the observed combinations of the values of the `LSDby` variables in the `predictions.frame` or a single LSD value that is an overall LSD. If a `data.frame`, it may have (i) a column for the `LSDby` variable and a column of `LSD` values or (ii) a single column of `LSD` values with rownames being the combinations of the observed values of the `LSDby` variables. Any name can be used for the column of `LSD` values; `assignedLSD` is sensible, but not obligatory. Otherwise, a `numeric` containing the `LSD` values, each of which is named for the observed combination of the values of the `LSDby` variables to which it corresponds. (Applying the `function` `dae::fac.combine` to the `predictions` component is one way of forming the required combinations for the (row) names.) The values supplied will be incorporated into the `assignedLSD` column of the `LSD.frame` stored as the `LSD` component of the `alldiffs.object`.
`LSDby`	A `character` (vector) of variable names, being the names of the `factors` or `numerics` in the `classify`; for each combination of their levels and values, there will be or is a row in the `LSD.frame` stored in the `LSD` component of the `alldiffs.object` when `LSDtype` is `factor.combinations`.
`LSDstatistic`	A `character` nominating one or more of `minimum`, `q10`, `q25`, `mean`, `median`, `q75`, `q90` or `maximum` as the value(s) to be stored in the `assignedLSD` column in an `LSD.frame`; the values in the `assignedLSD` column are used in computing `halfLeastSignificant` `error.intervals`. Here `q10`, `q25`, `q75` and `q90` indicate the sample quantiles corresponding to probabilities of 0.1, 0.25, 0.75 and 0.9 for the group of LSDs from which a single LSD value is calculated. The function `quantile` is used to obtain them. The `mean` LSD is calculated as the square root of the mean of the squares of the LSDs for the group. The `median` is calculated using the `median` function. Multiple values are only produced for `LSDtype` set to `factor.combinations`, in which case `LSDby` must not be `NULL` and the number of values must equal the number of observed combinations of the values of the variables specified by `LSDby`. If `LSDstatistic` is `NULL`, it is reset to `mean`.
`LSDaccuracy`	A `character` nominating one of `maxAbsDeviation`, `maxDeviation`, `q90Deviation` or `RootMeanSqDeviation` as the statistic to be calculated as a measure of the accuracy of `assignedLSD`. The option `q90Deviation` produces the sample quantile corresponding to a probability of 0.90. The deviations are the differences between the LSDs used in calculating the LSD statistics and each assigned LSD and the accuracy is expressed as a proportion of the assigned LSD value. The calculated values are stored in the column named `accuracyLSD` in an `LSD.frame`.
`x.num`	A `character` string giving the name of the numeric covariate that (i) is potentially included in terms in the fitted model and (ii) is the x-axis variable for plots. Its values will not be converted to a `factor`.
`x.fac`	A `character` string giving the name of the factor that (i) corresponds to `x.num` and (ii) is potentially included in terms in the fitted model. It should have the same number of levels as the number of unique values in `x.num`. The levels of `x.fac` must be in the order in which they are to be plotted - if they are dates, then they should be in the form yyyymmdd, which can be achieved using `as.Date`. However, the levels can be non-numeric in nature, provided that `x.num` is also set.
`nonx.fac.order`	A `character vector` giving the order in which factors other than `x.fac` are to be plotted in plots with multiple panels (i.e. where the number of non-x factors is greater than 1). The first factor in the vector will be plotted on the X axis (if there is no `x.num` or `x.fac`). Otherwise, the order of plotting the factors is in columns (X facets) and then rows (Y facets). By default the order is in decreasing order for the numbers of levels of the non-x factors.
`x.pred.values`	The values of `x.num` for which predicted values are required.
`x.plot.values`	The actual values to be plotted on the x axis or in the labels of tables. They are needed when values different to those in `x.num` are to be plotted or `x.fac` is to be plotted because there is no `x.num` term corresponding to the same term with `x.fac`.
`plots`	Possible values are `"none"`, `"predictions"`, `"backtransforms"` and `"both"`. Plots are not produced if the value is `"none"`. If data are not transformed for analysis (`transform.power` = 1), a plot of the predictions is produced provided `plots` is not `"none"`. If the data are transformed, the value of `plots` determines what is produced.
`panels`	Possible values are `"single"` and `"multiple"`. When line plots are to be produced, because variables involving `x.num` or `x.fac` are involved in `classify` for the predictions, `panels` determines whether or not a single panel or multiple panels in a single window are produced. The `panels` argument is ignored for bar charts.
`graphics.device`	A `character` specifying a graphics device for plotting. The default is `graphics.device = NULL`, which will result in plots being produced on the current graphics device. Setting it to `"windows"`, for example, will result in a windows graphics device being opened.
`interval.annotate`	A `logical` indicating whether the plot annotation indicating the type of `error.interval` is to be included in the plot.
`titles`	A `list`, each component of which is named for a column in the `data.frame` for `asreml.obj` and contains a `character string` giving a title to use in output (e.g. tables and graphs). Here they will be used for axis labels.
`colour.scheme`	A character string specifying the colour scheme for the plots. The default is `"colour"` which produces coloured lines and bars, a grey background and white gridlines. A value of `"black"` results in black lines, grey bars and gridlines and a white background.
`save.plots`	A `logical` that determines whether any plots will be saved. If they are to be saved, a file name will be generated that consists of the following elements separated by full stops: the response variable name with `.back` if backtransformed values are being plotted, the classify term, `Bar` or `Line` and, if `error.intervals` is not `"none"`, one of `SE`, `CI` or `LSI`. The file will be saved as a ‘png’ file in the current working directory.
`transform.power`	A `numeric` specifying the power of a transformation, if one has been applied to the response variable. Unless it is equal to 1, the default, back-transforms of the predictions will be obtained and stored in the `backtransforms` component of the `alldiffs.object`. The `plots` and `tables` arguments control the plotting and output of the `predictions` and `backtransforms`. The back-transformation raises the predictions to the power equal to the reciprocal of `transform.power`, unless it equals 0 in which case the exponential of the predictions is taken.
`offset`	A number that has been added to each value of the response after any scaling and before applying any power transformation. Unless it is equal to 0, the default, back-transforms of the predictions will be obtained and stored in the `backtransforms` component of the `alldiffs.object`. The `plots` and `tables` arguments control the plotting and output of the `predictions` and `backtransforms`. The backtransformation will, after backtransforming for any power transformation, subtract the `offset`.
`scale`	A number by which each value of the response has been multiplied before adding any offset and applying any power transformation. Unless it is equal to 1, the default, back-transforms of the predictions will be obtained and stored in the `backtransforms` component of the `alldiffs.object`. The `plots` and `tables` arguments control the plotting and output of the `predictions` and `backtransforms`. The backtransformation will, after backtransforming for any power transformation and then subtracting the offset, divide by the `scale`.
`transform.function`	A `character` giving the name of a function that specifies the scale on which the predicted values are defined. This may be the result of a transformation of the data using the function or the use of the function as a link function in the fitting of a generalized linear (mixed) model (GL(M)M). The possible `transform.function`s are `identity`, `log`, `inverse`, `sqrt`, `logit`, `probit`, and `cloglog`. The `predicted.values` and `error.intervals`, if not `StandardError` intervals, will be back-transformed using the inverse function of the `transform.function`. The `standard.error` column will be set to `NA`, unless (i) `asreml` returns columns named `transformed.value` and `approx.se`, as well as those called `predicted.values` and `standard.error` (such as when a GLM is fitted) and (ii) the values in `transformed.value` are equal to those obtained by backtransforming the `predicted.value`s using the inverse function of the `transform.function`. Then, the `approx.se` values will be saved in the `standard.error` column of the `backtransforms` component of the returned `alldiffs.obj`. Also, the `transformed.value` and `approx.se` columns are removed from both the `predictions` and `backtransforms` components of the `alldiffs.obj`. Note that the values that end up in the `standard errors` column are approximate for the backtransformed values and are not used in calculating `error.intervals`.
`tables`	A `character vector` containing a combination of `predictions`, `vcov`, `backtransforms`, `differences`, `p.differences`, `sed`, `LSD` and `all`. These nominate which components of the `alldiffs.object` to print.
`level.length`	The maximum number of characters from the levels of factors to use in the row and column labels of the tables produced by `allDifferences.data.frame`.
`sortFactor`	A `character` containing the name of the `factor` that indexes the set of predicted values that determines the sorting of the components. If there is only one variable in the `classify` term then `sortFactor` can be `NULL` and the order is defined by the complete set of predicted values. If there is more than one variable in the `classify` term then `sortFactor` must be set. In this case the `sortFactor` is sorted in the same order within each combination of the values of the `sortParallelToCombo` variables: the `classify` variables, excluding the `sortFactor`. There should be only one predicted value for each unique value of `sortFactor` within each set defined by a combination of the values of the `classify` variables, excluding the `sortFactor` `factor`. The order to use is determined by either `sortParallelToCombo` or `sortOrder`.
`sortParallelToCombo`	A `list` that specifies a combination of the values of the `factor`s and `numeric`s, excluding `sortFactor`, that are in `classify`. Each of the components of the supplied `list` is named for a `classify` variable and specifies a single value for it. The combination of this set of values will be used to define a subset of the predicted values whose order will define the order of `sortFactor`. Each of the other combinations of the values of the `factor`s and `numeric`s will be sorted in parallel. If `sortParallelToCombo` is `NULL` then the first value of each `classify` variable, except for the `sortFactor` `factor`, in the `predictions` component is used to define `sortParallelToCombo`. If there is only one variable in the `classify` then `sortParallelToCombo` is ignored.
`sortNestingFactor`	A `character` containing the name of the `factor` that defines groups of the `sortFactor` within which the predicted values are to be ordered. If there is only one variable in the `classify` then `sortNestingFactor` is ignored.
`sortOrder`	A `character vector` whose length is the same as the number of levels for `sortFactor` in the `predictions` component of the `alldiffs.object`. It specifies the desired order of the levels in the reordered components of the `alldiffs.object`. The argument `sortParallelToCombo` is ignored. The following creates a `sortOrder` vector `levs` for factor `f` based on the values in `x`: `levs <- levels(f)[order(x)]`.
`decreasing`	A `logical` passed to `order` that determines whether the order for sorting the components of the `alldiffs.object` is for increasing or decreasing magnitude of the predicted values.
`trace`	If TRUE then partial iteration details are displayed when ASReml-R functions are invoked; if FALSE then no output is displayed.
`ggplotFuncs`	A `list`, each element of which contains the results of evaluating a `ggplot2` function. It is created by calling the `list` function with a `ggplot2` function call for each element. It is passed to `plotPredictions.data.frame`.
`...`	Further arguments passed to `predict.asreml` via `predictPlus.asreml` and to `ggplot` via `plotPredictions.data.frame`.
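
The order of operations described for `transform.power`, `offset` and `scale` can be sketched in base R as follows. The function `backtransform_sketch` is a hypothetical helper written for illustration; it is not part of asremlPlus and makes no claim about how the package implements the back-transformation internally.

```r
# Hypothetical helper (not an asremlPlus function) that follows the order
# described for transform.power, offset and scale
backtransform_sketch <- function(pred, transform.power = 1, offset = 0, scale = 1) {
  # Undo the power transformation: exp() when the power is 0 (a log
  # transformation), otherwise raise to the reciprocal of the power
  if (transform.power == 0) {
    y <- exp(pred)
  } else {
    y <- pred^(1 / transform.power)
  }
  # Subtract the offset and then divide by the scale
  (y - offset) / scale
}

# Forward transformation of a response value of 5:
# multiply by a scale of 2, add an offset of 1, then take logs
z <- log(5 * 2 + 1)

# Back-transforming recovers the original value
backtransform_sketch(z, transform.power = 0, offset = 1, scale = 2)
```

The forward transformation in the sketch scales first, then adds the offset, then applies the power transformation, so the back-transformation undoes these steps in reverse order.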

Value

A list containing an alldiffs.object for each term for which tables are produced. The names of the components of this list are the terms with full-stops (.) replacing colons (:). Plots are also produced depending on the setting of the plots argument.
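
The naming rule for the components of the returned list can be illustrated with base R. This is only a sketch of the convention stated above, using hypothetical term names; how the package derives the names internally is not shown here.

```r
# Hypothetical terms of the kind supplied in the terms argument
terms <- c("Sources:Species", "Date:Sources:Species:xDay", "Type")

# Replace each colon with a full stop to mimic the component names
component.names <- gsub(":", ".", terms, fixed = TRUE)

print(component.names)
```

Under this rule, the component for a term such as `"Date:Sources:Species:xDay"` would be extracted from the returned list using the name `"Date.Sources.Species.xDay"`.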

Author(s)

Chris Brien

Examples

## Not run: 
data(WaterRunoff.dat)
titles <- list("Days since first observation", "Days since first observation", 
               "pH", "Turbidity (NTU)")
names(titles) <- names(WaterRunoff.dat)[c(5,7,11:12)]
asreml.options(keep.order = TRUE) #required for asreml-R4 only
current.asr <- asreml(fixed = log.Turbidity ~ Benches + Sources + Type + Species + 
                                 Sources:Type + Sources:Species + Sources:Species:xDay + 
                                 Sources:Species:Date, 
                      data = WaterRunoff.dat, keep.order = TRUE)
current.asrt <- as.asrtests(current.asr, NULL, NULL)

#### Get the observed combinations of the factors and variables in classify
class.facs <- c("Sources","Species","Date","xDay")
levs <- as.data.frame(table(WaterRunoff.dat[class.facs]))
levs <- levs[do.call(order, levs), ]
levs <- as.list(levs[levs$Freq != 0, class.facs])
levs$xDay <- as.numfac(levs$xDay)
  
#### parallel and levels are arguments from predict.asreml
diff.list <- predictPresent.asreml(asreml.obj = current.asrt$asreml.obj, 
                                   terms = "Date:Sources:Species:xDay",
                                   x.num = "xDay", x.fac = "Date", 
                                   parallel = TRUE, levels = levs, 
                                   wald.tab = current.asrt$wald.tab, 
                                   plots = "predictions", 
                                   error.intervals = "StandardError", 
                                   titles = titles, 
                                   transform.power = 0, 
                                   present = c("Type","Species","Sources"), 
                                   tables = "none", 
                                   level.length = 6)

## End(Not run)
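
The `mean` option of `LSDstatistic` and the `maxAbsDeviation` option of `LSDaccuracy`, as described under Arguments, can be sketched in base R without a fitted model. The LSD values and the threshold below are hypothetical, and the final flag is only a simplified, group-level analogue of the per-prediction `LSDwarning` column; it is not the package's own calculation.

```r
# Hypothetical LSDs for a group of pairwise comparisons (illustrative values only)
LSDs <- c(1, 7)

# "mean" LSD: the square root of the mean of the squared LSDs
assignedLSD <- sqrt(mean(LSDs^2))

# maxAbsDeviation: the largest absolute deviation of the LSDs from the
# assigned LSD, expressed as a proportion of the assigned LSD
accuracyLSD <- max(abs(LSDs - assignedLSD)) / assignedLSD

# Simplified analogue of LSDwarning for a hypothetical accuracy.threshold
accuracy.threshold <- 0.10
LSDwarning <- accuracyLSD > accuracy.threshold

print(c(assignedLSD = assignedLSD, accuracyLSD = accuracyLSD))
print(LSDwarning)
```

With LSDs as dissimilar as these, the accuracy measure is far above the threshold, which is the situation in which a single half-LSD interval would be a poor approximation.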

[Package asremlPlus version 4.4.35 Index]