allDifferences.data.frame {asremlPlus}

R Documentation

Using supplied predictions and standard errors of pairwise differences or the variance matrix of predictions, forms all pairwise differences between the set of predictions, and p-values for the differences.

Description

Uses supplied predictions and standard errors of pairwise differences, or the variance matrix of predictions to form, in an alldiffs.object, for those components not already present, (i) a table of all pairwise differences of the predictions, (ii) the p-value of each pairwise difference, and (iii) the minimum, mean, maximum and accuracy of LSD values. Predictions that are aliased (or inestimable) are removed from the predictions component of the alldiffs.object and standard errors of differences involving them are removed from the sed component.

If necessary, the columns of the predictions component are reordered so that the classify variables are the initial columns of the predictions.frame and appear in the same order as in the classify. Also, the rows of the predictions component are ordered so that they are in standard order for the variables in the classify. That is, the values of the last variable change with every row, those of the second-last variable only change after all the values of the last variable have been traversed; in general, the values of a variable are the same for all the combinations of the values of the variables to its right in the classify. The sortFactor or sortOrder arguments can be used to order the values of the classify variables, which is achieved using sort.alldiffs.
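The standard order described above can be illustrated with a minimal base R sketch. The data frame `preds` and its values are hypothetical, and the code only mimics the ordering rule; it is not how `allDifferences` is implemented.

```r
# Hypothetical predictions for a 2 x 3 table, deliberately out of order
preds <- data.frame(
  Variety  = factor(c("B", "A", "C", "A", "C", "B")),
  Nitrogen = factor(c("0.2", "0", "0", "0.2", "0.2", "0")),
  predicted.value = c(98, 80, 87, 99, 105, 82)
)

# Variables of the classify "Nitrogen:Variety", in the order given
classify.vars <- strsplit("Nitrogen:Variety", ":", fixed = TRUE)[[1]]

# Put the classify variables first, in classify order
preds <- preds[, c(classify.vars, setdiff(names(preds), classify.vars))]

# Sort the rows so that the last classify variable (Variety) changes fastest
std <- preds[do.call(order, unname(as.list(preds[classify.vars]))), ]
rownames(std) <- NULL
std
```

In the result, all Variety levels are listed for the first Nitrogen level before Nitrogen moves to its second level.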

Each p-value is computed as the probability of a t-statistic as large as or larger than the absolute value of the observed difference divided by its standard error. The p-values are stored in the p.differences component. The degrees of freedom of the t-distribution are those stored in the tdf attribute of the alldiffs.object. This t-distribution is also used in calculating the LSD statistics stored in the LSD component of the alldiffs.object.
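The calculations described above can be sketched in base R as follows. The predictions, variance matrix and degrees of freedom are hypothetical, and the sketch assumes a two-sided p-value and an LSD equal to the t quantile times the standard error of the difference; the package's internal code may differ in detail.

```r
# Hypothetical predictions, their variance matrix and degrees of freedom
pred <- c(A = 10, B = 12, C = 15)
V <- matrix(c(1.0, 0.2, 0.2,
              0.2, 1.0, 0.2,
              0.2, 0.2, 1.0), nrow = 3,
            dimnames = list(names(pred), names(pred)))
tdf <- 10
alpha <- 0.05

# All pairwise differences: element [i, j] is pred[i] - pred[j]
differences <- outer(pred, pred, "-")

# Variance of a difference: v_ii + v_jj - 2 * v_ij
var.diff <- outer(diag(V), diag(V), "+") - 2 * V
sed <- sqrt(var.diff)
diag(sed) <- NA   # no standard error for a prediction minus itself

# p-value from the t-distribution (two-sidedness is an assumption)
t.stat <- abs(differences) / sed
p.differences <- 2 * pt(t.stat, df = tdf, lower.tail = FALSE)

# LSD for each pair: t quantile times the standard error of the difference
LSD <- qt(1 - alpha / 2, df = tdf) * sed
```

Under these assumptions, a difference exceeds its LSD in absolute value exactly when its p-value is below `alpha`.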

Usage

## S3 method for class 'data.frame'
allDifferences(predictions, classify, vcov = NULL, 
               differences = NULL, p.differences = NULL, sed = NULL, 
               LSD = NULL, LSDtype = "overall", LSDsupplied = NULL, 
               LSDby = NULL, LSDstatistic = "mean", 
               LSDaccuracy = "maxAbsDeviation", 
               retain.zeroLSDs = FALSE, 
               zero.tolerance = .Machine$double.eps ^ 0.5, 
               backtransforms = NULL, 
               response = NULL, response.title = NULL, 
               term = NULL, tdf = NULL,  
               x.num = NULL, x.fac = NULL,  
               level.length = NA, 
               pairwise = TRUE, alpha = 0.05,
               transform.power = 1, offset = 0, scale = 1, 
               transform.function = "identity", 
               inestimable.rm = TRUE,
               sortFactor = NULL, sortParallelToCombo = NULL, 
               sortNestingFactor = NULL, sortOrder = NULL, 
               decreasing = FALSE, ...)

Arguments

`predictions`	A `predictions.frame`, or a `data.frame`, beginning with the variables classifying the predictions and also containing columns named `predicted.value`, `standard.error` and `est.status`; each row contains a single predicted value. It may also contain columns for the lower and upper limits of error intervals for the predictions. Note that the names `standard.error` and `est.status` have been changed to `std.error` and `status` in the `pvals` component produced by `asreml-R4`; if the new names are in the `data.frame` supplied to `predictions`, they will be returned to the previous names.
`classify`	A `character` string giving the variables that define the margins of the multiway table that has been predicted. Multiway tables are specified by forming an interaction type term from the classifying variables, that is, separating the variable names with the `:` operator.
`vcov`	A `matrix` containing the variance matrix of the predictions; it is used in computing the variance of linear transformations of the predictions.
`differences`	A `matrix` containing all pairwise differences between the predictions; it should have the same number of rows and columns as there are rows in `predictions`.
`p.differences`	A `matrix` containing p-values for all pairwise differences between the predictions; each p-value is computed as the probability of a t-statistic as large as or larger than the observed difference divided by its standard error. The degrees of freedom of the t distribution for computing it are computed as the denominator degrees of freedom of the F value for the fixed term, if available; otherwise, the degrees of freedom stored in the attribute `tdf` are used; the matrix should be of the same size as that for `differences`.
`sed`	A `matrix` containing the standard errors of all pairwise differences between the predictions; they are used in computing the p-values.
`LSD`	An `LSD.frame` containing the mean, minimum and maximum LSD for determining the significance of pairwise differences, as well as an assigned LSD and a measure of the accuracy of the LSD. If `LSD` is `NULL` then the `LSD.frame` stored in the `LSD` component will be calculated and the values of `LSDtype`, `LSDby` and `LSDstatistic` added as attributes of the `alldiffs.object`. The LSD for a single prediction assumes that any predictions to be compared are independent; this is not the case if residual errors are correlated.
`LSDtype`	A `character` string that can be `overall`, `factor.combinations`, `per.prediction` or `supplied`. It determines whether the values stored in a row of a `LSD.frame` are the values calculated (i) `overall` from the LSD values for all pairwise comparisons, (ii) the values calculated from the pairwise LSDs for the levels of each `factor.combination`, unless there is only one prediction for a level of the `factor.combination`, when a notional LSD is calculated, (iii) `per.prediction`, being based, for each prediction, on all pairwise differences involving that prediction, or (iv) as `supplied` values of the LSD, specified with the `LSDsupplied` argument; these supplied values are to be placed in the `assignedLSD` column of the `LSD.frame` stored in an `alldiffs.object` so that they can be used in LSD calculations. See `LSD.frame` for further information on the values in a row of this `data.frame` and how they are calculated.
`LSDsupplied`	A `data.frame` or a named `numeric` containing a set of `LSD` values that correspond to the observed combinations of the values of the `LSDby` variables in the `predictions.frame` or a single LSD value that is an overall LSD. If a `data.frame`, it may have (i) a column for the `LSDby` variable and a column of `LSD` values or (ii) a single column of `LSD` values with rownames being the combinations of the observed values of the `LSDby` variables. Any name can be used for the column of `LSD` values; `assignedLSD` is sensible, but not obligatory. Otherwise, a `numeric` containing the `LSD` values, each of which is named for the observed combination of the values of the `LSDby` variables to which it corresponds. (Applying the `function` `dae::fac.combine` to the `predictions` component is one way of forming the required combinations for the (row) names.) The values supplied will be incorporated into the `assignedLSD` column of the `LSD.frame` stored as the `LSD` component of the `alldiffs.object`.
`LSDby`	A `character` (vector) of variable names, being the names of the `factors` or `numerics` in the `classify`; for each combination of their levels and values, there will be or is a row in the `LSD.frame` stored in the `LSD` component of the `alldiffs.object` when `LSDtype` is `factor.combinations`.
`LSDstatistic`	A `character` nominating one or more of `minimum`, `q10`, `q25`, `mean`, `median`, `q75`, `q90` or `maximum` as the value(s) to be stored in the `assignedLSD` column in an `LSD.frame`; the values in the `assignedLSD` column are used in computing `halfLeastSignificant` `error.intervals`. Here `q10`, `q25`, `q75` and `q90` indicate the sample quantiles corresponding to probabilities of 0.1, 0.25, 0.75 and 0.9 for the group of LSDs from which a single LSD value is calculated. The function `quantile` is used to obtain them. The `mean` LSD is calculated as the square root of the mean of the squares of the LSDs for the group. The `median` is calculated using the `median` function. Multiple values are only produced for `LSDtype` set to `factor.combinations`, in which case `LSDby` must not be `NULL` and the number of values must equal the number of observed combinations of the values of the variables specified by `LSDby`. If `LSDstatistic` is `NULL`, it is reset to `mean`.
`LSDaccuracy`	A `character` nominating one of `maxAbsDeviation`, `maxDeviation`, `q90Deviation` or `RootMeanSqDeviation` as the statistic to be calculated as a measure of the accuracy of `assignedLSD`. The option `q90Deviation` produces the sample quantile corresponding to a probability of 0.90. The deviations are the differences between the LSDs used in calculating the LSD statistics and each assigned LSD and the accuracy is expressed as a proportion of the assigned LSD value. The calculated values are stored in the column named `accuracyLSD` in an `LSD.frame`.
`retain.zeroLSDs`	A `logical` indicating whether to retain or omit LSDs that are zero when calculating the summaries of LSDs.
`zero.tolerance`	A `numeric` specifying the value such that if an LSD is less than it, the LSD will be considered to be zero.
`backtransforms`	A `data.frame` containing the backtransformed values of the predicted values that is consistent with the `predictions` component, except that the column named `predicted.value` is replaced by one called `backtransformed.predictions`. Any `error.interval` values will also be the backtransformed values. Each row contains a single predicted value.
`response`	A `character` specifying the response variable for the predictions. It is stored as an attribute to the `alldiffs.object`.
`response.title`	A `character` specifying the title for the response variable for the predictions. It is stored as an attribute to the `alldiffs.object`.
`term`	A `character` string giving the variables that define the term that was fitted using `asreml` and that corresponds to `classify`. It only needs to be specified when it is different to `classify`; it is stored as an attribute of the `alldiffs.object`. It is likely to be needed when the fitted model includes terms that involve both a `numeric` covariate and a `factor` that parallel each other; the `classify` would include the covariate and the `term` would include the `factor`.
`tdf`	An `integer` specifying the degrees of freedom of the standard error. It is used as the degrees of freedom for the t-distribution on which p-values and confidence intervals are based. It is stored as an attribute to the `alldiffs.object`.
`x.num`	A `character` string giving the name of the numeric covariate that (i) is potentially included in terms in the fitted model and (ii) is the x-axis variable for plots. Its values will not be converted to a `factor`.
`x.fac`	A `character` string giving the name of the factor that (i) corresponds to `x.num` and (ii) is potentially included in terms in the fitted model. It should have the same number of levels as the number of unique values in `x.num`. The levels of `x.fac` must be in the order in which they are to be plotted - if they are dates, then they should be in the form yyyymmdd, which can be achieved using `as.Date`. However, the levels can be non-numeric in nature, provided that `x.num` is also set.
`level.length`	The maximum number of characters from the levels of factors to use in the row and column labels of the tables of pairwise differences and their p-values and standard errors.
`pairwise`	A `logical` indicating whether all pairwise differences of the `predictions` and their standard errors and p-values are to be computed and stored. If `FALSE`, the components `differences` and `p.differences` will be `NULL` in the returned `alldiffs.object`.
`alpha`	A `numeric` giving the significance level for LSDs or one minus the confidence level for confidence intervals. It is stored as an attribute to the `alldiffs.object`.
`transform.power`	A `numeric` specifying the power of a transformation, if one has been applied to the response variable. Unless it is equal to 1, the default, back-transforms of the predictions will be obtained and presented in tables or graphs as appropriate. The back-transformation raises the predictions to the power equal to the reciprocal of `transform.power`, unless it equals 0 in which case the exponential of the predictions is taken.
`offset`	A `numeric` that has been added to each value of the response after any scaling and before applying any power transformation.
`scale`	A `numeric` by which each value of the response has been multiplied before adding any offset and applying any power transformation.
`transform.function`	A `character` giving the name of a function that specifies the scale on which the predicted values are defined. This may be the result of a transformation of the data using the function or the use of the function as a link function in the fitting of a generalized linear (mixed) model (GL(M)M). The possible `transform.function`s are `identity`, `log`, `inverse`, `sqrt`, `logit`, `probit`, and `cloglog`. The `predicted.values` and `error.intervals`, if not `StandardError` intervals, will be back-transformed using the inverse function of the `transform.function`. The `standard.error` column will be set to `NA`, unless (i) `asreml` returns columns named `transformed.value` and `approx.se`, as well as those called `predicted.value` and `standard.error` (such as when a GLM is fitted) and (ii) the values in `transformed.value` are equal to those obtained by backtransforming the `predicted.value`s using the inverse function of the `transform.function`. Then, the `approx.se` values will be saved in the `standard.error` column of the `backtransforms` component of the returned `alldiffs.object`. Also, the `transformed.value` and `approx.se` columns are removed from both the `predictions` and `backtransforms` components of the `alldiffs.object`. Note that the values that end up in the `standard.error` column are approximate for the backtransformed values and are not used in calculating `error.intervals`.
`inestimable.rm`	A `logical` indicating whether rows for predictions that are not estimable are to be removed from the components of the `alldiffs.object`.
`sortFactor`	A `character` containing the name of the `factor` that indexes the set of predicted values that determines the sorting of the components. If there is only one variable in the `classify` term then `sortFactor` can be `NULL` and the order is defined by the complete set of predicted values. If there is more than one variable in the `classify` term then `sortFactor` must be set. In this case the `sortFactor` is sorted in the same order within each combination of the values of the `sortParallelToCombo` variables: the `classify` variables, excluding the `sortFactor`. There should be only one predicted value for each unique value of `sortFactor` within each set defined by a combination of the values of the `classify` variables, excluding the `sortFactor` `factor`. The order to use is determined by either `sortParallelToCombo` or `sortOrder`.
`sortParallelToCombo`	A `list` that specifies a combination of the values of the `factor`s and `numeric`s, excluding `sortFactor`, that are in `classify`. Each of the components of the supplied `list` is named for a `classify` variable and specifies a single value for it. The combination of this set of values will be used to define a subset of the predicted values whose order will define the order of `sortFactor`. Each of the other combinations of the values of the `factor`s and `numeric`s will be sorted in parallel. If `sortParallelToCombo` is `NULL` then the first value of each `classify` variable, except for the `sortFactor` `factor`, in the `predictions` component is used to define `sortParallelToCombo`. If there is only one variable in the `classify` then `sortParallelToCombo` is ignored.
`sortNestingFactor`	A `character` containing the name of the `factor` that defines groups of the `sortFactor` within which the predicted values are to be ordered. If there is only one variable in the `classify` then `sortNestingFactor` is ignored.
`sortOrder`	A `character vector` whose length is the same as the number of levels for `sortFactor` in the `predictions` component of the `alldiffs.object`. It specifies the desired order of the levels in the reordered components of the `alldiffs.object`. The argument `sortParallelToCombo` is ignored. The following creates a `sortOrder` vector `levs` for factor `f` based on the values in `x`: `levs <- levels(f)[order(x)]`.
`decreasing`	A `logical` passed to `order` that determines whether the order for sorting the components of the `alldiffs.object` is for increasing or decreasing magnitude of the predicted values.
`...`	Provision for passing arguments to functions called internally - not used at present.
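As an illustration of the `LSDstatistic` and `LSDaccuracy` descriptions above, the following base R sketch computes a `mean` assigned LSD and two accuracy measures for a hypothetical group of pairwise LSDs. The vector `lsds` and all object names are assumptions made for the example, and the sign convention for the deviations is a guess; see `LSD.frame` for the definitions actually used.

```r
# Hypothetical LSDs for all pairwise comparisons within one group
lsds <- c(2.5, 2.8, 3.1, 2.9, 3.4, 2.7)

# "mean" LSDstatistic: square root of the mean of the squared LSDs
assignedLSD <- sqrt(mean(lsds^2))

# Quantile-based alternatives, here q10 and q90, obtained with quantile()
q.lsds <- quantile(lsds, probs = c(0.10, 0.90))

# Deviations of the individual LSDs from the assigned LSD
# (the direction of the subtraction is an assumption)
devn <- lsds - assignedLSD

# maxAbsDeviation, expressed as a proportion of the assigned LSD
accuracyLSD <- max(abs(devn)) / assignedLSD

# A root-mean-square analogue, again as a proportion of the assigned LSD
rmsd <- sqrt(mean(devn^2)) / assignedLSD
```

A small `accuracyLSD` suggests that the single assigned LSD is a reasonable stand-in for every pairwise LSD in the group.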

Value

An alldiffs.object with components predictions, vcov, differences, p.differences, sed, and LSD.

The name of the response, the response.title, the term, the classify, tdf, alpha, sortFactor and the sortOrder will be set as attributes to the object. Note that the classify in an alldiffs.object is based on the variables indexing the predictions, which may differ from the classify used to obtain the original predictions (for example, when the alldiffs.object stores a linear transformation of predictions).

Also, see predictPlus.asreml for more information.

Author(s)

Chris Brien

Examples

  data(Oats.dat)
  
  ## Use asreml to get predictions and associated statistics

  ## Not run: 
  m1.asr <- asreml(Yield ~ Nitrogen*Variety, 
                   random=~Blocks/Wplots,
                   data=Oats.dat)
  current.asrt <- as.asrtests(m1.asr)
  Var.pred <- asreml::predict.asreml(m1.asr, classify="Nitrogen:Variety", 
                                      sed=TRUE)
  if (getASRemlVersionLoaded(nchar = 1) == "3")
    Var.pred <- Var.pred$predictions
  Var.preds <- Var.pred$pvals
  Var.sed <- Var.pred$sed
  Var.vcov <- NULL
  wald.tab <-  current.asrt$wald.tab
  den.df <- wald.tab[match("Variety", rownames(wald.tab)), "denDF"]
  
## End(Not run)

  ## Use lmerTest and emmeans to get predictions and associated statistics
  if (requireNamespace("lmerTest", quietly = TRUE) && 
      requireNamespace("emmeans", quietly = TRUE))
  {
    m1.lmer <- lmerTest::lmer(Yield ~ Nitrogen*Variety + (1|Blocks/Wplots),
                              data=Oats.dat)
    Var.emm <- emmeans::emmeans(m1.lmer, specs = ~ Nitrogen:Variety)
    Var.preds <- summary(Var.emm)
    den.df <- min(Var.preds$df)
    ## Modify Var.preds to be compatible with a predictions.frame
    Var.preds <- as.predictions.frame(Var.preds, predictions = "emmean", 
                                      se = "SE", interval.type = "CI", 
                                      interval.names = c("lower.CL", "upper.CL"))
    Var.vcov <- vcov(Var.emm)
    Var.sed <- NULL
  }

  ## Use the predictions obtained with either asreml or lmerTest
  if (exists("Var.preds"))
  {
    ## Order the Varieties in decreasing order of the predicted values in the 
    ## first N level 
    Var.diffs <- allDifferences(predictions = Var.preds, 
                                classify = "Nitrogen:Variety", 
                                sed = Var.sed, vcov = Var.vcov, tdf = den.df,
                                sortFactor = "Variety", decreasing = TRUE)
    print.alldiffs(Var.diffs, which="differences")
  
    ## Change the order of the factors in the alldiffs object and reorder components
    Var.reord.diffs <- allDifferences(predictions = Var.preds,
                                classify = "Variety:Nitrogen", 
                                sed = Var.sed, vcov = Var.vcov, tdf = den.df)
    print.alldiffs(Var.reord.diffs, which="predictions")
  }

[Package asremlPlus version 4.4.35 Index]