R: Cross-validation procedure

cv_ammi {metan}

R Documentation

Cross-validation procedure

Description

Cross-validation for estimation of AMMI models

The original dataset is split into two datasets: a training set and a validation set. The 'training' set has all genotype x environment combinations with N-1 replications. The 'validation' set has the remaining replication. How the dataset is split into modeling and validation sets depends on the design specified. For a Randomized Complete Block Design (default) and an alpha-lattice design (declared through the block argument), complete replicates are selected within environments, and the remaining replicate serves as validation data. If design = 'CRD' is specified, completely random samples are drawn for each genotype-by-environment combination (Olivoto et al. 2019).

The values estimated with naxis Interaction Principal Component Axes are compared with the 'validation' data, and the Root Mean Square Prediction Difference (RMSPD) is computed. After all nboot resamples are completed, a list is returned.
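The two splitting schemes can be sketched in base R as follows. This is only an illustration under stated assumptions, not the code used inside cv_ammi(): the toy data frame `trial` and all object names are hypothetical, and the sampling details in the package may differ.

```r
# Toy balanced trial (hypothetical data): 2 environments x 3 genotypes x 3 replicates
set.seed(1)
trial <- expand.grid(ENV = c("E1", "E2"),
                     GEN = c("G1", "G2", "G3"),
                     REP = 1:3)
trial$GY <- round(rnorm(nrow(trial), mean = 3, sd = 0.5), 2)

# Block-design style split: hold out one complete replicate within each environment
held_rep <- sapply(split(trial$REP, trial$ENV),
                   function(r) sample(unique(r), 1))
is_valid_rcbd <- trial$REP == unname(held_rep[as.character(trial$ENV)])
train_rcbd <- trial[!is_valid_rcbd, ]
valid_rcbd <- trial[is_valid_rcbd, ]

# CRD style split: hold out one randomly chosen observation
# for each genotype-by-environment combination
cell <- interaction(trial$ENV, trial$GEN, drop = TRUE)
held_row <- sapply(split(seq_len(nrow(trial)), cell),
                   function(i) i[sample.int(length(i), 1)])
is_valid_crd <- seq_len(nrow(trial)) %in% held_row
train_crd <- trial[!is_valid_crd, ]
valid_crd <- trial[is_valid_crd, ]
```

In both cases the training set keeps N-1 observations per genotype-by-environment combination and the validation set keeps one; the schemes differ only in whether the held-out observations form a complete replicate within each environment.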

IMPORTANT: If the data set is unbalanced (i.e., any genotype is missing in any environment), the function will return an error. An error is also raised if any genotype-environment combination has a number of replications different from that of the rest of the trial.
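Before calling the function, the balance requirement can be checked with a small helper such as the one below. This is a minimal sketch: `check_balance()` is a hypothetical helper written for illustration and is not part of metan.

```r
# Hypothetical helper (not part of metan): counts the observations in each
# environment x genotype cell and reports whether the trial is balanced
check_balance <- function(data, env, gen) {
  counts <- table(data[[env]], data[[gen]])
  list(
    all_present = all(counts > 0),                          # no missing combination
    equal_reps  = length(unique(as.vector(counts))) == 1,   # same count in every cell
    counts      = counts
  )
}

# Toy data: a balanced trial, then the same trial with one observation removed
balanced <- expand.grid(ENV = c("E1", "E2"),
                        GEN = c("G1", "G2", "G3"),
                        REP = 1:3)
unbalanced <- balanced[-1, ]

res_balanced   <- check_balance(balanced, "ENV", "GEN")
res_unbalanced <- check_balance(unbalanced, "ENV", "GEN")
res_unbalanced$counts  # shows which cell has fewer replications
```

If either `all_present` or `equal_reps` is FALSE, the data should be fixed or subset before cross-validation.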

Usage

cv_ammi(
  .data,
  env,
  gen,
  rep,
  resp,
  block = NULL,
  naxis = 2,
  nboot = 200,
  design = "RCBD",
  verbose = TRUE
)

Arguments

`.data`	The dataset containing the columns related to Environments, Genotypes, replication/block and response variable(s).
`env`	The name of the column that contains the levels of the environments.
`gen`	The name of the column that contains the levels of the genotypes.
`rep`	The name of the column that contains the levels of the replications/blocks. At least three replicates are required to perform the cross-validation.
`resp`	The response variable.
`block`	Defaults to `NULL`. In this case, a randomized complete block design is considered. If `block` is supplied, then a resolvable alpha-lattice design (Patterson and Williams, 1976) is employed. All effects, except the error, are assumed to be fixed.
`naxis`	The number of axes to be considered for estimation of GE effects. Defaults to 2.
`nboot`	The number of resamples to be used in the cross-validation. Defaults to 200.
`design`	The experimental design. Defaults to `RCBD` (Randomized Complete Block Design). For Completely Randomized Designs, use `design = 'CRD'`.
`verbose`	A logical argument to define if a progress bar is shown. Default is `TRUE`.
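To illustrate what `naxis` controls, the sketch below builds AMMI estimates from a genotype-by-environment matrix of means: additive main effects plus the first `naxis` terms of the singular value decomposition of the interaction residuals. It assumes the textbook AMMI formulation; `ammi_predict()` and the toy matrix `means` are hypothetical and are not taken from the package source.

```r
# Hypothetical helper (not part of metan): AMMI estimates from a matrix of
# cell means (genotypes in rows, environments in columns)
ammi_predict <- function(means, naxis) {
  grand <- mean(means)
  g_eff <- rowMeans(means) - grand            # genotype main effects
  e_eff <- colMeans(means) - grand            # environment main effects
  additive <- grand + outer(g_eff, e_eff, "+")
  if (naxis == 0) return(additive)            # additive model only
  ge <- means - additive                      # double-centered GE residuals
  s <- svd(ge)
  k <- seq_len(naxis)
  # Add back the first naxis multiplicative terms
  additive +
    s$u[, k, drop = FALSE] %*% diag(s$d[k], nrow = naxis) %*%
    t(s$v[, k, drop = FALSE])
}

# Toy means for 4 genotypes in 3 environments (hypothetical values)
means <- matrix(c(2.5, 3.0, 3.5, 2.8,
                  3.1, 2.9, 3.8, 3.0,
                  2.7, 3.6, 3.2, 3.3),
                nrow = 4,
                dimnames = list(paste0("G", 1:4), paste0("E", 1:3)))

pred0 <- ammi_predict(means, naxis = 0)
pred1 <- ammi_predict(means, naxis = 1)
pred2 <- ammi_predict(means, naxis = 2)
```

With 4 genotypes and 3 environments the interaction matrix has at most 2 non-zero axes, so `pred2` reproduces `means`, while `pred1` discards the second axis. Cross-validation helps judge how many axes carry signal rather than noise.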

Value

An object of class cv_ammi with the following items:

* RMSPD: A vector with nboot estimates of the Root Mean Square Prediction Difference between predicted and validation data.
* RMSPDmean: The mean of the RMSPD estimates.
* Estimated: A data frame that contains the values (predicted, observed, validation) of the last loop.
* Modeling: The dataset used as modeling data in the last loop.
* Testing: The dataset used as testing data in the last loop.
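For reference, the RMSPD of a single resample can be computed as sketched below. This assumes the usual root-mean-square definition over all genotype-by-environment combinations; the function name `rmspd()` and the numeric values are hypothetical, and the exact expression used by the package should be checked in its source.

```r
# Hypothetical helper (not part of metan): root mean square prediction difference
rmspd <- function(predicted, validation) {
  sqrt(mean((predicted - validation)^2))
}

# Toy values for one resample (hypothetical)
predicted  <- c(2.8, 3.1, 3.4, 2.9)
validation <- c(3.0, 3.0, 3.5, 2.5)
one_resample <- rmspd(predicted, validation)

# Averaging per-resample values gives a summary analogous to RMSPDmean
rmspd_values <- c(one_resample, 0.31, 0.27)
rmspd_mean <- mean(rmspd_values)
```

Lower values indicate that the model estimates are closer to the held-out validation data.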

Author(s)

Tiago Olivoto tiagoolivoto@gmail.com

References

Olivoto, T., A.D.C. Lúcio, J.A.G. da Silva, V.S. Marchioro, V.Q. de Souza, and E. Jost. 2019. Mean performance and stability in multi-environment trials I: Combining features of AMMI and BLUP techniques. Agron. J. 111:2949-2960. doi:10.2134/agronj2019.03.0220

Patterson, H.D., and E.R. Williams. 1976. A new class of resolvable incomplete block designs. Biometrika 63:83-92.

Examples



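library(metan)
# Cross-validate an AMMI model with two interaction principal component axes
# using the data_ge example dataset; nboot is kept small so the example runs quickly
model <- cv_ammi(data_ge,
                env = ENV,
                gen = GEN,
                rep = REP,
                resp = GY,
                nboot = 5,
                naxis = 2)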

[Package metan version 1.18.0 Index]