perGeno {factReg} | R Documentation |
Genomic prediction using glmnet, with a genotype-specific penalized regression model.
Description
.... These models can be fitted either for the original data, or on the residuals of a model with only main effects.
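The residual-based variant described above can be illustrated with a minimal base-R sketch. It is not the package's implementation: it uses simulated data and ordinary least squares (`lm`) as a stand-in for the penalized glmnet fit, and all object names (`sim`, `x1`, `slope`, `sens`) are hypothetical.

```r
set.seed(1)

# Simulated balanced data: 5 genotypes x 8 environments (hypothetical names)
sim <- expand.grid(G = paste0("G", 1:5), E = paste0("E", 1:8))

# One environmental index, constant within each environment
x1env <- setNames(rnorm(8), paste0("E", 1:8))
sim$x1 <- unname(x1env[as.character(sim$E)])

# True genotype-specific sensitivities to the index
slope <- setNames(seq(-1, 1, length.out = 5), paste0("G", 1:5))
sim$Y <- 10 + slope[as.character(sim$G)] * sim$x1 + rnorm(nrow(sim), sd = 0.1)

# Step 1: fit a model with main effects only and keep its residuals
mainFit <- lm(Y ~ G + E, data = sim)
sim$res <- residuals(mainFit)

# Step 2: one regression per genotype of the residuals on the index
# (unpenalized here; perGeno uses a penalized regression instead)
sens <- sapply(split(sim, sim$G),
               function(d) coef(lm(res ~ x1, data = d))[["x1"]])
sens
```

Because the main-effects model has already absorbed the genotype and environment means, the per-genotype slopes estimate each genotype's deviation from the average sensitivity.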
Usage
perGeno(
dat,
Y,
G,
E,
indices = NULL,
indicesData = NULL,
testEnv = NULL,
weight = NULL,
useRes = TRUE,
outputFile = NULL,
corType = c("pearson", "spearman"),
partition = data.frame(),
nfolds = 10,
alpha = 1,
scaling = c("train", "all", "no"),
quadratic = FALSE,
verbose = FALSE
)
Arguments
dat |
A |
Y |
The trait to be analyzed: either of type character, in which case
it should be one of the column names in |
G |
The column in |
E |
The column in |
indices |
The columns in |
indicesData |
An optional |
testEnv |
Character vector. Data from these environments are not used
for fitting the model. Accuracy is evaluated for training and test
environments separately. The default is NULL. |
weight |
Numeric vector of length |
useRes |
Indicates whether the genotype-specific regressions are to be
fitted on the residuals of a model with main effects. If |
outputFile |
The file name of the output files, without .csv extension
which is added by the function. If not |
corType |
Type of correlation: Pearson (default) or Spearman rank correlation. |
partition |
|
nfolds |
Default 10. |
alpha |
Type of penalty, as in glmnet (1 = LASSO, 0 = ridge; in between = elastic net). Default is 1. |
scaling |
Determines how the environmental variables are scaled. "train": all data (test and training environments) are scaled using the mean and standard deviation in the training environments. "all": all data are scaled using the mean and standard deviation of all environments. "no": no scaling. |
quadratic |
boolean; default FALSE. |
verbose |
boolean; default FALSE. |
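The difference between the "train" and "all" options of scaling can be sketched in base R as follows. This is only an illustration under assumed data: the data frame `envDat`, the index `x1` and the choice of test environment are hypothetical, and the function itself may compute the mean and standard deviation over observations rather than over one value per environment.

```r
# Hypothetical index values, one per environment; E4 is the test environment
envDat <- data.frame(E = paste0("E", 1:4), x1 = c(1, 2, 3, 10))
testEnv <- "E4"
isTrain <- !(envDat$E %in% testEnv)

# scaling = "train": statistics come from the training environments only,
# but are applied to all environments (training and test)
mTrain <- mean(envDat$x1[isTrain])
sTrain <- sd(envDat$x1[isTrain])
xTrainScaled <- (envDat$x1 - mTrain) / sTrain

# scaling = "all": statistics come from all environments
xAllScaled <- (envDat$x1 - mean(envDat$x1)) / sd(envDat$x1)

# scaling = "no": the index is used as is
xNoScaled <- envDat$x1

data.frame(E = envDat$E, train = xTrainScaled, all = xAllScaled, no = xNoScaled)
```

With "train", no information from the test environments enters the scaling constants, so a test environment may end up far outside the range of the scaled training values, as E4 does here.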
Value
A list with the following elements:
- predTrain
Vector with predictions for the training set (to do: Add the factors genotype and environment; make a data.frame)
- predTest
Vector with predictions for the test set (to do: Add the factors genotype and environment; make a data.frame). To do: add estimated environmental main effects, not only predicted environmental main effects
- mu
the estimated overall (grand) mean
- envInfoTrain
The estimated environmental main effects, and the predicted effects, obtained when the former are regressed on the averaged indices, using penalized regression.
- envInfoTest
The predicted environmental main effects for the test environments, obtained from penalized regression using the estimated main effects for the training environments and the averaged indices.
- parGeno
data.frame containing the estimated genotypic main effects (first column) and sensitivities (subsequent columns)
- testAccuracyEnv
a data.frame with the accuracy (r) for each test environment
- trainAccuracyEnv
a data.frame with the accuracy (r) for each training environment
- trainAccuracyGeno
a data.frame with the accuracy (r) for each genotype, averaged over the training environments
- testAccuracyGeno
a data.frame with the accuracy (r) for each genotype, averaged over the test environments
- RMSEtrain
The root mean squared error on the training environments
- RMSEtest
The root mean squared error on the test environments
- Y
The name of the trait that was predicted, i.e. the column name in dat that was used
- G
The genotype label that was used, i.e. the argument G that was used
- E
The environment label that was used, i.e. the argument E that was used
- indices
The indices that were used, i.e. the argument indices that was used
- lambdaOpt
- pargeno
- quadratic
The quadratic option that was used
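The accuracy and RMSE elements listed above can be illustrated with a small base-R sketch. It assumes a hypothetical data frame `pred` holding observed and predicted values per genotype and environment. The column names (`obs`, `fit`, `Env`, `r`) are illustrative and need not match the layout of the returned data.frames.

```r
set.seed(2)

# Hypothetical observed and predicted values: 6 genotypes in 3 environments
pred <- data.frame(E = rep(paste0("E", 1:3), each = 6),
                   G = rep(paste0("G", 1:6), times = 3),
                   obs = rnorm(18))
pred$fit <- pred$obs + rnorm(18, sd = 0.3)

# Accuracy (r) per environment: correlation between observed and predicted
# values within each environment; use method = "spearman" for rank correlation
accEnv <- sapply(split(pred, pred$E),
                 function(d) cor(d$obs, d$fit, method = "pearson"))
accuracyEnv <- data.frame(Env = names(accEnv), r = unname(accEnv))

# Root mean squared error over all observations in the set
rmse <- sqrt(mean((pred$obs - pred$fit)^2))

accuracyEnv
rmse
```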