codacore {codacore}    R Documentation

codacore

Description

This function implements the codacore algorithm described by Gordon-Rodriguez et al. (2021) (https://doi.org/10.1101/2021.02.11.430695).

Usage

codacore(
  x,
  y,
  logRatioType = "balances",
  objective = NULL,
  lambda = 1,
  offset = NULL,
  shrinkage = 1,
  maxBaseLearners = 5,
  optParams = list(),
  cvParams = list(),
  verbose = FALSE,
  overlap = TRUE,
  fast = TRUE
)

Arguments

x

A data.frame or matrix of the compositional predictor variables.

y

A data.frame, matrix or vector of the response.

logRatioType

A string indicating whether to use "balances" or "amalgamations". The aliases "balance", "B" and "ILR" are accepted for balances, and "amalgam", "A" and "SLR" for amalgamations. Note that the current implementation for balances is not strictly an ILR, but rather a collection of balances that are possibly non-orthogonal in the Aitchison sense.

objective

A string indicating "binary classification" or "regression". Defaults to NULL, in which case the objective is inferred from the values in y.

lambda

A numeric. Corresponds to the "lambda-SE" rule: it sets the regularization strength used by the algorithm when deciding how to harden each ratio. Larger values tend to yield fewer, sparser ratios.

offset

A numeric vector of the same length as y. Works similarly to the offset in a glm.

shrinkage

A numeric. Shrinkage factor applied to each base learner. Defaults to 1.0, i.e., no shrinkage applied.

maxBaseLearners

An integer. The maximum number of log-ratios that the model will learn before stopping. Automatic stopping based on seRule may occur sooner.

optParams

A list of named parameters for the optimization of the continuous relaxation. Empty by default. Users can override as few or as many of the defaults as desired. Recognized names are adaptiveLR (the learning rate under the adaptive training scheme), momentum (in the gradient-descent sense), epochs (the number of gradient-descent epochs), batchSize (the number of observations per minibatch; by default the entire dataset), and vanillaLR (the learning rate used when the user does *not* want the adaptive scheme of adaptiveLR; setting it risks optimization issues).

cvParams

A list of named parameters for the "hardening" procedure, which uses cross-validation. Recognized names are numFolds (the number of folds, default 5) and maxCutoffs (the number of candidate values of the cutoff 'c' tested during cross-validation, default 20, meaning that codacore can find log-ratios with up to 21 components).

verbose

A boolean. Toggles whether to display intermediate steps.

overlap

A boolean. Toggles whether successive log-ratios found by CoDaCoRe may contain repeated input variables. TRUE by default. Changing to FALSE implies that the log-ratios obtained by CoDaCoRe will become orthogonal in the Aitchison sense, analogously to the isometric-log-ratio transformation, while losing a small amount of model flexibility.

fast

A boolean. Whether to run in fast or slow mode. TRUE by default. Slow mode takes roughly 5 times the computation time, but may help identify slightly more accurate log-ratios.
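
The two values of logRatioType, and the orthogonality mentioned under overlap, can be illustrated by hand in base R. This is only a sketch under stated assumptions: the helpers balance, amalgamation and contrast and the toy matrix are hypothetical and not part of the codacore API, and the numerator and denominator parts are fixed manually here, whereas codacore learns them from the data.

```r
# Toy composition: 3 samples, 4 strictly positive parts (made-up data).
x <- matrix(c(1, 8, 2, 2,
              2, 2, 2, 2,
              8, 4, 2, 1),
            nrow = 3, byrow = TRUE,
            dimnames = list(NULL, c("a", "b", "c", "d")))

# Balance: log-ratio of the geometric means of the two groups of parts.
balance <- function(x, num, den) {
  rowMeans(log(x[, num, drop = FALSE])) - rowMeans(log(x[, den, drop = FALSE]))
}

# Amalgamation: log-ratio of the sums of the two groups of parts.
amalgamation <- function(x, num, den) {
  log(rowSums(x[, num, drop = FALSE])) - log(rowSums(x[, den, drop = FALSE]))
}

b <- balance(x, num = c("a", "b"), den = c("c", "d"))
s <- amalgamation(x, num = c("a", "b"), den = c("c", "d"))

# A balance is a log-contrast: coefficients +1/|num| on the numerator parts,
# -1/|den| on the denominator parts, and 0 elsewhere (they sum to zero).
contrast <- function(parts, num, den) {
  v <- setNames(numeric(length(parts)), parts)
  v[num] <- 1 / length(num)
  v[den] <- -1 / length(den)
  v
}

parts <- colnames(x)
b_check <- drop(log(x) %*% contrast(parts, c("a", "b"), c("c", "d")))

# Two balances that share no parts have orthogonal coefficient vectors;
# balances that reuse a part generally do not.
r1          <- contrast(parts, num = "a", den = "b")
r2_disjoint <- contrast(parts, num = "c", den = "d")
r2_overlap  <- contrast(parts, num = "a", den = "c")
dot_disjoint <- sum(r1 * r2_disjoint)
dot_overlap  <- sum(r1 * r2_overlap)
```

Here b and b_check agree, dot_disjoint is zero, and dot_overlap is not. Both log-ratio types are unchanged when a sample is rescaled by a constant, which is why they suit compositional data.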

Value

A codacore object.

Examples

## Not run: 
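data("Crohn")
# The last column holds the response; the remaining columns are the composition.
x <- Crohn[, -ncol(Crohn)]
y <- Crohn[, ncol(Crohn)]
# Add a pseudo-count so that every part is strictly positive before taking logs.
x <- x + 1
model <- codacore(x, y)
print(model)
plot(model)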

## End(Not run)
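
The effect of the lambda argument can be pictured with a generic "lambda-SE" selection rule, analogous to the familiar one-standard-error rule in cross-validation. The sketch below is an assumption-laden illustration rather than the package's internal procedure: the function pick_lambda_se is hypothetical, the cross-validated losses are made up, and the candidates are assumed to be ordered from sparsest to densest.

```r
# Hypothetical cross-validated losses for 5 candidate models, ordered from
# sparsest (1) to densest (5). The numbers are invented for illustration.
cv_mean <- c(0.70, 0.62, 0.58, 0.57, 0.575)
cv_se   <- c(0.03, 0.03, 0.03, 0.03, 0.03)

# Generic lambda-SE rule: choose the sparsest candidate whose mean loss lies
# within lambda standard errors of the best mean loss.
pick_lambda_se <- function(cv_mean, cv_se, lambda) {
  best <- which.min(cv_mean)
  threshold <- cv_mean[best] + lambda * cv_se[best]
  min(which(cv_mean <= threshold))
}

choice0 <- pick_lambda_se(cv_mean, cv_se, lambda = 0)  # best-scoring candidate
choice1 <- pick_lambda_se(cv_mean, cv_se, lambda = 1)  # a sparser candidate
choice2 <- pick_lambda_se(cv_mean, cv_se, lambda = 2)  # sparser still
```

With these made-up numbers, increasing lambda moves the choice toward sparser candidates, in line with the description of lambda above.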


[Package codacore version 0.0.4 Index]