R: Function to find the optimal RANKS score threshold

find.optimal.thresh.cv {RANKS}

R Documentation

Function to find the optimal RANKS score threshold

Description

Function to find the optimal quantile alpha and corresponding threshold by cross-validation with a kernel-based score method.

Usage

find.optimal.thresh.cv(K, ind.pos, ind.non.pos, m = 5, 
alpha = seq(from = 0.05, to = 0.6, by = 0.05), init.seed = NULL, 
opt.fun = compute.F, fun = KNN.score, ...)

Arguments

`K`	matrix. Kernel matrix or any valid symmetric matrix
`ind.pos`	indices of the positive examples, i.e. the indices of the rows of K corresponding to the positive examples of the training set.
`ind.non.pos`	indices of the non-positive examples, i.e. the indices of the rows of K corresponding to the non-positive examples of the training set.
`m`	number of folds (default: 5)
`alpha`	vector of the quantiles to be tested
`init.seed`	initial seed for the random generator. If NULL (default), no initialization is performed
`opt.fun`	function implementing the metric used to select the optimal threshold (default: compute.F). Available functions: compute.F (F-score, default) and compute.acc (accuracy). In principle, any function with two arguments, the vector of predicted labels and the vector of true labels, can be used.
`fun`	function. It must be a kernel-based score method (default KNN.score)
`...`	optional arguments for the function fun
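
As an illustration of the `opt.fun` argument, the sketch below defines a custom metric with the two-argument form described above (predicted labels first, true labels second). The name compute.bal.acc is hypothetical and not part of RANKS; the sketch also assumes that labels are coded as 0/1 and that the metric returns a single number where larger is better.

```r
# Hypothetical custom metric (not part of RANKS): balanced accuracy.
# Assumes pred and labels are 0/1 vectors of equal length (1 = positive).
compute.bal.acc <- function(pred, labels) {
  pos <- labels == 1
  neg <- labels == 0
  # Fraction of positives predicted as positive
  sens <- if (any(pos)) mean(pred[pos] == 1) else NA_real_
  # Fraction of non-positives predicted as non-positive
  spec <- if (any(neg)) mean(pred[neg] == 0) else NA_real_
  mean(c(sens, spec))
}

# Toy check on made-up vectors
pred   <- c(1, 1, 0, 0, 1, 0)
labels <- c(1, 0, 0, 0, 1, 1)
bal.acc <- compute.bal.acc(pred, labels)
bal.acc
```

Under these assumptions, such a function could be passed as `opt.fun = compute.bal.acc` in the call to find.optimal.thresh.cv.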

Details

The optimal quantile alpha and the corresponding threshold are selected by cross-validation with a kernel-based score method. Optimality is measured with respect to a specific metric (default: F-score). This function is used by multiple.ker.score.thresh.cv, ker.score.classifier.holdout and ker.score.classifier.cv.
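
The selection step can be sketched in plain R as follows. This is not the RANKS implementation: it skips the cross-validated scoring and starts from made-up scores, and it assumes that each candidate threshold is the alpha-quantile of the scores of the positive examples and that an example is predicted positive when its score is at least the threshold. The helper f.score is a hypothetical stand-in for compute.F.

```r
# Made-up scores standing in for the cross-validated scores
pos.scores <- c(0.9, 0.8, 0.75, 0.6, 0.4, 0.3)
neg.scores <- c(0.7, 0.5, 0.35, 0.2, 0.2, 0.15, 0.1, 0.1, 0.05, 0.05)
scores <- c(pos.scores, neg.scores)
labels <- c(rep(1, length(pos.scores)), rep(0, length(neg.scores)))

# Hypothetical stand-in for compute.F: F-score from 0/1 vectors
f.score <- function(pred, labels) {
  tp <- sum(pred == 1 & labels == 1)
  fp <- sum(pred == 1 & labels == 0)
  fn <- sum(pred == 0 & labels == 1)
  if (tp == 0) return(0)
  prec <- tp / (tp + fp)
  rec  <- tp / (tp + fn)
  2 * prec * rec / (prec + rec)
}

# Candidate quantiles, same grid as the default alpha argument
alpha <- seq(from = 0.05, to = 0.6, by = 0.05)

# Assumption: thresholds are quantiles of the positive scores
thresh <- quantile(pos.scores, probs = alpha, names = FALSE)

# Evaluate the metric for every candidate threshold
metric <- vapply(thresh, function(t) {
  f.score(as.integer(scores >= t), labels)
}, numeric(1))

# Keep the quantile and threshold with the best metric value
best <- which.max(metric)
sel <- list(alpha = alpha[best], thresh = thresh[best])
sel
```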

Value

A list with 3 elements:

`alpha`	quantile corresponding to the best value of the metric opt.fun (F-score by default)
`thresh`	threshold corresponding to the best value of the metric opt.fun (F-score by default)
`pos.scores`	scores of the positive elements computed through CV
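
A minimal sketch of how the returned list might be used to label new scores. The list res below is a hand-made stand-in with invented values, not actual output of find.optimal.thresh.cv, and the rule "score at least thresh means positive" is an assumption about how the threshold is applied.

```r
# Hand-made stand-in for the returned list (values are invented)
res <- list(alpha = 0.25,
            thresh = 0.42,
            pos.scores = c(0.9, 0.55, 0.42, 0.3))

# Hypothetical scores of three new examples
new.scores <- c(a = 0.8, b = 0.42, c = 0.1)

# Assumption: an example is predicted positive if its score reaches thresh
pred <- as.integer(new.scores >= res$thresh)
names(pred) <- names(new.scores)
pred

# Fraction of cross-validated positive scores at or above the threshold
frac.pos <- mean(res$pos.scores >= res$thresh)
frac.pos
```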

Examples

# Finding the optimal threshold in the Tanimoto chemical structure similarity network
# between 1253 DrugBank drugs for the prediction of the DrugBank category Penicillins,
# using the KNN score with the random walk kernel
library(RANKS)        # provides rw.kernel and find.optimal.thresh.cv
library(bionetdata)   # provides DD.chem.data and DrugBank.Cat
data(DD.chem.data)
data(DrugBank.Cat)
# Random walk kernel computed from the similarity network
K <- rw.kernel(DD.chem.data)
labels <- DrugBank.Cat[, "Penicillins"]
ind.pos <- which(labels == 1)
ind.non.pos <- which(labels == 0)
# Cross-validated search with the defaults (m = 5, F-score, KNN.score)
res <- find.optimal.thresh.cv(K, ind.pos, ind.non.pos)
res

[Package RANKS version 1.1 Index]