mmdata {precrec} | R Documentation |
Reformat input data for performance evaluation calculation
Description
The mmdata function takes predicted scores and labels and returns an mdat object.
The evalmod function takes an mdat object as input data to calculate evaluation measures.
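The two-step workflow described above can be sketched as follows. This is a minimal illustration, not part of the original help page: the toy scores and labels are made up, and the call to auc assumes the precrec package is installed and loaded.

```r
library(precrec)

# Toy data (hypothetical): four predicted scores and their observed labels
scores <- c(0.1, 0.4, 0.35, 0.8)
labels <- c(0, 0, 1, 1)

# Step 1: reformat the raw vectors into an mdat object
mdat <- mmdata(scores, labels)

# Step 2: pass the mdat object to evalmod, which calculates
# ROC and precision-recall curves in its default mode
curves <- evalmod(mdat)

# Summarise the curves as areas under the curve
aucs <- auc(curves)
aucs
```

Keeping the formatting step separate from the evaluation step means the same mdat object can be reused for several evalmod calls.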
Usage
mmdata(
scores,
labels,
modnames = NULL,
dsids = NULL,
posclass = NULL,
na_worst = TRUE,
ties_method = "equiv",
expd_first = NULL,
mode = "rocprc",
nfold_df = NULL,
score_cols = NULL,
lab_col = NULL,
fold_col = NULL,
...
)
Arguments
scores |
A numeric dataset of predicted scores. It can be a vector,
a matrix, an array, a data frame, or a list. The |
labels |
A numeric, character, logical, or factor dataset
of observed labels. It can be a vector, a matrix, an array,
a data frame, or a list. The |
modnames |
A character vector for the names of the models.
The |
dsids |
A numeric vector for test dataset IDs.
The |
posclass |
A scalar value to specify the label of positives
in |
na_worst |
A Boolean value for controlling the treatment of NAs
in
|
ties_method |
A string for controlling ties in
|
expd_first |
A string indicating which of the two variables, model names or test dataset IDs, should be expanded first when they are automatically generated.
|
mode |
A string that specifies the types of evaluation measures
that the
|
nfold_df |
A data frame that contains at least one score column, a label column, and a fold column. |
score_cols |
A character/numeric vector that specifies score columns
of |
lab_col |
A number/string that specifies the label column
of |
fold_col |
A number/string that specifies the fold column
of |
... |
Not used by this method. |
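The following sketch shows how posclass, na_worst, and ties_method might be combined in one call. It is an illustration under stated assumptions rather than an authoritative reference: the data are made up, na_worst = TRUE is assumed to rank missing scores as the worst scores, and "first" is assumed to be an accepted ties_method value that orders tied scores by appearance.

```r
library(precrec)

# Hypothetical scores with one missing value and one tie (0.3),
# and labels given as strings rather than 0/1
scores <- c(0.9, 0.8, NA, 0.3, 0.3, 0.1)
labels <- c("pos", "pos", "neg", "pos", "neg", "neg")

# Treat "pos" as the positive class, rank the NA as the worst score,
# and order the tied scores by appearance (assumed "first" behaviour)
md_first <- mmdata(scores, labels,
  posclass = "pos",
  na_worst = TRUE,
  ties_method = "first"
)

# Same data with the default tie handling, where tied scores
# are ranked equivalently
md_equiv <- mmdata(scores, labels,
  posclass = "pos",
  na_worst = TRUE,
  ties_method = "equiv"
)

md_first
md_equiv
```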
Value
The mmdata function returns an mdat object that contains formatted labels and score ranks.
The object can be used as input data for the evalmod function.
See Also
evalmod for calculating evaluation measures.
join_scores and join_labels for formatting scores and labels with multiple datasets.
format_nfold for creating an n-fold cross-validation dataset from a data frame.
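As a rough sketch of how format_nfold relates to mmdata, the data frame can be split first and the result passed on by hand. This assumes that format_nfold returns a list with scores and labels elements, which is my reading of its help page rather than something stated here.

```r
library(precrec)

# Example cross-validation data shipped with precrec
data(M2N50F5)

# Split the data frame into per-fold scores and labels
nfold_list <- format_nfold(
  nfold_df = M2N50F5, score_cols = c(1, 2),
  lab_col = 3, fold_col = 4
)

# Pass the formatted scores and labels to mmdata
# (assumes the list elements are named "scores" and "labels")
cv_mdat <- mmdata(nfold_list$scores, nfold_list$labels,
  modnames = c("m1", "m2"), dsids = 1:5
)
cv_mdat
```

In most cases, passing nfold_df directly to mmdata is simpler. The explicit route is mainly useful for inspecting the intermediate list.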
Examples
##################################################
### Single model & single test dataset
###
## Load a dataset with 10 positives and 10 negatives
data(P10N10)
## Generate mdat object
ssmdat1 <- mmdata(P10N10$scores, P10N10$labels)
ssmdat1
ssmdat2 <- mmdata(1:8, sample(c(0, 1), 8, replace = TRUE))
ssmdat2
##################################################
### Multiple models & single test dataset
###
## Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(1, 100, 100, "all")
## Multiple models & single test dataset
msmdat1 <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]]
)
msmdat1
## Use join_scores and join_labels
s1 <- c(1, 2, 3, 4)
s2 <- c(5, 6, 7, 8)
scores <- join_scores(s1, s2)
l1 <- c(1, 0, 1, 1)
l2 <- c(1, 0, 1, 1)
labels <- join_labels(l1, l2)
msmdat2 <- mmdata(scores, labels, modnames = c("ms1", "ms2"))
msmdat2
##################################################
### Single model & multiple test datasets
###
## Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(10, 100, 100, "good_er")
## Single model & multiple test datasets
smmdat <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)
smmdat
##################################################
### Multiple models & multiple test datasets
###
## Create sample datasets with 100 positives and 100 negatives
samps <- create_sim_samples(10, 100, 100, "all")
## Multiple models & multiple test datasets
mmmdat <- mmdata(samps[["scores"]], samps[["labels"]],
  modnames = samps[["modnames"]],
  dsids = samps[["dsids"]]
)
mmmdat
##################################################
### N-fold cross validation datasets
###
## Load test data
data(M2N50F5)
head(M2N50F5)
## Specify necessary columns to create mdat
cvdat1 <- mmdata(
  nfold_df = M2N50F5, score_cols = c(1, 2),
  lab_col = 3, fold_col = 4,
  modnames = c("m1", "m2"), dsids = 1:5
)
cvdat1
## Use column names
cvdat2 <- mmdata(
  nfold_df = M2N50F5, score_cols = c("score1", "score2"),
  lab_col = "label", fold_col = "fold",
  modnames = c("m1", "m2"), dsids = 1:5
)
cvdat2
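A possible next step for the cross-validation example is to evaluate the resulting mdat object. The sketch below rebuilds the object so that it runs on its own. The variable names cvdat, cvcurves, and cv_aucs are illustrative, and the expected row count assumes auc reports one row per model, dataset, and curve type.

```r
library(precrec)

# Rebuild the cross-validation mdat object from the packaged data
data(M2N50F5)
cvdat <- mmdata(
  nfold_df = M2N50F5, score_cols = c("score1", "score2"),
  lab_col = "label", fold_col = "fold",
  modnames = c("m1", "m2"), dsids = 1:5
)

# Calculate ROC and precision-recall curves for every model and fold
cvcurves <- evalmod(cvdat)

# One AUC per model, fold, and curve type
cv_aucs <- auc(cvcurves)
head(cv_aucs)
```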