compute_score {JDCruncheR} | R Documentation |
Score calculation
Description
Calculates a score for each series of a quality report.
Usage
## S3 method for class 'QR_matrix'
compute_score(
x,
score_pond = c(qs_residual_sa_on_sa = 30, f_residual_sa_on_sa = 30, qs_residual_sa_on_i
= 20, f_residual_sa_on_i = 20, f_residual_td_on_sa = 30, f_residual_td_on_i = 20,
oos_mean = 15, oos_mse = 10, residuals_independency = 15, residuals_homoskedasticity
= 5, residuals_skewness = 5, m7 = 5, q_m2 = 5),
modalities = c("Good", "Uncertain", "", "Bad", "Severe"),
normalize_score_value,
na.rm = FALSE,
n_contrib_score,
conditional_indicator,
...
)
Arguments
x |
a QR_matrix or mQR_matrix object. |
score_pond |
the formula used to calculate the series score. |
modalities |
modalities ordered by importance in the score calculation (cf. details). |
normalize_score_value |
integer indicating the reference value for weights normalisation. If missing, weights will not be normalised. |
na.rm |
logical indicating whether missing values must be ignored when calculating the score. |
n_contrib_score |
integer indicating the number of variables to create in the quality report's values matrix to store the names of the variables that contribute the most to the score (cf. details). |
conditional_indicator |
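a list of 3-element sub-lists, each with the elements "indicator", "conditions" and "conditions_modalities" (cf. details). |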
... |
other unused parameters. |
Details
The function compute_score calculates a score from the modalities of a quality report: each modality is assigned a weight that depends on the parameter modalities.
The default parameter is c("Good", "Uncertain", "Bad", "Severe"), and the associated weights are 0, 1, 2 and 3 respectively.
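The mapping from modalities to weights described above can be sketched in base R. This is an illustration only, not the package's internal code: the helper name modality_to_grade is hypothetical, and the four-modality vector is the one quoted in this paragraph.

```r
# Hypothetical helper: the grade of a modality is its position in
# `modalities`, minus 1 (so the first modality is worth 0).
modality_to_grade <- function(x,
                              modalities = c("Good", "Uncertain", "Bad", "Severe")) {
  match(x, modalities) - 1L
}

# Missing modalities stay missing.
grades <- modality_to_grade(c("Good", "Severe", "Uncertain", NA))
grades
```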
The score calculation is based on the score_pond parameter, a named integer vector containing the weights to apply to the variables of the modalities matrix.
For example, with score_pond = c(qs_residual_sa_on_sa = 10, f_residual_td_on_sa = 5), the score will be based on the variables qs_residual_sa_on_sa and f_residual_td_on_sa:
the qs_residual_sa_on_sa grades will be multiplied by 10 and the f_residual_td_on_sa grades by 5.
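As a worked illustration of this weighting, the sketch below computes the weighted sum by hand for two made-up series. The series names and grades are invented for the example, and the code only mirrors the arithmetic described here; it is not taken from the package.

```r
# Illustrative grades (0 = "Good" ... 3 = "Severe") for two made-up series.
grades <- rbind(
  series_a = c(qs_residual_sa_on_sa = 0, f_residual_td_on_sa = 2),
  series_b = c(qs_residual_sa_on_sa = 3, f_residual_td_on_sa = 1)
)
score_pond <- c(qs_residual_sa_on_sa = 10, f_residual_td_on_sa = 5)

# Weighted sum of the grades, restricted to the variables named in score_pond.
score <- drop(grades[, names(score_pond), drop = FALSE] %*% score_pond)
score
```

Here series_a gets 0 * 10 + 2 * 5 and series_b gets 3 * 10 + 1 * 5.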
To ignore missing values when calculating a score, use the parameter na.rm = TRUE.
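The effect of na.rm can be pictured with base R's sum(). The sketch assumes that a missing grade propagates to the score in the same way as it does in sum(); this is an assumption about the behaviour, not a description of the package's code.

```r
# One series with a missing grade for the second variable.
grades <- c(qs_residual_sa_on_sa = 3, f_residual_td_on_sa = NA)
score_pond <- c(qs_residual_sa_on_sa = 10, f_residual_td_on_sa = 5)

# Assumed behaviour: the missing grade makes the whole score missing ...
score_default <- sum(grades * score_pond, na.rm = FALSE)
# ... unless missing values are explicitly ignored.
score_na_rm <- sum(grades * score_pond, na.rm = TRUE)
```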
The parameter normalize_score_value can be used to normalise the scores. For example, to have all scores between 0 and 20, specify normalize_score_value = 20.
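One plausible way to obtain scores between 0 and the reference value is to divide the raw score by the highest attainable raw score. The sketch below shows that rescaling; the formula is an assumption made for illustration, and the package may normalise differently.

```r
score_pond <- c(qs_residual_sa_on_sa = 10, f_residual_td_on_sa = 5)
max_grade <- 3   # grade of "Severe" with the four modalities quoted in Details
raw_score <- 35  # e.g. 3 * 10 + 1 * 5

# Assumed rescaling: share of the highest attainable raw score,
# multiplied by the reference value.
normalize_score_value <- 20
normalized <- raw_score / (sum(score_pond) * max_grade) * normalize_score_value
normalized
```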
When the parameter n_contrib_score is used, n_contrib_score new variables are added to the quality report's values matrix.
These new variables store the names of the variables that contribute the most to the series score.
For example, n_contrib_score = 3 will add to the values matrix three variables naming the three greatest contributors to the score.
The new variables are named i_highest_score, where i is the rank in terms of contribution to the score (1_highest_score contains the name of the greatest contributor, 2_highest_score the second greatest, etc.).
Only the variables that have a non-zero contribution to the score are taken into account: if a series score is 0, all i_highest_score variables will be empty.
If a series score is positive only because of the m7 statistic, 1_highest_score will have a value of "m7" for this series and the other i_highest_score variables will be empty.
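The ranking of contributors described above can be sketched as follows. The helper top_contributors is hypothetical and written only to reproduce the behaviour stated in this paragraph (non-zero contributions only, empty strings for unused ranks); it is not the package's implementation.

```r
# Hypothetical helper: names of the n largest non-zero contributions
# to the score of one series, padded with "" when fewer than n exist.
top_contributors <- function(grades, score_pond, n) {
  contrib <- grades[names(score_pond)] * score_pond
  contrib <- contrib[!is.na(contrib) & contrib > 0]
  ranked <- names(sort(contrib, decreasing = TRUE))
  out <- c(ranked, rep("", n))[seq_len(n)]
  names(out) <- paste0(seq_len(n), "_highest_score")
  out
}

# A series whose score is positive only because of m7.
grades <- c(m7 = 2, q_m2 = 0, oos_mean = 0)
score_pond <- c(m7 = 5, q_m2 = 5, oos_mean = 15)
top <- top_contributors(grades, score_pond, n = 2)
top
```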
Some indicators are only relevant under certain conditions. For example, the homoscedasticity test is only valid when the residuals are independent, and the normality tests only when the residuals are both independent and homoscedastic.
In these cases, the parameter conditional_indicator can be useful, since it reduces the weight of some variables to 1 when certain conditions are met.
conditional_indicator is a list of 3-element sub-lists:
"indicator": the variable whose weight will be conditionally changed
"conditions": the variables used to define the conditions
"conditions_modalities": the modalities that must be observed to trigger the weight change
For example,
conditional_indicator = list(list(indicator = "residuals_skewness", conditions = c("residuals_independency", "residuals_homoskedasticity"), conditions_modalities = c("Bad","Severe")))
reduces to 1 the weight of the variable "residuals_skewness" when the modality of the independence test ("residuals_independency") or of the homoscedasticity test ("residuals_homoskedasticity") is "Bad" or "Severe".
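The rule expressed by conditional_indicator can be sketched in base R for a single series. The helper apply_conditional_weights is hypothetical, and the modalities of the series are invented; the code only illustrates the logic described above (the weight drops to 1 as soon as one of the condition variables has one of the listed modalities) and is not the package's implementation.

```r
# Hypothetical helper: weights to use for one series, given its
# modalities and a conditional_indicator specification.
apply_conditional_weights <- function(score_pond, series_modalities,
                                      conditional_indicator) {
  for (rule in conditional_indicator) {
    triggered <- any(
      series_modalities[rule$conditions] %in% rule$conditions_modalities
    )
    if (triggered && rule$indicator %in% names(score_pond)) {
      score_pond[rule$indicator] <- 1
    }
  }
  score_pond
}

conditional_indicator <- list(list(
  indicator = "residuals_skewness",
  conditions = c("residuals_independency", "residuals_homoskedasticity"),
  conditions_modalities = c("Bad", "Severe")
))
score_pond <- c(residuals_independency = 15, residuals_homoskedasticity = 5,
                residuals_skewness = 5)
# Invented modalities: the homoscedasticity test is "Severe".
series_modalities <- c(residuals_independency = "Good",
                       residuals_homoskedasticity = "Severe",
                       residuals_skewness = "Bad")

adjusted <- apply_conditional_weights(score_pond, series_modalities,
                                      conditional_indicator)
adjusted
```

In this example only the weight of residuals_skewness changes, from 5 to 1; the other weights are left untouched.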
Value
a QR_matrix or mQR_matrix object.
Examples
# Path to the demetra_m.csv file shipped with the package
demetra_path <- file.path(
system.file("extdata", package = "JDCruncheR"),
"WS/ws_ipi/Output/SAProcessing-1",
"demetra_m.csv"
)
# Extract the quality report from the demetra_m file
QR <- extract_QR(demetra_path)
# Compute the score
QR <- compute_score(QR, n_contrib_score = 2)
print(QR)
# Extract the score column of the modalities matrix:
QR$modalities$score