| score_simple.cluster_pairs {reclin2} | R Documentation |
Score pairs based on a number of comparison vectors
Description
Score pairs based on a number of comparison vectors
Usage
## S3 method for class 'cluster_pairs'
score_simple(
  pairs,
  variable,
  on,
  w1 = 1,
  w0 = 0,
  wna = 0,
  new_name = NULL,
  ...
)
score_simple(pairs, variable, on, w1 = 1, w0 = 0, wna = 0, ...)
## S3 method for class 'pairs'
score_simple(
  pairs,
  variable,
  on,
  w1 = 1,
  w0 = 0,
  wna = 0,
  inplace = FALSE,
  ...
)
Arguments
pairs |
a pairs object, such as generated by pair_blocking. |
variable |
the name of the new variable to create in pairs. This will be a
numeric variable containing the total score of each pair. |
on |
character vector of variables on which the score should be based. |
w1 |
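a vector or list with weights for agreement for each of the
variables. It can be a numeric vector of length 1, in which case the
same weight is used for all variables; a numeric vector of length equal to
the length of on; or a named vector or list whose names are variables in on,
in which case omitted variables get the default weight (1 for w1, 0 for w0
and wna). |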
w0 |
a vector or list with weights for non-agreement for each of the
variables. See details for more information. For the format see w1. |
wna |
a vector or list with weights for missing values (NA) for each of the
variables. See details for more information. For the format see w1. |
new_name |
name of new object to assign the pairs to on the cluster nodes. |
... |
ignored |
inplace |
logical indicating whether pairs should be modified in place. |
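The weight formats accepted by w1, w0 and wna can be illustrated with a small base R sketch. The helper expand_weights() below is hypothetical and is not part of reclin2; it only mirrors the behaviour described in the examples, where elements omitted from a named w1 get a weight of 1 and those omitted from w0 and wna get a weight of 0.

```r
# Hypothetical helper (not a reclin2 function): expand a weight
# specification to one numeric weight per variable in `on`.
expand_weights <- function(w, on, default) {
  if (is.null(names(w))) {
    w <- as.numeric(unlist(w))
    # A single unnamed weight is reused for every variable
    if (length(w) == 1) w <- rep(w, length(on))
    stopifnot(length(w) == length(on))
    return(setNames(w, on))
  }
  # Named vector or list: start from the default, then fill in given names
  stopifnot(all(names(w) %in% on))
  out <- setNames(rep(default, length(on)), on)
  out[names(w)] <- unlist(w)
  out
}

on <- c("firstname", "lastname", "sex")
w_single <- expand_weights(2, on, default = 1)
w_named  <- expand_weights(c(firstname = 2, lastname = 3), on, default = 1)
w_list   <- expand_weights(list(firstname = -1, lastname = -0.5), on, default = 0)
print(w_named)
```

Under these assumptions, sex falls back to the default in both the named vector and the named list case.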
Details
The individual contribution of a variable x to the total score is
given by x * w1 + (1-x) * w0 in case of non-NA values and
wna in case of NA. This assumes that the value 1 corresponds
to complete agreement and the value 0 to complete non-agreement. In case of
complete agreement a variable contributes w1 to the total score and in
case of complete non-agreement it contributes w0 to the total score.
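The scoring rule above can be sketched in base R without reclin2. The helper contribution() and the comparison values below are hypothetical and only illustrate the formula; they are not the package's implementation.

```r
# Hypothetical helper: contribution of one comparison variable x to the
# total score. x is a similarity in [0, 1] (or TRUE/FALSE); NA marks a
# comparison that could not be made.
contribution <- function(x, w1 = 1, w0 = 0, wna = 0) {
  x <- as.numeric(x)
  ifelse(is.na(x), wna, x * w1 + (1 - x) * w0)
}

# Made-up comparison vectors for three pairs
firstname <- c(1, 0, NA)
lastname  <- c(1, 1, 0.5)

# Total score is the sum of the per-variable contributions
score <- contribution(firstname, w1 = 2, w0 = -1, wna = 0) +
  contribution(lastname, w1 = 3, w0 = -0.5, wna = 0)
print(score)
```

With these values, a partial agreement of 0.5 on lastname contributes 0.5 * 3 + 0.5 * (-0.5) = 1.25, between the w0 and w1 weights.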
Value
Returns the data.table pairs with the column variable added in
case of score_simple.pairs.
In case of score_simple.cluster_pairs, score_simple.pairs is called on
each cluster node and the resulting pairs are assigned to new_name in
the environment reclin_env. When new_name is not given (or
equal to NULL) the original pairs on the nodes are overwritten.
Examples
data("linkexample1", "linkexample2")
pairs <- pair_blocking(linkexample1, linkexample2, "postcode")
compare_pairs(pairs, on = c("firstname", "lastname", "sex"), inplace = TRUE)
score_simple(pairs, "score", on = c("firstname", "lastname", "sex"))
# Change the default weights
score_simple(pairs, "score", on = c("firstname", "lastname", "sex"),
  w1 = 2, w0 = -1, wna = NA)
# Use a named vector; omitted elements from w1 get a weight of 1; those from
# w0 and wna a weight of 0.
score_simple(pairs, "score", on = c("firstname", "lastname", "sex"),
  w1 = c("firstname" = 2, "lastname" = 3),
  w0 = c("firstname" = -1, "lastname" = -0.5))
# Use a named list; omitted elements from w1 get a weight of 1; those from
# w0 and wna a weight of 0.
score_simple(pairs, "score", on = c("firstname", "lastname", "sex"),
  w1 = list("firstname" = 2, "lastname" = 3),
  w0 = list("firstname" = -1, "lastname" = -0.5))