build_tier {fedmatch} | R Documentation |
Build settings for a tier
Description
build_tier_settings
is a convenient way to make the proper list for the
tier_list
argument of tier_match
Each vector in build_score_settings
should be the same length, and each position (first, second, third, etc.)
corresponds to one variable to score on.
Usage
build_tier(
by.x = NULL,
by.y = NULL,
check_merge = NULL,
match_type = NULL,
fuzzy_settings = build_fuzzy_settings(),
score_settings = NULL,
filter = NULL,
filter.args = NULL,
evaluate = NULL,
evaluate.args = NULL,
clean_settings = build_clean_settings(),
clean = NULL,
sequential_words = NULL,
allow.cartesian = FALSE,
multivar_settings = build_multivar_settings()
)
Arguments
by.x |
character string. Variable to merge on in data1. See |
by.y |
character string. Variable to merge on in data2. See |
check_merge |
logical. Checks that your unique_keys are indeed unique. |
match_type |
string. If 'exact', match is exact, if 'fuzzy', match is
fuzzy. If 'multivar,' match is multivar-based. See |
fuzzy_settings |
additional arguments for amatch, to be used if match_type = 'fuzzy'. Suggested defaults provided. (see amatch, method='jw') |
score_settings |
list. Score settings for post-hoc matchscores. |
filter |
function or numeric. Filters a merged data1-data2 dataset. If a function, should take in a data.frame (data1 and data2 merged by name1 and name2) and spit out a trimmed version of the data.frame (fewer rows). Think of this function as applying other conditions to matches, other than a match by name. The first argument of filter should be the data.frame. If numeric, will drop all observations with a matchscore lower than or equal to filter. |
filter.args |
list. Arguments passed to filter, if a function |
evaluate |
Function to evaluate merge_plus output. |
evaluate.args |
list. Arguments passed to evaluate |
clean_settings |
list. Settings for string cleaning. See |
clean |
Boolean, T/F, whether or not to clean strings prior to the match. |
sequential_words |
data.table of words in the same format of the common_words argument in |
allow.cartesian |
whether or not to allow many-many matches, see data.table::merge() |
multivar_settings |
list of settings to go to the multivar match if match_type
== 'multivar'. See |
Value
a list containing 1 tier for the 'tier_list' argument of tier_match
.