clean_admin_names {epiCleanr} | R Documentation |
Clean and Match Administrative Names
Description
This function takes administrative names and cleans them using various matching and string distance algorithms. It can also match the cleaned names against a base list provided by the user or fetched from 'GeoNames', which is an official repository of standard spellings of all foreign geographic names.
Usage
clean_admin_names(
admin_names_to_clean,
country_code,
admin_level = "adm2",
user_base_admin_names = NULL,
user_base_only = FALSE,
report_mode = FALSE
)
Arguments
admin_names_to_clean |
A character vector of administrative names to clean. |
country_code |
Used if 'use_get_admin_names' is TRUE. A character string
or numerical value of the country code (e.g., "KE"). This can be
in various formats such as country name, ISO codes, UN codes, etc.,
see |
admin_level |
A character string indicating the administrative level (e.g., "adm2"). |
user_base_admin_names |
A character vector of administrative names that the user would like to use as a reference. This is not necessary; downloaded 'GeoNames' data will be used if it is missing. |
user_base_only |
A logical indicating whether to use only the user-provided base administrative names ('user_base_admin_names') for matching. If TRUE, 'country_code' and 'admin_names_to_clean' are not required. Default is FALSE. |
report_mode |
A logical indicating whether to return a detailed report. Default is FALSE. |
Value
If 'report_mode' is set to TRUE, a data frame containing the original admin names and the matched and cleaned admin names, with information on the source of the data used for cleaning, including the algorithm used. Otherwise, a cleaned list of names is returned.
See Also
countrycode::codelist()
for the full list of codes and naming conventions.
Examples
# Example with country code
base_names <- c(
"Paris", "Marseille", "Lyon",
"Toulouse", "Nice", "Nantes", "Strasbourg",
"Montpellier", "Bordeaux", "Lille"
)
unclean_names <- c(
"Pariis", "Marseill", "Lyone",
"Toulous", "Niice", "Nantees", "Strasbourgh",
"Montpeelier", "Bordeuax", "Lilie"
)
france_new <- clean_admin_names(
country_code = "Fr",
user_base_admin_names = base_names,
admin_names_to_clean = unclean_names
)
print(france_new)
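
The general idea of string-distance matching can be sketched in base R. This is an illustration only and is not epiCleanr's implementation: `match_closest` is a hypothetical helper, and Levenshtein edit distance via `utils::adist()` stands in for whichever algorithms the package actually applies.

```r
# Hypothetical helper, not part of epiCleanr: match each name to the
# closest reference name by Levenshtein edit distance (utils::adist).
match_closest <- function(to_clean, base) {
  # Distance matrix: rows follow `to_clean`, columns follow `base`
  d <- utils::adist(to_clean, base, ignore.case = TRUE)
  # Column index of the smallest distance in each row
  idx <- apply(d, 1, which.min)
  data.frame(
    original = to_clean,
    matched = base[idx],
    distance = d[cbind(seq_along(to_clean), idx)],
    stringsAsFactors = FALSE
  )
}

base_names <- c("Paris", "Marseille", "Lyon", "Toulouse", "Nice")
unclean_names <- c("Pariis", "Marseill", "Lyone", "Toulous", "Niice")

matches <- match_closest(unclean_names, base_names)
print(matches)
```

A sketch like this always returns the nearest reference name, even when that name is a poor match, so in practice a distance threshold would be needed to leave unmatched names untouched.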