split_locations {weed} | R Documentation |
Splits string of manually entered locations into one row for each location
Description
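Changes the unit of analysis from a disaster to a disaster-location pair, so that each location listed for a disaster gets its own row. This is useful as preprocessing before geocoding each disaster-location pair.
The data frame is the first argument, so the function can be used in piped operations.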
Usage
split_locations(
  .,
  column_name = "locations",
  dummy_words = c("cities", "states", "provinces", "districts", "municipalities",
    "regions", "villages", "city", "state", "province", "district", "municipality",
    "region", "township", "village", "near", "department"),
  joiner_regex = ",|\\(|\\)|;|\\+|( and )|( of )"
)
Arguments
. | A data frame of disaster data. |
column_name | Name of the column containing the locations. |
dummy_words | A character vector of words to remove from the final output. |
joiner_regex | A regular expression that defines where to split the locations string. |
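To make the roles of joiner_regex and dummy_words more concrete, here is a minimal base R sketch that applies the two default values to a single string. It is an illustration only and not the package's actual implementation: the variable names (raw, pieces, dummy_pattern) and the whole-word removal via gsub() are assumptions made for this example, and split_locations itself may handle edge cases differently.

```r
# Default values copied from the Usage section.
joiner_regex <- ",|\\(|\\)|;|\\+|( and )|( of )"
dummy_words <- c("cities", "states", "provinces", "districts", "municipalities",
  "regions", "villages", "city", "state", "province", "district", "municipality",
  "region", "township", "village", "near", "department")

# Hypothetical input: one manually entered locations string.
raw <- "kerala, chennai municipality, and san francisco"

# 1. Split the string wherever the joiner regex matches.
pieces <- strsplit(raw, joiner_regex)[[1]]

# 2. Remove dummy words, matching them as whole words only (assumed approach).
dummy_pattern <- paste0("\\b(", paste(dummy_words, collapse = "|"), ")\\b")
pieces <- gsub(dummy_pattern, "", pieces, perl = TRUE)

# 3. Trim whitespace and drop pieces that ended up empty.
pieces <- trimws(pieces)
pieces <- pieces[nzchar(pieces)]

print(pieces)
```

The sketch shows why both arguments matter: the regex decides where one location ends and the next begins, while the dummy words strip generic terms such as "municipality" that would otherwise be carried into the geocoding step.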
Value
The same data frame with a location_word column added, as well as a column called uncertain_location_specificity indicating where the same location could be referred to at varying levels of specificity.
Examples
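# Manually entered location strings, one per disaster
locs <- c("city of new york", "kerala, chennai municipality, and san francisco",
  "mumbai region, district of seattle, sichuan province")
# Build a one-column tibble; the column is named "value" explicitly
d <- tibble::tibble(value = locs)
# Split into one row per location
split_locations(d, column_name = "value")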
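Because the data frame is the first argument, the same call can also be written with a pipe. This is a minimal sketch, assuming R 4.1 or later (for the native `|>` pipe) and that the weed and tibble packages are installed. The requireNamespace() guard and the variable name out are additions for this example, so that the snippet is skipped safely when the packages are missing.

```r
# Hypothetical input strings for illustration.
locs <- c("city of new york",
  "mumbai region, district of seattle, sichuan province")

out <- NULL
if (requireNamespace("weed", quietly = TRUE) &&
    requireNamespace("tibble", quietly = TRUE)) {
  # The tibble on the left of the pipe is passed as the first argument (`.`).
  out <- tibble::tibble(value = locs) |>
    weed::split_locations(column_name = "value")
}
out
```

The magrittr pipe (`%>%`) can be used in the same position if the native pipe is not available.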
[Package weed version 1.1.2 Index]