general_clean_directory {podcleaner} | R Documentation |
Mutate operation(s) in Scottish post office general directory data.frame column(s)
Description
Attempts to clean the provided Scottish post office general directory data.frame.
Usage
general_clean_directory(directory, progress = TRUE, verbose = FALSE)
Arguments
directory |
A Scottish post office general directory in the form of a data.frame or other object that inherits from the data.frame class, such as a tibble. |
progress |
Whether progress should be shown (TRUE) or not (FALSE). |
verbose |
Whether the function should be executed silently (FALSE) or not (TRUE). |
Value
A tibble; columns include at least forename, surname, occupation, address.trade.number, address.trade.body, address.house.number and address.house.body.
The "house" suffix in the occupation column is moved to addresses; occupation information is repatriated from addresses to the occupation column; addresses is split into trade and house address columns; additional records are created for each extra trade address identified. Entries are further cleaned of optical character recognition (OCR) errors and subjected to a number of standardisation operations.
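To make the address-splitting step more concrete, here is a minimal base R sketch of the general idea. It is not podcleaner's implementation: split_addresses_sketch() is a hypothetical helper, and both the house markers ("ho.", "house", "res") and the "&" separator between trade addresses are assumptions made for illustration. No OCR cleaning or standardisation is attempted.

```r
# Illustrative sketch only; NOT the code used by general_clean_directory().
# split_addresses_sketch() is a hypothetical helper introduced for this example.
split_addresses_sketch <- function(addresses) {
  # Assumed house markers: "; ho." / "; house" / "; res" (case-insensitive).
  marker <- "(?i);\\s*(ho\\.?|house|res\\.?)\\s+"
  rows <- lapply(addresses, function(address) {
    parts <- strsplit(address, marker, perl = TRUE)[[1]]
    trade <- trimws(parts[1])
    house <- if (length(parts) > 1L) trimws(parts[2]) else NA_character_
    # Assumed separator between several trade addresses: "&", optionally
    # preceded by a comma. Each trade address becomes its own record.
    trades <- trimws(strsplit(trade, "\\s*,?\\s*&\\s*", perl = TRUE)[[1]])
    data.frame(
      address.trade = trades,
      address.house = house,
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# The second string is made-up illustrative data.
split_addresses_sketch(c(
  "1S20 Londn rd; ho. 13<J Queun sq",
  "12 Dixon st, & 29 Anderston qu; res. 265 Argyle st"
))
```

The second input carries two trade addresses, so it yields two records that share the same house address, mirroring the "additional records" behaviour described above. The real function additionally splits each address into number and body columns.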
Examples
pages <- rep("71", 2L)
surnames <- c("ABOT", "ABRCROMBIE")
forenames <- c("Wm.", "Alex")
occupations <- c("Wine and spirit mercht - See Advertisement in Appendix.", "")
addresses <- c(
  "1S20 Londn rd; ho. 13<J Queun sq",
  "Bkr; I2 Dixon Street, & 29 Auderstn Qu.; res 2G5 Argul st."
)
directory <- tibble::tibble(
  page = pages, surname = surnames, forename = forenames,
  occupation = occupations, addresses = addresses
)
general_clean_directory(directory, progress = TRUE, verbose = FALSE)