general_clean_directory_plain {podcleaner}    R Documentation

Mutate operation(s) in Scottish post office general directory data.frame column(s)

Description

Attempts to clean the provided Scottish post office general directory data.frame.

Usage

general_clean_directory_plain(directory, verbose)

Arguments

directory

A Scottish post office general directory in the form of a data.frame or other object that inherits from the data.frame class such as a tibble. Columns must at least include forename, surname, occupation and addresses.

verbose

Whether the function should be executed silently (FALSE) or not (TRUE).

Value

A data.frame of the same class as the one provided in directory; columns include at least forename, surname, occupation, address.trade.number, address.trade.body, address.house.number and address.house.body. The "house" suffix in the occupation column is moved to addresses; occupation information found in addresses is moved back to the occupation column; addresses is split into trade and house address columns; an additional record is created for each extra trade address identified. Entries are further cleaned of optical character recognition (OCR) errors and subjected to a number of standardisation operations.
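
Examples

The following is a minimal sketch of a call, assuming the podcleaner package is installed. The two directory entries are invented for illustration and are not taken from an actual directory; only the four input column names and the function signature come from the sections above. The call is guarded so that the snippet still runs when the package is unavailable.

```r
# Illustrative input only: these two entries are made up, not real directory data.
directory <- data.frame(
  forename   = c("Wm.", "Alex."),
  surname    = c("Abbott", "Abercrombie"),
  occupation = c("Wine and spirit merchant", "Baker"),
  addresses  = c(
    "18 Maryhill road; house, 13 Queen square",
    "12 Dixon street; house, 29 Argyle street"
  ),
  stringsAsFactors = FALSE
)

# The Arguments section requires at least these four columns.
required <- c("forename", "surname", "occupation", "addresses")
stopifnot(all(required %in% names(directory)))

# Run the cleaner only when podcleaner is installed.
if (requireNamespace("podcleaner", quietly = TRUE)) {
  cleaned <- podcleaner::general_clean_directory_plain(directory, verbose = FALSE)
  str(cleaned)
}
```

Passing verbose = TRUE instead should make the function report on its progress, per the description of the verbose argument.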


[Package podcleaner version 0.1.2 Index]