fix_dates {datefixR} | R Documentation |
Clean up messy date columns
Description
Cleans up a dataframe object whose date columns were entered via a free-text
box (possibly by different users) and are therefore in a non-standardized
format. Supports numerous separators, including /, -, or space. Supports
all-numeric, abbreviated, or long-hand month notation. Where the day of the
month has not been supplied, the first day of the month is imputed. Either
DMY or YMD is assumed by default. However, the US system of MDY is supported
via the format argument.
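As a minimal sketch of the format argument, the same ambiguous string can be read under either convention. The data frame ambiguous and its column names are hypothetical, and the sketch assumes datefixR is installed.

```r
library(datefixR)

# Hypothetical data: a single all-numeric date that is ambiguous
# between day-month-year and month-day-year
ambiguous <- data.frame(
  id = 1,
  visit = "07/04/2021"
)

# Default format ("dmy"): read as 7 April 2021
dmy.df <- fix_dates(ambiguous, "visit")

# US convention: read as 4 July 2021
mdy.df <- fix_dates(ambiguous, "visit", format = "mdy")
```

Choosing the format per dataset, rather than per row, matches the idea that format describes how a date is most likely to have been entered.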
Usage
fix_dates(
  df,
  col.names,
  day.impute = 1,
  month.impute = 7,
  id = NULL,
  format = "dmy"
)
Arguments
df |
A dataframe or tibble object containing the messy date columns |
col.names |
Character vector of names of columns of messy date data |
day.impute |
Integer. Day of the month to be imputed if not available.
Defaults to 1. If |
month.impute |
Integer. Month to be imputed if not available.
Defaults to 7 (July). If |
id |
Name of column containing row IDs. By default, the first column is assumed. |
format |
Character. The format which a date is most likely to be given
in. Either "dmy" (default) or "mdy". |
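The imputation arguments can be illustrated with a short sketch. The data frame partial and its column names are hypothetical; only the argument names and defaults come from the Usage section above.

```r
library(datefixR)

# Hypothetical data: one value with only a year, one with month and year
partial <- data.frame(
  id = 1:2,
  onset = c("2015", "05/1990")
)

# Defaults: a missing month becomes July, a missing day becomes the 1st
default.df <- fix_dates(partial, "onset")

# Custom imputation: a missing month becomes January, a missing day the 15th
custom.df <- fix_dates(
  partial,
  "onset",
  day.impute = 15,
  month.impute = 1
)
```

Only the missing components are imputed: the supplied month in "05/1990" is kept under both calls.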
Value
A dataframe or tibble object, depending on the type of df. Selected columns
are of type Date.
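A quick way to confirm the return value is to inspect column classes after the call. This is a sketch with hypothetical data (raw, seen, note), assuming datefixR is installed.

```r
library(datefixR)

# Hypothetical data: one messy date column and one unrelated text column
raw <- data.frame(
  id = 1:2,
  seen = c("02/05/92", "jan 2020"),
  note = c("first visit", "follow-up")
)

fixed <- fix_dates(raw, "seen")

# The selected column should now be of class Date,
# while columns not named in col.names keep their original type
seen.class <- class(fixed$seen)
note.class <- class(fixed$note)
```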
See Also
fix_date
Similar to fix_dates(), except that it can only be applied to character
objects.
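For a date held outside a data frame, a call along the following lines may be enough. This sketch assumes that fix_date() takes a single character string as its first argument; its signature is not shown on this page, so check its own help page before relying on it.

```r
library(datefixR)

# Assumption: fix_date() accepts one character string as its first argument
single <- fix_date("02/03/2021")

# Inspect the class of the result
single.class <- class(single)
```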
Examples
bad.dates <- data.frame(
  id = seq(5),
  some.dates = c(
    "02/05/92",
    "01-04-2020",
    "1996/05/01",
    "2020-05-01",
    "02-04-96"
  ),
  some.more.dates = c(
    "2015",
    "02/05/00",
    "05/1990",
    "2012-08",
    "jan 2020"
  )
)
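# Standardize both messy columns; each selected column becomes a Date column
fixed.df <- fix_dates(bad.dates, c("some.dates", "some.more.dates"))
# -> equivalent call using fix_date_df()
fixed.df <- fix_date_df(bad.dates, c("some.dates", "some.more.dates"))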