date-time-parse {clock} | R Documentation |
Parsing: date-time
Description
There are four parsers for parsing strings into POSIXct date-times:
date_time_parse(), date_time_parse_complete(), date_time_parse_abbrev(), and
date_time_parse_RFC_3339().
date_time_parse()
date_time_parse() is useful for strings like "2019-01-01 00:00:00", where
the UTC offset and full time zone name are not present in the string. The
string is first parsed as a naive-time without any time zone assumptions, and
is then converted to a POSIXct with the supplied zone.
Because converting from naive-time to POSIXct may result in nonexistent or
ambiguous times due to daylight saving time, these must be resolved
explicitly with the nonexistent and ambiguous arguments.
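As a minimal sketch of the nonexistent argument, assuming the clock package is attached, the string below falls in the hour skipped by the spring-forward transition in "America/New_York" on 2021-03-14 (a date chosen here purely for illustration):

```r
library(clock)

# 02:30:00 did not exist on this date in New York: clocks jumped
# from 01:59:59 EST straight to 03:00:00 EDT
x <- "2021-03-14 02:30:00"

# With the default `nonexistent = NULL`, this is expected to error
try(date_time_parse(x, "America/New_York"))

# Roll to the next or the previous valid instant instead
forward <- date_time_parse(x, "America/New_York", nonexistent = "roll-forward")
backward <- date_time_parse(x, "America/New_York", nonexistent = "roll-backward")

forward
backward
```

Which strategy is appropriate depends on the data. Rolling is shown here only because it keeps the result adjacent to the gap.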
date_time_parse() completely ignores the %z and %Z commands. The only
time zone specific information that is used is the zone.
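To illustrate, the sketch below parses a string that carries a UTC offset that does not belong to the supplied zone. The input string and format are made up for this example. Under the behavior described above, the offset is read by %z but has no effect on the result.

```r
library(clock)

# The "-0800" offset is consumed by `%z`, but its value is not used
x <- "2019-01-01 05:06:07-0800"

parsed <- date_time_parse(
  x,
  zone = "America/New_York",
  format = "%Y-%m-%d %H:%M:%S%z"
)

# The wall-clock time is interpreted in `zone`, not shifted by the offset
parsed
```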
The default format used is "%Y-%m-%d %H:%M:%S". This matches the default
result from calling format() on a POSIXct date-time.
date_time_parse_complete()
date_time_parse_complete() is a parser for complete date-time strings,
like "2019-01-01T00:00:00-05:00[America/New_York]". A complete date-time
string has both the time zone offset and full time zone name in the string,
which is the only way for the string itself to contain all of the information
required to unambiguously construct a zoned-time. Because of this,
date_time_parse_complete() requires both the %z and %Z commands to be
supplied in the format string.
The default format used is "%Y-%m-%dT%H:%M:%S%Ez[%Z]". This matches the
default result from calling date_format() on a POSIXct date-time.
Additionally, this format matches the de-facto standard extension to RFC 3339
for creating completely unambiguous date-times.
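Because the default format matches the output of date_format(), a POSIXct can be written to a string and read back without supplying a zone. The following is a small round-trip sketch, where the starting date-time is an arbitrary value built with base R:

```r
library(clock)

# An arbitrary date-time in a named time zone
x <- as.POSIXct("2019-01-01 05:06:07", tz = "America/New_York")

# Format with the default complete format, which includes the
# UTC offset and the full time zone name
string <- date_format(x)
string

# Parse it back; no `zone` argument is needed
roundtrip <- date_time_parse_complete(string)
roundtrip
```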
date_time_parse_abbrev()
date_time_parse_abbrev() is a parser for date-time strings containing only
a time zone abbreviation, like "2019-01-01 00:00:00 EST". The time zone
abbreviation is not enough to identify the full time zone name that the
date-time belongs to, so the full time zone name must be supplied as the
zone argument. However, the time zone abbreviation can help with resolving
ambiguity around daylight saving time fallbacks.
For date_time_parse_abbrev(), %Z must be supplied and is interpreted as
the time zone abbreviation rather than the full time zone name.
If used, the %z command must parse correctly, but its value will be
completely ignored.
The default format used is "%Y-%m-%d %H:%M:%S %Z". This matches the
default result from calling print() or format(usetz = TRUE) on a POSIXct
date-time.
date_time_parse_RFC_3339()
date_time_parse_RFC_3339() is a parser for date-time strings in the
extremely common date-time format outlined by RFC 3339. This document outlines
a profile of the ISO 8601 format that is even more restrictive, but
corresponds to the most common formats that are likely to be used in
internet protocols (i.e. through APIs).
In particular, this function is intended to parse the following three formats:
2019-01-01T00:00:00Z
2019-01-01T00:00:00+0430
2019-01-01T00:00:00+04:30
This function defaults to parsing the first of these formats by using
a format string of "%Y-%m-%dT%H:%M:%SZ".
If your date-time strings use offsets from UTC rather than "Z", then set
offset to one of the following:
- "%z" if the offset is of the form "+0430".
- "%Ez" if the offset is of the form "+04:30".
The RFC 3339 standard allows for replacing the "T" with a "t" or a space
(" "). Set separator to adjust this as needed.
The date-times returned by this function will always be in the UTC time zone.
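The sketch below illustrates the last point with an input string invented for this example. A string with a non-zero offset is shifted by that offset, and the result is reported in UTC rather than in the offset's local time.

```r
library(clock)

# Local time 00:01:02 at an offset of +02:30 from UTC
x <- "2019-01-01T00:01:02+02:30"

parsed <- date_time_parse_RFC_3339(x, offset = "%Ez")

# The offset has been applied and the result is in UTC,
# which falls on the previous calendar day
parsed
attr(parsed, "tzone")
```

If a different display zone is needed, the zone can be changed afterwards without altering the underlying instant.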
Usage
date_time_parse(
  x,
  zone,
  ...,
  format = NULL,
  locale = clock_locale(),
  nonexistent = NULL,
  ambiguous = NULL
)
date_time_parse_complete(x, ..., format = NULL, locale = clock_locale())
date_time_parse_abbrev(x, zone, ..., format = NULL, locale = clock_locale())
date_time_parse_RFC_3339(x, ..., separator = "T", offset = "Z")
Arguments
x |
A character vector to parse. |
zone |
A full time zone name. |
... |
These dots are for future extensions and must be empty. |
format |
A format string. A combination of the following commands, or NULL, in which case a default format string is used. A vector of multiple format strings can be supplied. They will be tried in the order they are provided.
Year
Month
Day
Day of the week
ISO 8601 week-based year
Week of the year
Day of the year
Date
Time of day
Time zone
Miscellaneous
|
locale |
A locale object created from clock_locale(). |
nonexistent |
One of the following nonexistent time resolution strategies, allowed to be either length 1, or the same length as the input:
Using either If If |
ambiguous |
One of the following ambiguous time resolution strategies, allowed to be either length 1, or the same length as the input:
Alternatively, Finally, If If |
separator |
The separator between the date and time components of the string. One of:
"T", "t", or " ".
|
offset |
The format of the offset from UTC contained in the string. One of:
|
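As a brief illustration of supplying several format strings, the sketch below parses a vector whose elements use two different layouts. Both the input strings and the formats are invented for this example. Each format is tried in order until one succeeds.

```r
library(clock)

# Two strings written in different layouts
x <- c("2020-01-01 05:06:07", "01/02/2020 05:06:07")

# Candidate formats, tried in the order given
formats <- c("%Y-%m-%d %H:%M:%S", "%m/%d/%Y %H:%M:%S")

parsed <- date_time_parse(x, "America/New_York", format = formats)
parsed
```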
Details
If date_time_parse_complete() is given input that is length zero, all
NAs, or completely fails to parse, then no time zone will be able to be
determined. In that case, the result will use "UTC".
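A short sketch of this fallback, using inputs from which no time zone can be recovered:

```r
library(clock)

# Length-zero input: there is no string to take a zone from
empty <- date_time_parse_complete(character())
attr(empty, "tzone")

# All-missing input: same situation
all_na <- date_time_parse_complete(NA_character_)
attr(all_na, "tzone")
```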
If you have strings with sub-second components, then these date-time parsers
are not appropriate for you. Remember that clock treats POSIXct as a second
precision type, so parsing a string with fractional seconds directly into a
POSIXct is ambiguous and undefined. Instead, fully parse the string,
including its fractional seconds, into a clock type that can handle it, such
as a naive-time with naive_time_parse(), then round to seconds with
whatever rounding convention is appropriate for your use case, such as
time_point_floor(), and finally convert that to POSIXct with
as_date_time(). This gives you complete control over how the fractional
seconds are handled when converting to POSIXct.
Value
A POSIXct.
Examples
# Parse with a known `zone`, even though that information isn't in the string
date_time_parse("2020-01-01 05:06:07", "America/New_York")
# Same time as above, except this is a completely unambiguous parse that
# doesn't require a `zone` argument, because the zone name and offset are
# both present in the string
date_time_parse_complete("2020-01-01T05:06:07-05:00[America/New_York]")
# Only day components
date_time_parse("2020-01-01", "America/New_York", format = "%Y-%m-%d")
# `date_time_parse()` may have issues with ambiguous times due to daylight
# saving time fallbacks. For example, there were two 1 o'clock hours here:
x <- date_time_parse("1970-10-25 00:59:59", "America/New_York")
# First (earliest) 1 o'clock hour
add_seconds(x, 1)
# Second (latest) 1 o'clock hour
add_seconds(x, 3601)
# If you try to parse this ambiguous time directly, you'll get an error:
ambiguous_time <- "1970-10-25 01:00:00"
try(date_time_parse(ambiguous_time, "America/New_York"))
# Resolve it by specifying whether you'd like to use the
# `earliest` or `latest` of the two possible times
date_time_parse(ambiguous_time, "America/New_York", ambiguous = "earliest")
date_time_parse(ambiguous_time, "America/New_York", ambiguous = "latest")
# `date_time_parse_complete()` doesn't have these issues, as it requires
# that the offset and zone name are both in the string, which resolves
# the ambiguity
complete_times <- c(
  "1970-10-25T01:00:00-04:00[America/New_York]",
  "1970-10-25T01:00:00-05:00[America/New_York]"
)
date_time_parse_complete(complete_times)
# `date_time_parse_abbrev()` also doesn't have these issues, since it
# uses the time zone abbreviation name to resolve the ambiguity
abbrev_times <- c(
  "1970-10-25 01:00:00 EDT",
  "1970-10-25 01:00:00 EST"
)
date_time_parse_abbrev(abbrev_times, "America/New_York")
# ---------------------------------------------------------------------------
# RFC 3339
# Typical UTC format
x <- "2019-01-01T00:01:02Z"
date_time_parse_RFC_3339(x)
# With a UTC offset containing a `:`
x <- "2019-01-01T00:01:02+02:30"
date_time_parse_RFC_3339(x, offset = "%Ez")
# With a space between the date and time and no `:` in the offset
x <- "2019-01-01 00:01:02+0230"
date_time_parse_RFC_3339(x, separator = " ", offset = "%z")
# ---------------------------------------------------------------------------
# Sub-second components
# If you have a string with sub-second components, but only require up to
# seconds, first parse them into a clock type that can handle sub-seconds to
# fully capture that information, then round using whatever convention is
# required for your use case before converting to a date-time.
x <- c("2019-01-01T00:00:01.1", "2019-01-01T00:00:01.78")
x <- naive_time_parse(x, precision = "millisecond")
x
time_point_floor(x, "second")
time_point_round(x, "second")
as_date_time(time_point_round(x, "second"), "America/New_York")