timezones {base} | R Documentation
Time Zones
Description
Information about time zones in R. Sys.timezone returns
the name of the current time zone.
Usage
Sys.timezone(location = TRUE)
OlsonNames(tzdir = NULL)
Arguments
location |
logical. Defunct, with a warning if |
tzdir |
the time-zone database to be used: the default is to try known locations until one is found. |
Details
Time zones are a system-specific topic, but these days almost all R platforms use similar underlying code, used by Linux, macOS, Solaris, AIX and FreeBSD, and installed with R on Windows. (Unfortunately there are many system-specific errors in the implementations.) It is possible to use the R sources' version of the code on Unix-alikes as well as on Windows: this is the default on macOS.
It should be possible to set the current time zone via the environment
variable TZ: see the section on ‘Time zone names’ for
suitable values. Sys.timezone() will return the value of
TZ if set initially (and on some OSes it is always set),
otherwise it will try to retrieve from the OS a value which if set for
TZ would give the initial time zone. (‘Initially’ means
before any time-zone functions are used: if TZ is being set to
override the OS setting or if the ‘try’ does not get this
right, it should be set before the R process is started or (probably
early enough) in file .Rprofile.)
If TZ is set but invalid, most platforms default to ‘UTC’,
the time zone colloquially known as ‘GMT’ (see
https://en.wikipedia.org/wiki/Coordinated_Universal_Time).
(Some but not all platforms will give a warning for invalid values.)
If it is unset or empty the system time zone is used (the one
returned by Sys.timezone).
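As an illustration of the TZ mechanism described above, the following minimal sketch sets TZ for part of a session and then restores the previous state. It assumes that the name "Europe/London" is available in the time-zone database in use and that the platform honours a change of TZ during the session; the variable names are purely illustrative.

```r
# Remember the current setting (NA if TZ is not set) so it can be restored.
old_tz <- Sys.getenv("TZ", unset = NA)

Sys.setenv(TZ = "Europe/London")
x <- as.POSIXct("2024-07-01 12:00:00", tz = "UTC")
# An explicit tz = "" asks for the current time zone, here taken from TZ.
london <- format(x, "%Y-%m-%d %H:%M", tz = "")

# Restore the previous state: unset TZ if it was not set before.
if (is.na(old_tz)) Sys.unsetenv("TZ") else Sys.setenv(TZ = old_tz)
```

For anything beyond a quick experiment it is safer to set TZ before R starts, as recommended above.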
Time zones did not come into use until the middle of the nineteenth century and were not widely adopted until the twentieth, and daylight saving time (DST, also known as summer time) was first introduced in the early twentieth century, most widely in 1916. Over the last 100 years places have changed their affiliation between major time zones, have opted out of (or in to) DST in various years or adopted DST rule changes late or not at all. (For example, the UK experimented with DST throughout 1971, only.) In a few countries (one is the Irish Republic) it is the summer time which is the ‘standard’ time and a different name is used in winter. And there can be multiple changes during a year, for example for Ramadan.
A quite common system implementation of POSIXct was as signed
32-bit integers and so only went back to the end of 1901: on such
systems R assumes that dates prior to that are in the same time zone
as they were in 1902. Most of the world had not adopted time zones by
1902 (so used local ‘mean time’ based on longitude) but for a
few places there had been time-zone changes before then. 64-bit
representations are becoming by far the most common; unfortunately on
some 64-bit OSes the database information is 32-bit and so only
available for the range 1901–2038, and incompletely for the end
years.
When a time zone location is first found in a session its value is
cached in object .sys.timezone in the base environment.
Value
Sys.timezone returns an OS-specific character string, possibly
NA or an empty string (which on some OSes means ‘UTC’).
This will be a location such as "Europe/London" if one can be
ascertained.
A time zone region may be known by several names: for example ‘"Europe/London"’ may also be known as ‘GB’, ‘GB-Eire’, ‘Europe/Belfast’, ‘Europe/Guernsey’, ‘Europe/Isle_of_Man’ and ‘Europe/Jersey’. A few regions are also known by a summary of their time zone, e.g. ‘PST8PDT’ is (on most but not all systems) an alias for ‘America/Los_Angeles’.
OlsonNames returns a character vector, see the examples for
typical cases. It may have an attribute "Version", something
like ‘"2023a"’. (It does on systems using
--with-internal-tzcode and those like Fedora distributing
file ‘tzdata.zi’.)
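A short sketch of how the "Version" attribute might be inspected. The attribute is absent on some systems, so the code allows for a NULL result; the variable names are illustrative only.

```r
nms <- OlsonNames()
ver <- attr(nms, "Version")  # NULL when the database records no version

if (is.null(ver)) {
  message("this time-zone database does not report a version")
} else {
  message("time-zone database version: ", ver)
}
```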
Time zone names
Names "UTC" and its synonym "GMT" are accepted on all
platforms.
Where OSes describe their valid time zones can be obscure. The help
for the C function tzset can be helpful, but it can also be
inaccurate. There is a cumbersome POSIX specification (listed under
environment variable TZ at
https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap08.html#tag_08),
which is often at least partially supported, but there are other more
user-friendly ways to specify time zones.
Almost all R platforms make use of a time-zone database originally
compiled by Arthur David Olson and now managed by IANA, in which the
preferred way to refer to a time zone is by a location (typically of a
city), e.g., Europe/London, America/Los_Angeles,
Pacific/Easter within a ‘time zone region’. Some
traditional designations are also allowed such as EST5EDT or
GB. (Beware that some of these designations may not be what
you expect: in particular EST is a time zone used in Canada
without daylight saving time, and not EST5EDT nor
(Australian) Eastern Standard Time.) The designation can also be an
optional colon prepended to the path to a file giving compiled zone
information (and the examples above are all files in a system-specific
location). See https://data.iana.org/time-zones/tz-link.html
for more details and references. By convention, regions with a unique
time-zone history since 1970 have specific names in the database, but
those with different earlier histories may not. Each time zone has
one or two (the second for ‘summer’) abbreviations used when
formatting times.
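To illustrate location-based names, this sketch formats one instant in the three zones mentioned above. It assumes all three names are present in the database in use (an unknown name is typically treated as UTC, possibly with a warning); the variable names are illustrative.

```r
x <- as.POSIXct("2024-01-15 12:00:00", tz = "UTC")
zones <- c("Europe/London", "America/Los_Angeles", "Pacific/Easter")

# Format the same instant as local clock time in each zone; the result
# is a character vector named by zone.
local <- vapply(zones,
                function(z) format(x, "%Y-%m-%d %H:%M %Z", tz = z),
                "")
print(local)
```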
Increasingly OSes are (optionally or always) not including
‘legacy’ names such as US/Eastern: only names of the
forms Continent/City and Etc/... are fully portable.
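Before relying on a particular name, it may help to check whether the database found by OlsonNames lists it. The candidate names below are arbitrary examples, and the result only reflects the database that OlsonNames locates, which is not necessarily the one used by the date-time code.

```r
candidates <- c("Europe/London", "Etc/UTC", "US/Eastern", "GB")

# TRUE for names listed in the database found by OlsonNames().
available <- candidates %in% OlsonNames()
names(available) <- candidates
print(available)
```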
The abbreviations used have changed over the years: for example France used ‘PMT’ (‘Paris Mean Time’) from 1891 to 1911 then ‘WET/WEST’ up to 1940 and ‘CET/CEST’ from 1946. (In almost all time zones the abbreviations have been stable since 1970.) The POSIX standard allows only one or two abbreviations per time zone, so you may see the current abbreviation(s) used for older times.
For some time zones abbreviations are like ‘-03’ and
‘+0845’: this is done when there is no official abbreviation.
(Negative values are behind (West of) UTC, as for the "%z"
format for strftime.)
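The abbreviation and the numeric offset can be compared with the "%Z" and "%z" formats. This sketch assumes the three zone names are available; the abbreviations shown by "%Z" depend on the database version, so only the offsets should be expected to be stable.

```r
x <- as.POSIXct("2024-01-15 12:00:00", tz = "UTC")
zones <- c("Europe/Paris", "America/Sao_Paulo", "Australia/Eucla")

abbrev <- vapply(zones, function(z) format(x, "%Z", tz = z), "")
offset <- vapply(zones, function(z) format(x, "%z", tz = z), "")

# One row per zone: abbreviation (possibly numeric) and offset from UTC.
print(data.frame(zone = zones, abbrev = abbrev, offset = offset,
                 row.names = NULL))
```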
The function OlsonNames returns the time-zone names known to
the currently selected Olson/IANA database. The system-specific
location in the file system varies,
e.g. ‘/usr/share/zoneinfo’ (Linux, macOS, FreeBSD),
‘/usr/share/lib/zoneinfo’ (Solaris, AIX), .... It is likely
that there is a file named something like ‘zone1970.tab’ or
(older) ‘zone.tab’ under that directory listing the locations
known as time-zone names (but not for example EST5EDT). See
also https://en.wikipedia.org/wiki/Zone.tab.
Where R was configured with option --with-internal-tzcode
(the default on Windows), the database at
file.path(R.home("share"), "zoneinfo") is used by default: file
‘VERSION’ in that directory states the version. That option is
also the default on macOS but there whichever is more recent of the
system database at ‘/var/db/timezone/zoneinfo’ and that
distributed with R is used by default. Environment variable
TZDIR can be used to give the full path to a different
‘zoneinfo’ database: value "internal" indicates the
database from the R sources and "macOS" indicates the system
database. (Setting either of those values would not be recognized by
other software using TZDIR.)
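A minimal sketch for inspecting the database distributed with R. The ‘VERSION’ file is only expected to exist where R was configured with --with-internal-tzcode, so the code falls back to NA otherwise; the variable names are illustrative.

```r
zi_dir   <- file.path(R.home("share"), "zoneinfo")
ver_file <- file.path(zi_dir, "VERSION")

# NA when R does not ship its own copy of the database.
internal_version <- if (file.exists(ver_file)) {
  readLines(ver_file, n = 1L, warn = FALSE)
} else {
  NA_character_
}

# NA when TZDIR is not set in the environment.
tzdir_env <- Sys.getenv("TZDIR", unset = NA)
print(c(internal = internal_version, TZDIR = tzdir_env))
```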
Setting TZDIR is also supported by the native services on some
OSes, e.g. Linux using glibc except in secure modes.
Time zones given by name (via environment variable TZ, in
tz arguments to functions such as as.POSIXlt and
perhaps the system time zone) are loaded from the currently selected
‘zoneinfo’ database.
On Windows only: An attempt is made (once only per session) to map Windows' idea of the current time zone to a location, following a version of http://unicode.org/repos/cldr/trunk/common/supplemental/windowsZones.xml with additional values deduced from the Windows Registry and documentation. It can be overridden by setting the TZ environment variable before any date-times are used in the session.
Most platforms support time zones of the form ‘Etc/GMT+n’ and ‘Etc/GMT-n’ (possibly also without prefix ‘Etc/’), which assume a fixed offset from UTC (hence no DST). Contrary to some expectations (but consistent with names such as ‘PST8PDT’), negative offsets are times ahead of (East of) UTC, positive offsets are times behind (West of) UTC.
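The sign convention is easy to get wrong, so a small check may be useful. This sketch assumes the ‘Etc/GMT+5’ and ‘Etc/GMT-5’ names are available, as they are on most platforms.

```r
x <- as.POSIXct("2024-01-15 12:00:00", tz = "UTC")

# "Etc/GMT+5" is five hours behind (West of) UTC ...
behind <- format(x, "%H:%M %z", tz = "Etc/GMT+5")
# ... whereas "Etc/GMT-5" is five hours ahead of (East of) UTC.
ahead  <- format(x, "%H:%M %z", tz = "Etc/GMT-5")

print(c(behind = behind, ahead = ahead))
```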
Immediately prior to the advent of legislated time zones, most people used time based on their longitude (or that of a nearby town), known as ‘Local Mean Time’ and abbreviated as ‘LMT’ in the databases: in many countries that was codified with a specific name before the switch to a standard time. For example, Paris codified its LMT as ‘Paris Mean Time’ in 1891 (to be used throughout mainland France) and switched to ‘GMT+0’ in 1911.
Some systems (notably Linux) have a tzselect command which
allows the interactive selection of a supported time zone name. On
systems using systemd (notably Linux), the OS command
timedatectl list-timezones will list all available time zone
names.
Warnings
There is a system-specific upper limit on the number of bytes in (abbreviated) time-zone names which can be as low as 6 (as required by POSIX). Some OSes allow the setting of time zones with names which exceed their limit, and that can crash the R session.
Information about future times is speculative (‘proleptic’): the database provides the best-known information based on current rules set by civil authorities. For the period 1900–1970 those rules (and which of any authority's rules were enacted) are often obscure, and the databases do get corrected frequently.
OlsonNames tries to find an Olson database in known locations.
It might not succeed (when it returns an empty vector with a warning)
and even if it does it might not locate the database used by the
date-time code linked into R. Fortunately names are added rarely
and most databases are pretty complete. On the other hand, many names
which duplicate other named timezones have been moved to the
‘backward’ list – these are regarded as optional and omitted on
minimal installations. Similarly, there are timezones named in file
‘backzone’ which differ only from those in the main lists prior
to 1970 – these are usually included but may not be in minimalist
systems.
For many years, the legacy names EST5EDT and PST8PDT
were portable, but musl (the C runtime used by Alpine Linux)
does not use DST with those names.
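One possible way to see whether a legacy name observes DST on the current platform is to look at the isdst component for a summer date. This is only a rough probe: a FALSE result may also mean that the name is not recognized at all.

```r
july <- as.POSIXlt("2024-07-01 12:00:00", tz = "EST5EDT")

# isdst is positive when DST is in force and zero when it is not.
uses_dst <- isTRUE(july$isdst > 0)
print(uses_dst)
```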
How the system time zone is found – on Unix-alikes
This section is of background interest for users of a Unix-alike, but
may help if an NA value is returned unexpectedly.
Commercial Unixen such as Solaris and AIX set TZ, so the value when R is started is used.
All other common platforms (Linux, macOS, *BSD) use similar schemes,
either derived from tzcode (currently distributed from
https://www.iana.org/time-zones) or independently coded
(glibc, musl-libc). Such systems read the time-zone
information from a file ‘localtime’, usually under ‘/etc’
(but possibly under ‘/usr/local/etc’ or
‘/usr/local/etc/zoneinfo’). As the usual Linux manual page for
localtime says
‘Because the time zone identifier is extracted from the symlink target name of ‘/etc/localtime’, this file may not be a normal file or hardlink.’
Nevertheless, some Linux distributions (including the one from which that quote was taken) or sysadmins have chosen to copy a time-zone file to ‘localtime’. For a non-symlink, the ultimate fallback is to compare that file to all files in the time-zone database.
Some Linux platforms provide two other mechanisms which are tried in turn before looking at ‘/etc/localtime’.
- ‘Modern’ Linux systems use systemd which provides mechanisms to set and retrieve the time zone (amongst other things). There is a command timedatectl to give details. (Unfortunately RHEL/Centos 6.x were not ‘modern’.)
- Debian-derived systems since ca 2007 have supplied a file ‘/etc/timezone’. Its format is undocumented but empirically it contains a single line of text naming the time zone.
In each case a sanity check is performed that the time-zone name is the
name of a file in the time-zone database. (The systems probably use
the time-zone file (symlinked to) ‘/etc/localtime’, but the
Sys.timezone code does not check that is the same as the named
file in the database. This is deliberate as they may be from
different dates.)
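The lookup described in this section can be approximated as follows. This is a simplified sketch and not the code used by Sys.timezone: the function name guess_tz and the default database path are assumptions, and it omits both the timedatectl query and the final fallback of comparing a non-symlink ‘localtime’ against every file in the database.

```r
guess_tz <- function(tzdir = "/usr/share/zoneinfo") {
  # Sanity check: the name must be a file in the time-zone database.
  in_db <- function(tz) file.exists(file.path(tzdir, tz))

  # 1. Debian-style file containing a single line naming the zone.
  if (file.exists("/etc/timezone")) {
    tz <- trimws(readLines("/etc/timezone", n = 1L, warn = FALSE))
    if (length(tz) == 1L && nzchar(tz) && in_db(tz)) return(tz)
  }

  # 2. Symlink target of /etc/localtime: keep the part after 'zoneinfo/'.
  #    Sys.readlink() gives "" for a non-symlink and NA if the file is absent.
  target <- Sys.readlink("/etc/localtime")
  if (!is.na(target) && nzchar(target)) {
    tz <- sub("^.*/zoneinfo/", "", target)
    if (!identical(tz, target) && in_db(tz)) return(tz)
  }

  NA_character_
}

print(guess_tz())
```

An NA result here means only that neither of the two simple mechanisms produced a usable name.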
Note
Since 2007 there has been considerable disruption over changes to the timings of the DST transitions; these often have short notice and time-zone databases may not be up to date. (Morocco in 2013 announced a change to the end of DST at a day's notice. In 2023 there was chaos in Lebanon as the authorities changed their minds repeatedly and some changes were not widely implemented.)
There have also been changes to the ‘standard’ time with little notice (Kazakhstan switched to a single time zone in Mar 2024 with six weeks' notice), and to whether ‘summer’ or ‘winter’ time is regarded as ‘standard’ (and hence to abbreviations).
On platforms with case-insensitive file systems, time zone names will be
case-insensitive. They may or may not be on other platforms and so,
for example, "gmt" is valid on some platforms and not on others.
Note that except where replaced, the operation of time zones is an OS service, and even where replaced a third-party database is used and can be updated (see the section on ‘Time zone names’). Incorrect results will never be an R issue, so please ensure that you have the courtesy not to blame R for them.
See Also
https://en.wikipedia.org/wiki/Time_zone and https://data.iana.org/time-zones/tz-link.html for extensive sets of links.
https://data.iana.org/time-zones/theory.html for the ‘rules’ of the Olson/IANA database.
Examples
Sys.timezone()
str(OlsonNames()) ## typically around six hundred names,
## typically some acronyms/aliases such as "UTC", "NZ", "MET", "Eire", ..., but
## mostly pairs (and triplets) such as "Pacific/Auckland"
table(sl <- grepl("/", OlsonNames()))
OlsonNames()[ !sl ] # the simple ones
head(Osl <- strsplit(OlsonNames()[sl], "/"))
(tOS1 <- table(vapply(Osl, `[[`, "", 1))) # Continents, countries, ...
table(lengths(Osl))# most are pairs, some triplets
str(Osl[lengths(Osl) >= 3])# "America" South and North ...