| abs_api {readabs} | R Documentation |
ABS.Stat API functions
Description
These experimental functions provide a minimal interface to the ABS.Stat API.
More information on the ABS.Stat API can be found on the ABS website.
Note that an ABS.Stat 'dataflow' is like a table. A 'datastructure' contains metadata that describes the variables in the dataflow. The following functions are available:
Using read_api_dataflows() you can get information on the available dataflows.
Using read_api_datastructure() you can get metadata relating to a specific dataflow, including the variables available in that dataflow.
Using read_api() you can get the data belonging to a given dataflow.
Using read_api_url() you can get the data for a given query URL generated using the online data viewer.
Usage
read_api_dataflows()
read_api(
id,
datakey = NULL,
start_period = NULL,
end_period = NULL,
version = NULL
)
read_api_url(url)
read_api_datastructure(id)
Arguments
id | A dataflow id. Use
datakey | A named list matching filter variables to codes. All variables with a
start_period | The start period (used to filter by time). This is inclusive. The supported formats are:
end_period | The end period (used to filter on time). This is inclusive. The supported formats are the same as for
version | A version number, if unspecified the latest version of the dataset is used. Use
url | A complete query url
Details
Note that the API enforces a fairly strict gateway timeout policy. This
means that if you try to access a large dataset, you will need to filter
it on the server side using the datakey. You might like to review the data
manually via the ABS website to work out which subset of the data you
require.
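As a rough illustration of what a server-side query looks like, the sketch below assembles a query URL by hand. The URL pattern is inferred from the example URL used with read_api_url() in the Examples. The helper name build_query_url() is hypothetical, and the startPeriod and endPeriod query parameters are assumed to follow the usual SDMX REST convention; this is not a description of how readabs builds its requests internally.

```r
# Hypothetical helper, not part of readabs.
# `key` is passed through untouched; "all" requests the unfiltered dataflow.
build_query_url <- function(id, key = "all",
                            start_period = NULL, end_period = NULL) {
  base <- "https://api.data.abs.gov.au/data"
  url <- sprintf("%s/ABS,%s/%s", base, id, key)

  # c() silently drops NULL entries, so only supplied periods are kept.
  # Parameter names are an assumption (SDMX REST convention).
  params <- c(startPeriod = start_period, endPeriod = end_period)
  if (length(params) > 0) {
    query <- paste(names(params), params, sep = "=", collapse = "&")
    url <- paste0(url, "?", query)
  }
  url
}

build_query_url("WPI")
build_query_url("ERP_COB", start_period = 2020)
```

A URL built this way could then be passed to read_api_url(), assuming the service accepts it.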
Note also that the datastructure contains a complete codebook for the
variables appearing in the relevant dataflow. Because some variables are
shared across multiple dataflows, the datastructure corresponding to a
particular id may contain values for a given variable that do not appear
in the corresponding dataflow.
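One way to spot such values is to compare the codebook against the data you have actually downloaded. The sketch below uses small made-up data frames in place of real API results. The var and label columns mirror those used in the Examples, while the code column name, the "AUS" code, and the helper unused_codes() are assumptions made for illustration.

```r
# Toy stand-in for a datastructure (codebook). Column `code` is assumed.
ds <- data.frame(
  var   = c("regiontype", "regiontype", "sex_abs"),
  code  = c("AUS", "DZN", "3"),
  label = c("Australia", "Destination Zones", "Persons"),
  stringsAsFactors = FALSE
)

# Toy stand-in for data returned for the same dataflow.
x <- data.frame(
  regiontype = c("AUS", "AUS"),
  sex_abs    = c("3", "3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: codes listed in the codebook for `variable`
# that never occur in the downloaded data.
unused_codes <- function(ds, x, variable) {
  setdiff(ds$code[ds$var == variable], unique(as.character(x[[variable]])))
}

unused_codes(ds, x, "regiontype")  # "DZN"
```

Filtering on a code reported by such a check is likely to fail, since the value exists only in the codebook.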
Value
A data.frame
Examples
## Not run:
# List available dataflows
read_api_dataflows()
# Say we want the "Estimated resident population, Country of birth"
# dataflow, with the id ERP_COB. We load the data from 2020 onwards
# by providing the id and a start period:
read_api("ERP_COB", start_period = 2020)
# In some cases, loading a whole dataflow (as above) won't work.
# For example, the `ABS_C16_T10_SA` dataflow is very large,
# so the gateway will time out if we try to collect the full data set
try(read_api("ABS_C16_T10_SA"))
# We need to filter the dataflow before downloading it.
# To figure out how to filter it, we get metadata ('datastructure').
ds <- read_api_datastructure("ABS_C16_T10_SA")
# The `asgs_2016` code for 'Australia' is 0
ds[ds$var == "asgs_2016" & ds$label == "Australia", ]
# The `sex_abs` code for 'Persons' (i.e. all persons) is 3
ds[ds$var == "sex_abs" & ds$label == "Persons", ]
# So we have:
x <- read_api("ABS_C16_T10_SA", datakey = list(asgs_2016 = 0, sex_abs = 3))
unique(x["asgs_2016"]) # Confirming only 'Australia' level records came through
unique(x["sex_abs"]) # Confirming only 'Persons' level records came through
# Please note, however, that not all values in the datastructure necessarily
# appear in the data. You get 404s in this case.
ds[ds$var == "regiontype" & ds$label == "Destination Zones", ]
try(read_api("ABS_C16_T10_SA", datakey = list(regiontype = "DZN")))
# If you already have a query url, then use `read_api_url()`
wpi_url <- "https://api.data.abs.gov.au/data/ABS,WPI/all"
read_api_url(wpi_url)
## End(Not run)
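The examples above wrap failing calls in try(). If you would rather get NULL back and carry on, a small wrapper around tryCatch() is one option. The sketch below is a hypothetical helper, not part of readabs, and it assumes that a timed-out or 404 request surfaces as an ordinary R error. Stand-in functions are used so that it runs without network access.

```r
# Hypothetical wrapper, not part of readabs: run a fetching function and
# return NULL (with a message) instead of stopping if it signals an error.
fetch_or_null <- function(fetch) {
  tryCatch(
    fetch(),
    error = function(e) {
      message("Request failed: ", conditionMessage(e))
      NULL
    }
  )
}

# Stand-ins for a request that fails and one that succeeds.
failing <- function() stop("gateway timeout")
working <- function() data.frame(value = 1:3)

is.null(fetch_or_null(failing))  # TRUE
nrow(fetch_or_null(working))     # 3
```

With readabs loaded, the same wrapper could be called as fetch_or_null(function() read_api("ABS_C16_T10_SA")).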