dbConvertTable {RAthena}		R Documentation
Simple wrapper to convert Athena backend file types
Description
Utilises AWS Athena to convert AWS S3 backend file types. It also allows the creation of more efficient file types, i.e. "parquet" and "orc", from SQL queries.
Usage
dbConvertTable(conn, obj, name, ...)
## S4 method for signature 'AthenaConnection'
dbConvertTable(
  conn,
  obj,
  name,
  partition = NULL,
  s3.location = NULL,
  file.type = c("NULL", "csv", "tsv", "parquet", "json", "orc"),
  compress = TRUE,
  data = TRUE,
  ...
)
Arguments
conn	An AthenaConnection object, created by dbConnect()
obj	Athena table or SQL DML query to be converted
name	Name of destination table
...	Extra parameters, currently not used
partition	Partition Athena table
s3.location	Location to store the output file; must be in S3 URI format, for example "s3://mybucket/data/"
file.type	File type for name; currently supports "NULL", "csv", "tsv", "parquet", "json" and "orc". "NULL" uses the default Athena CTAS file type
compress	Whether to compress name; currently only "parquet" and "orc" support compression
data	If name should be created with data or not
Value
dbConvertTable() returns TRUE, invisibly.
Examples
## Not run:
# Note:
# - An AWS account is required to run the example below.
# - Different connection methods can be used; please see `RAthena::dbConnect` documentation
library(DBI)
library(RAthena)
# Demo connection to Athena using profile name
con <- dbConnect(athena())
# write iris table to Athena in default delimited format
dbWriteTable(con, "iris", iris)
# convert delimited table to parquet
dbConvertTable(con,
obj = "iris",
name = "iris_parquet",
file.type = "parquet"
)
# Create partitioned table from non-partitioned
# iris table using SQL DML query
dbConvertTable(con,
obj = SQL("select
iris.*,
date_format(current_date, '%Y%m%d') as time_stamp
from iris"),
name = "iris_orc_partitioned",
file.type = "orc",
partition = "time_stamp"
)
# disconnect from Athena
dbDisconnect(con)
## End(Not run)