CRAN Task View: Databases with R

Maintainer: Yuan Tang, James Joseph Balamuta
Contact: terrytangyuan at gmail.com
Version: 2023-02-23
URL: https://CRAN.R-project.org/view=Databases
Source: https://github.com/cran-task-views/Databases/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Yuan Tang, James Joseph Balamuta (2023). CRAN Task View: Databases with R. Version 2023-02-23. URL https://CRAN.R-project.org/view=Databases.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("Databases", coreOnly = TRUE) installs all the core packages, while ctv::update.views("Databases") installs all packages that are not yet installed or not up-to-date. See the CRAN Task View Initiative for more details.

This CRAN task view lists packages for accessing different databases from R. It does not cover data import/export or data management. The task views on HighPerformanceComputing and MachineLearning may also provide useful information.

As datasets grow larger, it becomes impractical to store them in traditional file formats such as spreadsheets and raw text files, which may not fit on devices with limited storage and cannot be easily shared among collaborators. Instead, people increasingly store data in databases for more scalable and reliable data management.

Database systems are often classified by the database models they support. Relational databases became dominant in the 1980s. Data in a relational database is modeled as rows and columns in a series of tables, and SQL is used to express the logic for writing and querying the data. The tables are related to one another: for example, a table of users can be linked to a table of the software they use, which in turn can be linked to tables of that software's creators and contributors. Non-relational databases, often called NoSQL databases, have become popular in recent years because of the high demand for storing unstructured data. Users generally don’t need to define the data schema up front, so when application requirements change, non-relational databases can be much easier to use and manage.
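The relational model described above can be sketched in R as follows. This is a minimal illustration only, assuming the DBI and RSQLite packages (both listed in this task view) are installed; the table names, column names, and sample rows are hypothetical and chosen to mirror the user/software example.

```r
# Assumes DBI and RSQLite are installed; all table and column names are hypothetical.
library(DBI)

# Open a throwaway in-memory SQLite database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Two related tables: users, and the software each user uses.
users <- data.frame(user_id = c(1L, 2L), name = c("Ada", "Grace"))
software <- data.frame(
  software_id = c(10L, 11L, 12L),
  title = c("editor", "compiler", "plotter"),
  user_id = c(1L, 2L, 2L)
)
dbWriteTable(con, "users", users)
dbWriteTable(con, "software", software)

# SQL expresses the relation through the shared user_id column:
# count how many software titles each user uses.
usage <- dbGetQuery(con, "
  SELECT u.name, COUNT(*) AS n_software
  FROM users AS u
  JOIN software AS s ON s.user_id = u.user_id
  GROUP BY u.name
  ORDER BY u.name
")
print(usage)

# Always release the connection when done.
dbDisconnect(con)
```

An in-memory SQLite database is used here only because it needs no server; with a different DBI backend, only the dbConnect() call would be expected to change.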

The field covered by this task view is changing rapidly in both industry and academia. Please send any suggestions to the maintainer via e-mail or submit an issue or pull request in the GitHub repository linked above. All suggestions and corrections by others are gratefully acknowledged.

Relational databases

This section includes packages that provide access to relational databases within R.

Non-relational databases

This section includes packages that provide access to non-relational databases within R.

Database tools

This section includes packages that provide tools for working with and testing databases, manipulating database tables, etc.

CRAN packages

Core: DBI, odbc, RODBC.
Regular: bigrquery, DatabaseConnector, DBItest, dbplyr, dbx, dittodb, dplyr, duckdb, elastic, filehashSQLite, Hmisc, implyr, influxdbr, liteq, mongolite, MSSQL, octopus, paws, paws.database, pointblank, pool, R4CouchDB, R6, RcppRedis, redux, RGreenplum, RH2, RJDBC, RMariaDB, RMySQL, rocker, ROracle, rpostgis, RPostgres, RPostgreSQL, RPresto, RSQLite, sparklyr, sqldf, SQRL, tfio, uptasticsearch.

Related links

Other resources