| taxonomizrSwitch {taxonomizr} | R Documentation |
Switch from data.table to SQLite
Description
In version 0.5.0, taxonomizr switched from data.table to SQLite for name and node lookups. See below for more details.
Details
Version 0.5.0 marked a change for name and node lookups from using data.table to using SQLite. This was necessary to increase performance (10-100x speedup for getTaxonomy) and create a simpler interface (a single SQLite database contains all necessary data). Unfortunately, this switch requires a couple of breaking changes:
- getTaxonomy changes from getTaxonomy(ids,namesDT,nodesDT) to getTaxonomy(ids,sqlFile)
- getId changes from getId(taxa,namesDT) to getId(taxa,sqlFile)
- read.names is deprecated, instead use read.names.sql. For example, instead of calling names<-read.names('names.dmp') in every session, simply call read.names.sql('names.dmp','accessionTaxa.sql') once (or use the convenient prepareDatabase).
- read.nodes is deprecated, instead use read.nodes.sql. For example, instead of calling nodes<-read.nodes('nodes.dmp') in every session, simply call read.nodes.sql('nodes.dmp','accessionTaxa.sql') once (or use the convenient prepareDatabase).
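The changes listed above can be sketched with a small self-contained session. This is only an illustration: the three-taxon names and nodes dumps are made-up, simplified stand-ins for the real NCBI names.dmp and nodes.dmp files (for instance, Homo is attached directly to the root), the variable names are hypothetical, and the desiredTaxa argument is assumed to be available in the installed version of taxonomizr.

```r
library(taxonomizr)

# Toy stand-in for NCBI names.dmp (assumption: simplified, not real download)
namesText <- c(
  "1\t|\troot\t|\t\t|\tscientific name\t|",
  "9605\t|\tHomo\t|\t\t|\tscientific name\t|",
  "9606\t|\tHomo sapiens\t|\t\t|\tscientific name\t|",
  "9606\t|\thuman\t|\t\t|\tgenbank common name\t|"
)
# Toy stand-in for NCBI nodes.dmp: id, parent id, rank, then unused columns
nodesText <- c(
  "1\t|\t1\t|\tno rank\t|\t\t|\t8\t|\t0\t|\t1\t|\t0\t|\t0\t|\t0\t|\t0\t|\t0\t|\t\t|",
  "9605\t|\t1\t|\tgenus\t|\t\t|\t5\t|\t1\t|\t1\t|\t1\t|\t2\t|\t1\t|\t0\t|\t0\t|\t\t|",
  "9606\t|\t9605\t|\tspecies\t|\tHS\t|\t5\t|\t1\t|\t1\t|\t1\t|\t2\t|\t1\t|\t1\t|\t0\t|\t\t|"
)
namesFile <- tempfile()
nodesFile <- tempfile()
writeLines(namesText, namesFile)
writeLines(nodesText, nodesFile)

# One-time setup: load both dumps into a single SQLite file
# (replaces calling read.names/read.nodes in every session)
sqlFile <- tempfile()
read.names.sql(namesFile, sqlFile)
read.nodes.sql(nodesFile, sqlFile)

# New-style lookups take the SQLite file instead of data.tables
id <- getId("Homo sapiens", sqlFile)
tax <- getTaxonomy(9606, sqlFile, desiredTaxa = c("genus", "species"))
print(id)
print(tax)
```

In later sessions only the getId and getTaxonomy calls need to be repeated, pointing at the same SQLite file.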
I've tried to ease any problems with this by overloading getTaxonomy and getId to still function (with a warning) if passed data.table names and nodes arguments, and by providing a simpler prepareDatabase function that completes all setup steps (hopefully avoiding direct calls to read.names and read.nodes for most users).
I plan to eventually remove data.table functionality to avoid a split codebase, so please switch to the new SQLite format in all new code.
See Also
getTaxonomy, read.names.sql, read.nodes.sql, prepareDatabase, getId