clean_CRAN_db {cranly} | R Documentation |
Clean and organize package and author names in the output of tools::CRAN_package_db()
Description
Clean and organize package and author names in the output of tools::CRAN_package_db()
Usage
clean_CRAN_db(
packages_db,
clean_directives = clean_up_directives,
clean_author = clean_up_author,
clean_maintainer = standardize_whitespace
)
Arguments
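packages_db | a data.frame with the structure of the output of tools::CRAN_package_db()
clean_directives | a function that transforms the contents of the various directives in the package descriptions to vectors of package names. Default is clean_up_directives()
clean_author | a function that transforms the contents of Author to vectors of author names. Default is clean_up_author()
clean_maintainer | a function that transforms the contents of Maintainer. Default is standardize_whitespace()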
Details
clean_CRAN_db() uses clean_up_directives() and clean_up_author() to
clean up the author names and the package names in the various
directives (such as Imports, Depends, Suggests, Enhances, LinkingTo)
of the data.frame that results from tools::CRAN_package_db(), and
returns an organized data.frame of class cranly_db that can be used
for further analysis.
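As a rough illustration of what turning a directive into a vector of package names involves, the sketch below splits comma-separated directive strings and drops version requirements. parse_directive() is a hypothetical helper written for this example, not cranly's clean_up_directives(), and its choice to drop the R pseudo-dependency is an assumption of the sketch.

```r
# Hypothetical helper, NOT part of cranly: split one directive string
# (e.g. the contents of Imports) into a vector of bare package names.
parse_directive <- function(x) {
  # Missing or empty directives yield an empty character vector
  if (is.na(x) || !nzchar(trimws(x))) return(character(0))
  parts <- strsplit(x, ",", fixed = TRUE)[[1]]
  # Drop version requirements such as "(>= 3.5.0)"
  parts <- gsub("\\([^)]*\\)", "", parts)
  parts <- trimws(parts)
  # Drop empty entries and the "R" pseudo-dependency (assumption of this sketch)
  parts[nzchar(parts) & parts != "R"]
}

# Made-up directive contents, for illustration only
imports <- c("stats, utils (>= 3.0),\n Rcpp", NA, "R (>= 3.5.0), methods")
lapply(imports, parse_directive)
```

Each element of the result is a character vector, so packages with no entries in a directive are represented by character(0) rather than NA.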
The function tries hard to identify and eliminate mistakes in the
Author field of the description file, and to extract a clean list of
only author names. The relevant operations are coded in the
clean_up_author() function. Specifically, some references to
copyright holders had to go because they were contaminating the
list of authors (most are not necessary anyway, but that is a
different story...). The current version of clean_up_author() is
far from best practice in using regex, but it currently does a fair
job of cleaning up messy Author fields. It will be improved in
future versions.
Custom clean-up functions can also be supplied via the
clean_directives, clean_author and clean_maintainer arguments.
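A minimal sketch of a custom author cleaner follows. my_clean_author() is hypothetical and much cruder than clean_up_author(). It assumes that a cleaner receives the raw Author column as a character vector and returns a list with one character vector of names per package; that interface is an assumption of this sketch and should be checked against clean_up_author() before use.

```r
# Hypothetical custom cleaner, NOT cranly's clean_up_author().
# Assumption: input is the raw Author column (character vector);
# output is a list with one character vector of names per package.
my_clean_author <- function(author) {
  lapply(author, function(x) {
    if (is.na(x)) return(character(0))
    # Remove role annotations such as "[aut, cre]"
    x <- gsub("\\[[^]]*\\]", "", x)
    # Remove e-mail addresses and URLs given in <...>
    x <- gsub("<[^>]*>", "", x)
    # Remove parenthesised remarks (including parentheses left empty above)
    x <- gsub("\\([^)]*\\)", "", x)
    # Split on commas, semicolons, ampersands and the word "and"
    parts <- strsplit(x, "\\s*(,|;|&|\\band\\b)\\s*", perl = TRUE)[[1]]
    # Collapse internal whitespace and drop empty pieces
    parts <- trimws(gsub("\\s+", " ", parts))
    parts[nzchar(parts)]
  })
}

# Made-up Author fields, for illustration only
raw <- c(
  "Jane Doe [aut, cre] (<https://orcid.org/0000-0000-0000-0000>),\n  John Smith [ctb]",
  "A. Author and B. Writer <b@example.org>"
)
my_clean_author(raw)

# With a downloaded database, the cleaner would be supplied as
# clean_CRAN_db(cran_db, clean_author = my_clean_author)
```

Splitting on the word "and" is a deliberate simplification; it would break up organization names that contain it.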
Value
A data.frame with the same variables as packages_db (but with
lower case names), that also inherits from class cranly_db, and has a
timestamp attribute.
Examples
library(cranly)
## Download today's CRAN package database (requires internet access)
cran_db <- tools::CRAN_package_db()
## Before clean up
cran_db[cran_db$Package == "weights", "Author"]
## After clean up
package_db <- clean_CRAN_db(cran_db)
package_db[package_db$package == "weights", "author"]