synthesisr {synthesisr} | R Documentation |
synthesisr: Import, assemble, and deduplicate bibliographic datasets
Description
Systematic review searches include multiple databases that export results in a variety of formats with overlap in coverage between databases. To streamline the process of importing, assembling, and deduplicating results, synthesisr recognizes bibliographic files exported from databases commonly used for systematic reviews and merges results into a standardized format.
Import & Export
The key task performed by synthesisr is the flexible import and presentation of bibliographic data. This is typically achieved with read_refs, which can import multiple files at once and combine them into a single data.frame. Conversely, export is via write_refs. Users who require more detailed control can use the following functions:
- detect_: Detect file attributes
- parse_: Parse a vector containing bibliographic data
- clean_: Cleaning functions for author and column names
- code_lookup: A dataset of potential ris tags
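As a rough illustration of the import and export workflow described above, the sketch below reads several files with read_refs and converts the result back to RIS text with write_refs. It assumes the installed package ships example search exports in its "extdata" folder (as used in the package vignette); substitute your own file paths if it does not. Argument names follow the package documentation as I understand it and may differ between versions.

```r
library(synthesisr)

# Assumption: the installed package ships example search exports in its
# "extdata" folder; replace with your own file paths otherwise.
bibfiles <- list.files(
  system.file("extdata", package = "synthesisr"),
  full.names = TRUE
)

# Import every file and combine the results into a single data.frame
refs <- read_refs(filename = bibfiles, return_df = TRUE)
dim(refs)

# Export as RIS; with file = FALSE the formatted lines are returned as a
# character vector instead of being written to disk
ris_lines <- write_refs(refs, format = "ris", file = FALSE)
head(ris_lines)
```

Passing a file name to the file argument of write_refs should write the result to disk instead.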
Data formatting
- bibliography-class: Methods for class 'bibliography'
- merge_columns: rbind two data.frames with different numbers of columns
- format_citation: Return a clean citation from a bibliography or data.frame
- add_line_breaks: Set a maximum character width for strings
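A minimal sketch of merge_columns, using two made-up data.frames whose columns only partly overlap. The records and column names are invented for illustration; columns missing from one input are expected to be filled with NA in the combined result.

```r
library(synthesisr)

# Two hypothetical sets of references with different columns
a <- data.frame(
  title = "Fire effects on bird diversity",
  year = "2015",
  stringsAsFactors = FALSE
)
b <- data.frame(
  title = "Grazing and plant cover",
  author = "Smith, J.",
  stringsAsFactors = FALSE
)

# Row-bind the two, keeping the union of their columns
combined <- merge_columns(a, b)
combined
```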
Deduplication
When importing from multiple databases, the resulting dataset is likely to contain duplicates. The easiest way to deal with this problem in synthesisr is to use the deduplicate function, but this can be risky, particularly if there are no DOIs in the dataset. For finer control of the deduplication process, consider using the sub-functions:
- find_duplicates: Locate potentially duplicated references
- extract_unique_references: Return a data.frame with only 'unique' references
- review_duplicates: Manually review potential duplicates
- override_duplicates: Manually override identified duplicates
- fuzz_: Fuzzy string matching c/o 'fuzzywuzzy'
- string_: Fuzzy string matching c/o 'stringdist'
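The step-by-step route can be sketched as follows, using a small made-up data.frame in place of real search results. The titles, the threshold of 5, and the choice of matching on titles alone are assumptions for illustration, not recommendations; the call pattern mirrors the package vignette.

```r
library(synthesisr)

# Hypothetical references: the first two differ only in case and punctuation
refs <- data.frame(
  title = c(
    "Fire effects on bird diversity",
    "Fire Effects on Bird Diversity.",
    "Grazing and plant cover"
  ),
  year = c("2015", "2015", "2018"),
  stringsAsFactors = FALSE
)

# Group titles whose optimal string alignment distance is within the
# threshold, after lower-casing and removing punctuation
matches <- find_duplicates(
  refs$title,
  method = "string_osa",
  to_lower = TRUE,
  rm_punctuation = TRUE,
  threshold = 5
)

# Inspect the proposed groups before accepting them
review_duplicates(refs$title, matches)

# Collapse each group of matches into a single row
unique_refs <- extract_unique_references(refs, matches)
unique_refs
```

If manual review shows that a proposed group is not a true duplicate, override_duplicates can be used to adjust the matches vector before calling extract_unique_references.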