| synthesisr {synthesisr} | R Documentation |
synthesisr: Import, assemble, and deduplicate bibliographic datasets
Description
Systematic review searches typically span multiple databases, which export results in a variety of formats and overlap in coverage. To streamline the process of importing, assembling, and deduplicating results, synthesisr recognizes bibliographic files exported from databases commonly used for systematic reviews and merges results into a standardized format.
Import & Export
The key task performed by synthesisr is flexible import and presentation of bibliographic data. This is typically achieved with read_refs, which can import multiple files at once and combine them into a single data.frame. Conversely, export is via write_refs. Users who require more detailed control can use the following functions:
- detect_: Detect file attributes
- parse_: Parse a vector containing bibliographic data
- clean_: Cleaning functions for author and column names
- code_lookup: A dataset of potential RIS tags
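As an illustration of the import step described above, the following is a minimal sketch rather than a definitive recipe. It assumes that read_refs accepts a character vector of file paths as its first argument; the two RIS records are made up for the example, and argument names and returned column names may differ between package versions.

```r
library(synthesisr)

# Two made-up RIS records, standing in for exports from two databases
ris_a <- c(
  "TY  - JOUR",
  "AU  - Smith, Jane",
  "TI  - An example article title",
  "PY  - 2019",
  "ER  - "
)
ris_b <- c(
  "TY  - JOUR",
  "AU  - Jones, Alex",
  "TI  - A second example article title",
  "PY  - 2020",
  "ER  - "
)

# Write each record to its own temporary file
file_a <- tempfile(fileext = ".ris")
file_b <- tempfile(fileext = ".ris")
writeLines(ris_a, file_a)
writeLines(ris_b, file_b)

# Import both files in one call; the result should be a single data.frame
refs <- read_refs(c(file_a, file_b))
str(refs)
```

The combined data.frame can then be passed on to the formatting and deduplication functions listed below.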
Data formatting
- bibliography-class: Methods for class 'bibliography'
- merge_columns: rbind two data.frames with different numbers of columns
- format_citation: Return a clean citation from a bibliography or data.frame
- add_line_breaks: Set a maximum character width for strings
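To show what combining data.frames with different columns looks like in practice, here is a short sketch using merge_columns. It assumes the function takes two data.frames as its first two arguments and fills columns missing from either input with NA; the records are invented for illustration.

```r
library(synthesisr)

# Two made-up result sets that share only the 'title' column
df_1 <- data.frame(
  title = c("An example article title", "A second example article title"),
  year = c("2019", "2020"),
  stringsAsFactors = FALSE
)
df_2 <- data.frame(
  title = c("A third example article title", "A fourth example article title"),
  authors = c("Smith, Jane", "Jones, Alex"),
  stringsAsFactors = FALSE
)

# Row-bind the two, keeping the union of their columns
combined <- merge_columns(df_1, df_2)
print(combined)
```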
Deduplication
When importing from multiple databases, it is likely that there will be duplicates in the resulting dataset. The easiest way to deal with this problem in synthesisr is to use the deduplicate command, but this can be risky, particularly if there are no DOIs in the dataset. For finer control of the deduplication process, consider using the sub-functions:
- find_duplicates: Locate potentially duplicated references
- extract_unique_references: Return a data.frame with only 'unique' references
- review_duplicates: Manually review potential duplicates
- override_duplicates: Manually override identified duplicates
- fuzz_: Fuzzy string matching c/o 'fuzzywuzzy'
- string_: Fuzzy string matching c/o 'stringdist'
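The two-step workflow described above might look like the following sketch. It assumes that find_duplicates accepts a character vector plus method, to_lower and rm_punctuation arguments, and that extract_unique_references takes the data and a matches argument; the records are made up, and the exact argument names and default thresholds should be checked against the installed version.

```r
library(synthesisr)

# Made-up records: rows 1 and 3 differ only in case and punctuation
refs <- data.frame(
  title = c(
    "An example article title",
    "A completely different study of something else",
    "an example article title."
  ),
  year = c("2019", "2020", NA),
  stringsAsFactors = FALSE
)

# Step 1: flag potential duplicates by comparing titles
dups <- find_duplicates(
  refs$title,
  method = "string_osa",   # string distance via 'stringdist'
  to_lower = TRUE,         # ignore case when comparing
  rm_punctuation = TRUE    # ignore punctuation when comparing
)

# Step 2: keep one record per group of matches
unique_refs <- extract_unique_references(refs, matches = dups)
print(unique_refs)
```

Inspecting dups before step 2, for example with review_duplicates, gives a chance to catch false positives, which is the main reason to prefer this route over a single call to deduplicate when DOIs are missing.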