create_bibliography {Diderot} | R Documentation |
Function to create a bibliographic dataset
Description
This function creates a bibliographic dataset based on two external corpus files, each representing the bibliography of a given domain.
Usage
create_bibliography(corpora_files, labels, keywords, retrieve_pubdates = F,
                    clean_refs = F, encoding = NULL)
Arguments
corpora_files |
Vector containing the paths to two corpus files (e.g. Scopus exports). Each CSV file should contain, for each record, at least Authors (comma-separated), Publication Title, Publication Year, and References (semicolon-separated). Including DOI (for date checking; see the retrieve_pubdates option) as well as Abstract, Author.Keywords, and Index.Keywords (for the in-depth identification of publications belonging to both corpora) is strongly recommended. |
labels |
Labels (i.e. names) given to the two corpora to be analyzed. |
keywords |
Keywords identifying the two corpora. |
retrieve_pubdates |
Flag indicating whether to confirm publication dates by retrieving them (see |
clean_refs |
Attempt to clean references and keep titles only. NOT RECOMMENDED, especially if |
encoding |
Character encoding used in the input files. |
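Before calling the function, it can help to confirm that each corpus file carries the minimum set of columns. The following is a minimal sketch, not part of Diderot: `check_corpus_columns` is a hypothetical helper, and the column names (`Authors`, `Title`, `Year`, `References`) are assumed from the Scopus export shown in the Examples section. The UTF-8 default is likewise an assumption.

```r
# Hypothetical helper (not part of Diderot): report which expected columns
# are missing from a corpus CSV before it is passed to create_bibliography().
# The default column names are assumed from the Scopus export in the Examples.
check_corpus_columns <- function(path,
                                 required = c("Authors", "Title", "Year",
                                              "References"),
                                 encoding = "UTF-8") {
  # Reading a single row is enough to obtain the column names
  header <- read.csv(path, nrows = 1, stringsAsFactors = FALSE,
                     fileEncoding = encoding)
  setdiff(required, names(header))
}

# Demonstration on a throwaway file that lacks the References column
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(Authors = "Doe J., Roe R.",
                     Title = "A placeholder title",
                     Year = 2010L),
          tmp, row.names = FALSE)
missing_cols <- check_corpus_columns(tmp)
print(missing_cols)
unlink(tmp)
```

An empty result means all required columns are present; here the helper reports `References` as missing.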
Value
Returns a data frame containing a bibliographic dataset usable by Diderot and including all references from both corpora.
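As the Examples section shows, the returned data frame includes a `Corpus` field whose values are either a single label or a joint label such as "IBM | ABM". A minimal sketch of how that field might be summarized follows; the `db` object here is a small stand-in with placeholder values, not actual output of create_bibliography.

```r
# Stand-in for the data frame returned by create_bibliography(); only the
# columns needed for this illustration are included, with placeholder values.
db <- data.frame(
  Year   = c(2010L, 2010L, 2011L, 2012L),
  Corpus = c("IBM", "IBM | ABM", "ABM", "IBM | ABM"),
  stringsAsFactors = FALSE
)

# Number of records per corpus label, including joint publications
corpus_counts <- table(db$Corpus)
print(corpus_counts)

# Records attributed to both corpora
joint <- db[db$Corpus == "IBM | ABM", ]
print(nrow(joint))
```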
Author(s)
Christian Vincenot (christian@vincenot.biz)
See Also
Examples
## Not run:
# Two corpora on individual-based modelling (IBM) and agent-based modelling (ABM)
# were downloaded from Scopus. The structure of each corpus is as follows:
tt <- read.csv("IBMmerged.csv", stringsAsFactors = FALSE)
str(tt, strict.width = "cut")
### 'data.frame': 3184 obs. of 9 variables:
###  $ Authors        : chr "Chen J., Marathe A., Marathe M." "Van Dijk D., Sl"..
###  $ Title          : chr "Coevolution of epidemics, social networks, and in"..
###  $ Year           : int 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ...
###  $ DOI            : chr "10.1007/978-3-642-12079-4_28" "10.1016/j.procs.20"..
###  $ Link           : chr "http://www.scopus.com/inward/record.url?eid=2-s2."..
###  $ Abstract       : chr "This research shows how a limited supply of antiv"..
###  $ Author.Keywords: chr "Antiviral; Behavioral economics; Epidemic; Microe"..
###  $ Index.Keywords : chr "Antiviral; Behavioral economics; Epidemic; Microe"..
###  $ References     : chr "(2009) Centre Approves Restricted Retail Sale of "..
# Define the names of the corpora (labels) and the keywords used to identify
# relevant publications (keys).
labels <- c("IBM", "ABM")
keys <- c("individual-based model|individual based model",
          "agent-based model|agent based model")
# Build the IBM-ABM bibliographic dataset from Scopus exports
db <- create_bibliography(corpora_files = c("IBMmerged.csv", "ABMmerged.csv"),
                          labels = labels, keywords = keys)
### [1] "File IBMmerged.csv contains 3184 records"
### [1] "File ABMmerged.csv contains 9641 records"
# Processed output. Note the field name changes (for standardization with ISI Web
# of Knowledge format) and addition of the "Corpus" field (with identification of
# joint "IBM | ABM" publications based on keywords).
str(db, strict.width = "cut")
### 'data.frame': 12504 obs. of 10 variables:
###  $ Authors         : chr "Chen J., Marathe A., Marathe M." "Van Dijk D., Sloot "..
###  $ Cite Me As      : chr "Coevolution of epidemics, social networks, and indivi"..
###  $ Year            : int 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ...
###  $ DOI             : chr "10.1007/978-3-642-12079-4_28" "10.1016/j.procs.2010.0"..
###  $ Link            : chr "http://www.scopus.com/inward/record.url?eid=2-s2.0-78"..
###  $ Abstract        : chr "This research shows how a limited supply of antiviral"..
###  $ Author.Keywords : chr "Antiviral; Behavioral economics; Epidemic; Microecono"..
###  $ Index.Keywords  : chr "Antiviral; Behavioral economics; Epidemic; Microecono"..
###  $ Cited References: chr "(2009) Centre Approves Restricted Retail Sale of Tami"..
###  $ Corpus          : chr "IBM" "IBM | ABM" "IBM | ABM" "IBM" ...
## End(Not run)
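The keyword strings in the example above use `|` to separate alternative spellings, which suggests they are treated as regular expressions. The sketch below illustrates that idea with base R's `grepl` on a few made-up titles. It is an assumption about how such patterns behave, not a description of Diderot's internal matching: which fields the package searches and whether it ignores case are not stated here.

```r
# Keyword patterns and labels as in the example above
keys <- c("individual-based model|individual based model",
          "agent-based model|agent based model")
labels <- c("IBM", "ABM")

# Made-up titles used purely for illustration
titles <- c("An individual-based model of fish schooling",
            "Agent based models in economics",
            "Comparing agent-based model and individual based model designs")

# One column per corpus: TRUE where the title matches that corpus's pattern.
# Case-insensitive matching is an assumption made for this sketch.
hits <- sapply(keys, function(k) grepl(k, titles, ignore.case = TRUE))
colnames(hits) <- labels
print(hits)

# Titles matching both patterns would correspond to joint publications
both <- titles[hits[, "IBM"] & hits[, "ABM"]]
print(both)
```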