Diderot-package {Diderot} | R Documentation |
Bibliographic Network Analysis Package
Description
Denis Diderot (1713-1784), French philosopher and co-founder of the modern encyclopedia. |
This package allows to detect and quantify the unification or separation of two bibliographic corpora through the creation of citation networks. This tool can be used to study the spread of concepts across scientific disciplines, or the fusion/fission of scientific communities.
Details
Package: | Diderot |
Type: | Package |
Version: | 0.13 |
Date: | 2020-04-17 |
License: | GPL (>=2) |
A typical flow of use of the package includes the following points.
First, literature metadata, including references, from the two fields of studies to analyze are downloaded from Scopus (or built manually). This data is imported to create a bibliographic dataset using create_bibliography
.
Second, a graph is created with a call to build_graph
to reproduce the citation network in the bibliographic dataset.
Finally, statistical analysis can be performed on the graph to assess the fusion/fission state of the two corpora/communities. Heterocitation indices (i.e. share and balance) show how much publications or authors cite papers from the other corpus (see heterocitation
and heterocitation_authors
respectively). Such analysis shall always be preceded by a call to precompute_heterocitation
to perform initial calculations. These metrics are completed by traditional as well as custom modularity metrics (see compute_modularity
and compute_custom_modularity
respectively) that translate how much the communities are separated. Publications that foster mutual awareness and cross-fertilization between the corpora/communities can be identified using the usual betweeness centrality metric (see compute_BC_ranking
) and the Ji index (see compute_Ji_ranking
).
Author(s)
Christian Vincenot
Maintainer: Christian Vincenot (christian@vincenot.biz)
See Also
Examples
## Not run:
# Two corpora on individual-based modelling (IBM) and agent-based modelling (ABM)
# were downloaded from Scopus. The structure of each corpus is as follows:
tt<-read.csv("IBMmerged.csv", stringsAsFactors=FALSE)
str(tt,strict.width="cut")
### 'data.frame': 3184 obs. of 9 variables:
### $ Authors : chr "Chen J., Marathe A., Marathe M." "Van Dijk D., Sl"..
### $ Title : chr "Coevolution of epidemics, social networks, and in"..
### $ Year : int 2010 2010 2010 2010 2010 2010 2010 2010 2010 2010 ...
### $ DOI : chr "10.1007/978-3-642-12079-4_28" "10.1016/j.procs.20"..
### $ Link : chr "http://www.scopus.com/inward/record.url?eid=2-s2."..
### $ Abstract : chr "This research shows how a limited supply of antiv"..
### $ Author.Keywords: chr "Antiviral; Behavioral economics; Epidemic; Microe"..
### $ Index.Keywords : chr "Antiviral; Behavioral economics; Epidemic; Microe"..
### $ References : chr "(2009) Centre Approves Restricted Retail Sale of "..
# Define the name of corpora (labels) and specific keywords to identify relevant
# publications (keys).
labels<-c("IBM","ABM")
keys<-c("individual-based model|individual based model",
"agent-based model|agent based model")
# Build the IBM-ABM bibliographical dataset from Scopus exports
db<-create_bibliography(corpora_files=c("IBMmerged.csv","ABMmerged.csv"),
labels=labels, keywords=keys)
### [1] "File IBMmerged.csv contains 3184 records"
### [1] "File ABMmerged.csv contains 9641 records"
# Build and save citation graph
gr<-build_graph(db=db,small.year.mismatch=T,fine.check.nb.authors=2,
attrs=c("Corpus","Year","Authors", "DOI"))
### [1] "Graph built! Execution time: 1200.22 seconds."
save_graph(gr, "graph.graphml")
# Compute and plot modularity
compute_modularity(gr_sx, 1987, 2018)
###[1] 0.3164805
plot_modularity_timeseries(gr_sx, 1987, 2018, window=1000)
# Compute and plot publication heterocitation
gr_sx<-precompute_heterocitation(gr,labels=labels,infLimitYear=1987, supLimitYear=2018)
###[1] "Summary of the nodes considered for computation (1987-2017)"
###[1] "-----------------------------------------------------------"
###[1] "IBM ABM IBM|ABM"
###[1] "1928 5378 153"
###[1]
###[1] "Edges summary"
###[1] "-------------"
###[1] "IBM->IBM/IBM->Other 5583/1086 => Prop 0.163"
###[1] "ABM->ABM/ABM->Other 16946/2665 => Prop 0.136"
###[1] "General Same/Diff 22529/3751 => Prop 0.143"
###[1]
###[1] "Heterocitation metrics"
###[1] "----------------------"
###[1] "Sx ALL / IBM / ABM"
###[1] "0.127 / 0.137 / 0.124"
###[1] "Dx ALL / IBM / ABM"
###[1] "-0.652 / -0.803 / -0.598"
heterocitation(gr_sx, labels=labels, 1987, 2005)
###[1] "Sx ALL / ABM / IBM"
###[1] "0.047 / 0.214 / 0.007"
###[1] "Dx ALL / ABM / IBM"
###[1] "-0.927 / -0.690 / -0.982"
plot_heterocitation_timeseries(gr_sx, labels=labels, mini=-1, maxi=-1, cesure=2005)
# Compute author heterocitation
hetA<-heterocitation_authors(gr_sx, 1987, 2018, pub_threshold=4)
head(hetA[order(hetA$avgDx,decreasing=T),c(1)], n=10)
### [1] "Ashlock D." "Evora J." "Hernandez J.J." "Hernandez M." "Gooch K.J."
### [6] "Reinhardt J.W." "Ng K." "Kazanci C." "Senior A.M." "Ariel G."
# Try to figure which publication are most impactful in terms of cross-fertilization
jir<-compute_Ji_ranking(gr_sx, labels=labels, 1987, 2018)
head(jir[,c(2,7)],n=3)
### Title Ji
### 758 A standard protocol for describing individual-based and agent-based models 200
### 4437 Pattern-oriented modeling of agent-based complex systems: Lessons from ecology 134
### 33 The ODD protocol: A review and first update 120
## End(Not run)