biblio_coupling {biblionetwork}    R Documentation

Calculating the Coupling Angle Measure for Edges

Description

This function calculates the number of references that pairs of articles share, as well as the coupling angle value of edges in a bibliographic coupling network (Sen and Gan 1983), from a direct citation data frame. This is a standard way to build a bibliographic coupling network using Salton's cosine measure: it divides the number of references that two articles share by the square root of the product of the two articles' bibliography lengths. This avoids giving too much importance to articles with a large bibliography.

Usage

biblio_coupling(
  dt,
  source,
  ref,
  normalized_weight_only = TRUE,
  weight_threshold = 1,
  output_in_character = TRUE
)

Arguments

dt

For bibliographic coupling (or co-citation), the data frame with citing and cited documents. It can also be used

  1. for title co-occurrence networks, with source being the articles, and ref being the list of words in the articles' titles;

  2. for co-authorship networks, with source being the authors, and ref the list of articles.

source

The name of the column containing the source identifiers, that is, the citing documents. In a coupling network, these documents are the nodes of the network.

ref

The name of the column containing the identifiers of the cited references.

normalized_weight_only

If TRUE (the default), the function returns only the weights normalized by the cosine measure. If set to FALSE, it also returns the number of shared references.

weight_threshold

Threshold applied to the non-normalized weights of edges, that is, the number of shared references. The function keeps only the edges whose non-normalized weight is greater than or equal to weight_threshold. In other words, if you set the parameter to 2, the function keeps only the edges between nodes that share at least two references in their bibliographies. In a large bibliographic coupling network, you may consider, for instance, that sharing a single reference is not sufficient for two articles to be linked together. Raising this parameter can also help avoid creating intractable networks with too many edges.
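
As a minimal sketch of the filtering described above, the following base R snippet applies a threshold to a hand-made edge list. The column name nb_shared_references and the edge values are hypothetical, and the comparison is assumed to be "greater than or equal to", as the "at least two references" example suggests; this is not the package's internal code.

```r
# Hypothetical edge list: one row per pair of documents,
# with the number of references the pair shares (made-up values).
edges <- data.frame(
  from = c("A", "A", "B"),
  to   = c("B", "C", "C"),
  nb_shared_references = c(2, 1, 3),
  stringsAsFactors = FALSE
)

weight_threshold <- 2

# Keep only the edges sharing at least `weight_threshold` references
# (assumption: the threshold is inclusive).
kept <- edges[edges$nb_shared_references >= weight_threshold, ]
print(kept)
```

With these values, the A-C edge (one shared reference) is dropped and the two other edges are kept.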

output_in_character

If TRUE, the function ends by converting the from and to columns to character, to make the creation of a tidygraph network easier.

Details

This function implements the following weight measure:

\frac{R(A) \bullet R(B)}{\sqrt{L(A) \cdot L(B)}}

with R(A) and R(B) the references of document A and document B, R(A) \bullet R(B) the number of references shared by A and B, and L(A) and L(B) the lengths of the bibliographies of document A and document B.
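
To make the measure concrete, here is a small worked example in base R. It is only an illustration of the formula on made-up identifiers, not the data.table implementation used by the package; the object names (citations, incidence, shared, bib_length, coupling_angle) are chosen for this sketch.

```r
# Toy direct-citation data: one row per (citing document, cited reference) pair.
# All identifiers are made up for illustration.
citations <- data.frame(
  source = c("A", "A", "A", "A", "B", "B", "C"),
  ref    = c("r1", "r2", "r3", "r4", "r1", "r2", "r4"),
  stringsAsFactors = FALSE
)

# Document-by-reference incidence matrix: 1 if the document cites the reference.
incidence <- (unclass(table(citations$source, citations$ref)) > 0) * 1

# Number of shared references for every pair of documents: R(A) . R(B)
shared <- tcrossprod(incidence)

# Bibliography lengths: L(A), L(B), L(C)
bib_length <- rowSums(incidence)

# Coupling angle: shared references divided by sqrt(L(A) * L(B))
coupling_angle <- shared / sqrt(outer(bib_length, bib_length))

print(round(coupling_angle, 3))
```

Here A cites four references and B cites two, both of which A also cites, so the A-B weight is 2 / sqrt(4 * 2), about 0.707. C cites a single reference that A also cites, so the A-C weight is 1 / sqrt(4 * 1) = 0.5, and B and C share nothing.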

This function relies on the data.table package, which keeps the computation fast and makes it possible to compute the coupling angle on very large data frames.

This function is relatively general and can also be used

  1. for co-citation networks (simply by swapping the source and ref columns). To avoid confusion, use the biblio_cocitation() function instead;

  2. for title co-occurrence networks (the coupling angle measure accounts for the length of the titles);

  3. for co-authorship networks (the measure accounts for the number of co-authors an author has collaborated with over a period). For co-authorship, use the coauth_network() function instead.
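
The generality described in the list above comes from the fact that the computation only needs two columns and a choice of which one holds the nodes. The base R sketch below illustrates this with a hypothetical helper, shared_counts(), which is not part of biblionetwork: calling it one way counts shared references (coupling), and swapping the two column names counts shared citing documents (co-citation).

```r
# Hypothetical helper (not part of biblionetwork): counts, for every pair of
# `source` values, how many `ref` values they have in common.
shared_counts <- function(dt, source, ref) {
  pairs <- unique(dt[, c(source, ref)])           # drop duplicated citation rows
  incidence <- unclass(table(pairs[[source]], pairs[[ref]]))
  tcrossprod(incidence)                            # source-by-source counts
}

# Made-up citation data.
citations <- data.frame(
  citing = c("A", "A", "B", "B", "C"),
  cited  = c("r1", "r2", "r1", "r2", "r2"),
  stringsAsFactors = FALSE
)

# Bibliographic coupling: nodes are citing documents.
coupling <- shared_counts(citations, source = "citing", ref = "cited")

# Co-citation: same computation with the two columns swapped,
# so nodes are the cited references.
cocitation <- shared_counts(citations, source = "cited", ref = "citing")

print(coupling)
print(cocitation)
```

In this toy data, A and B share two references, while r1 and r2 are cited together by two documents (A and B).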

Value

A data.table with the article (or author) identifiers in the from and to columns, with one or two additional columns (the coupling angle measure and the number of shared references). It also keeps a copy of from and to in the Source and Target columns. This is useful if you use the tidygraph package afterwards, where the from and to values are modified when creating a graph.

References

Sen SK, Gan SK (1983). “A Mathematical Extension of the Idea of Bibliographic Coupling and Its Applications.” Annals of library science and documentation, 30(2). http://nopr.niscair.res.in/bitstream/123456789/28008/1/ALIS%2030(2)%2078-82.pdf.

Examples

library(biblionetwork)
biblio_coupling(Ref_stagflation,
  source = "Citing_ItemID_Ref",
  ref = "ItemID_Ref",
  weight_threshold = 3
)


[Package biblionetwork version 0.1.0 Index]