netCoin-package {netCoin} | R Documentation |
The netCoin package.
Description
Creates interactive network graphs of coincidences. The package combines the data analysis power of R, used to study coincidences, with the visualization libraries of JavaScript.
Details
Coincidence analysis detects which events, characters, objects, attributes, or characteristics tend to occur together within certain limits.
These limits are called scenarios (S) and are the units of analysis; as such, they must be placed in the rows of a matrix or data.frame.
In each scenario S_i, each of a series of J events X_j, represented as dichotomous variables in columns, may occur (1) or may not occur (0). Scenarios and events together constitute an incidence matrix (I).
Incidence matrix
    | X_1 | X_2 | X_3 | ... | X_J |
S_1 | 0 | 1 | 0 | ... | 1 |
S_2 | 1 | 0 | 1 | ... | 0 |
... | ... | ... | ... | ... | ... |
S_n | 1 | 1 | 0 | ... | 1 |
From this incidence matrix, a coincidence matrix (C) can be obtained with the function coin. In this matrix, the main diagonal holds the frequency of each X_j, while the other elements are the number of coincidences between two events.
Coincidence matrix
    | X_1 | X_2 | X_3 | ... | X_J |
X_1 | 2 | 1 | 1 | ... | 1 |
X_2 | 1 | 2 | 0 | ... | 2 |
X_3 | 1 | 0 | 1 | ... | 0 |
... | ... | ... | ... | ... | ... |
X_J | 1 | 2 | 0 | ... | 2 |
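As a minimal sketch in base R (not a call to the package's coin function), the coincidence counts can be reproduced as the cross-product of the incidence matrix with itself. It assumes only the three visible scenarios and J = 4 events from the tables above; the names inc and C are illustrative.

```r
# Incidence matrix: 3 scenarios (rows) by 4 events (columns),
# taken from the visible rows of the example table, assuming J = 4.
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

# t(inc) %*% inc: the diagonal holds event frequencies,
# the off-diagonal cells hold the number of joint occurrences.
C <- crossprod(inc)
print(C)
```

With these inputs the diagonal is 2, 2, 1, 2 and, for example, X2 and X4 coincide twice, in agreement with the coincidence matrix shown above.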
Once there is a coin object, a similarity matrix can be obtained. The similarity measures available in netCoin are:
Matching (m), Rogers & Tanimoto (t), Gower (g), Sneath (s) and Anderberg (and).
Jaccard (j), Dice (d), antiDice (a), Ochiai (o) and Kulczynski (k).
Hamann (ham), Yule (y), Pearson (p), odds ratio (od) and Russell (r).
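To illustrate how such a measure follows from the coincidence counts, the sketch below computes the Jaccard coefficient in base R using its standard definition, c_ij / (f_i + f_j - c_ij). It does not call the package; the toy data assume three scenarios and four events, and the names inc, C, f and jaccard are illustrative.

```r
# Toy incidence matrix (assumed: 3 scenarios, 4 events).
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

C <- crossprod(inc)   # coincidence counts
f <- diag(C)          # event frequencies

# Jaccard: joint occurrences divided by scenarios where either event occurs.
jaccard <- C / (outer(f, f, "+") - C)
print(round(jaccard, 2))
```

Here X2 and X4 always occur together (coefficient 1), while X2 and X3 never do (coefficient 0).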
Other measures that can be obtained from coin are:
Relative frequencies (x), conditional frequencies (i), coincidence degree (cc) and probable degree of coincidence (cp).
Haberman (h) and Z value of Haberman (z).
To obtain matrices of similarity and other measures, the function sim returns a list of them.
Similarity matrix
    | X_1 | X_2 | X_3 | ... | X_J |
X_1 | 1.73 | -.87 | .87 | ... | -.87 |
X_2 | -.87 | 1.73 | -1.73 | ... | 1.73 |
X_3 | .87 | -1.73 | 1.73 | ... | -1.73 |
... | ... | ... | ... | ... | ... |
X_J | -.87 | 1.73 | -1.73 | ... | 1.73 |
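The values in this example coincide with Haberman adjusted residuals, (observed - expected) / sqrt(expected * (1 - f_i/n) * (1 - f_j/n)). The base R sketch below reproduces them for a toy incidence matrix with three scenarios and four events (an assumption about the elided rows and columns); it is not the package's own implementation, and all object names are illustrative.

```r
# Toy incidence matrix (assumed: 3 scenarios, 4 events).
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

n <- nrow(inc)        # number of scenarios
C <- crossprod(inc)   # observed coincidences
f <- diag(C)          # event frequencies

# Expected coincidences under independence, then adjusted residuals.
expected <- outer(f, f) / n
haberman <- (C - expected) /
  sqrt(expected * outer(1 - f / n, 1 - f / n))
print(round(haberman, 2))
```

Positive residuals mark pairs of events that coincide more often than independence would predict; negative residuals mark pairs that coincide less often.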
edgeList makes a collection of edges, each carrying a list of similarity measures, whenever a criterion (generally p(Z)<.50) is met.
Edge list
  | source | target | Haberman | P(z) |
1 | X1 | X3 | 0.8660254 | 0.22509243 |
2 | X2 | X4 | 1.7320508 | 0.09084506 |
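The following base R sketch shows one way such an edge list can be assembled by hand from a toy incidence matrix (three scenarios, four events, assumed). The P(z) column is computed here as an upper-tail Student t probability with n degrees of freedom; that choice is an assumption inferred from the printed values, which it reproduces, and is not a documented description of edgeList. All object names are illustrative.

```r
# Toy incidence matrix (assumed: 3 scenarios, 4 events).
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))
n <- nrow(inc)
C <- crossprod(inc)
f <- diag(C)
expected <- outer(f, f) / n
haberman <- (C - expected) /
  sqrt(expected * outer(1 - f / n, 1 - f / n))

# Assumed tail probability: one-sided t with n degrees of freedom.
p <- haberman
p[] <- 1 - pt(haberman, df = n)

# Keep each pair once (upper triangle) when the criterion p < 0.50 holds.
idx <- which(upper.tri(p) & p < 0.50, arr.ind = TRUE)
edges <- data.frame(source = rownames(p)[idx[, "row"]],
                    target = colnames(p)[idx[, "col"]],
                    Haberman = haberman[idx],
                    "P(z)" = p[idx],
                    check.names = FALSE)
print(edges)
```

Only the pairs X1-X3 and X2-X4 pass the filter, since every other pair has a negative residual and therefore a tail probability above 0.50.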
In order to make a graph, two data frames are needed: a nodes data frame with names and other node attributes (see asNodes) and an edge data frame (see edgeList). For more information, see netCoin.
Author
Modesto Escobar, Department of Sociology and Communication, University of Salamanca. See https://sociocav.usal.es/blog/modesto-escobar/
References
Escobar, M. (2009): "Redes Semanticas en Textos Periodisticos: Propuestas Tecnicas para su Representacion", in Empiria, 17, 13-39.
Escobar, M. (2015): "Studying Coincidences with Network Analysis and Other Multivariate Tools", in The Stata Journal, 15(4), 1118-1156.
Escobar, M. and J. Gomez Isla (2015): "The Expression of Identity through the Image: The Photographic Archives of Miguel de Unamuno and Joaquin Turina", in Revista Espanola de Investigaciones Sociologicas, 152, 23-46.