netCoin-package {netCoin}

R Documentation

The netCoin package.

Description

Create interactive networked coincidences. It joins the data analysis power of R to study coincidences and the visualization libraries of JavaScript in one package.

Details

Coincidence analysis detects what events, characters, objects, attributes, or characteristics tend to occur together within certain limits.

These limits are called scenarios (S). They are the units of analysis, and as such they have to be placed in the rows of a matrix or data.frame.

In each scenario i, each of the J events X_j may occur (1) or may not occur (0). The events are represented as dichotomous variables in the columns. Scenarios and events together constitute an incidence matrix (I).

Incidence matrix

	X_1	X_2	X_3	...	X_J
S_1	0	1	0	...	1
S_2	1	0	1	...	0
...	...	...	...	...	...
S_n	1	1	0	...	1
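
As an illustration, the displayed cells of this table can be entered as a data frame. This is only a sketch: it assumes that the three displayed rows are all the scenarios (S_n is taken as S3) and that X_J is the fourth event (named X4, as in the edge list later on this page). The object name `incidences` is illustrative.

```r
# Three illustrative scenarios (rows) and four events (columns),
# using the 0/1 values displayed in the incidence matrix above.
incidences <- data.frame(
  X1 = c(0, 1, 1),
  X2 = c(1, 0, 1),
  X3 = c(0, 1, 0),
  X4 = c(1, 0, 1),
  row.names = c("S1", "S2", "S3")
)

# Every cell must be dichotomous: 1 (occurs) or 0 (does not occur).
stopifnot(all(as.matrix(incidences) %in% c(0, 1)))

# Frequency of each event across the scenarios.
colSums(incidences)
```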

From this incidence matrix, a coincidence matrix (C) can be obtained with the function coin. In this matrix the main diagonal holds the frequencies of each X_j, while the other elements are the numbers of coincidences between two events.

Coincidence matrix

	X_1	X_2	X_3	...	X_J
X_1	2	1	1	...	1
X_2	1	2	0	...	2
X_3	1	0	1	...	0
...	...	...	...	...	...
X_J	1	2	0	...	2
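
The counts in this table can be checked with base R alone, because the coincidence matrix of a 0/1 incidence matrix is its cross-product. The sketch below is not the implementation of coin, which is the documented route in the package. It assumes the three displayed scenarios and four events are the whole example, and the names `inc` and `co` are illustrative.

```r
# Incidence matrix: 3 scenarios (rows) by 4 events (columns).
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

# t(inc) %*% inc: the diagonal holds event frequencies,
# off-diagonal cells hold pairwise coincidence counts.
co <- crossprod(inc)
co
```

For these values the diagonal is 2, 2, 1, 2 and, for instance, X2 and X4 coincide in two scenarios, matching the table.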

Once there is a coin object, a similarity matrix can be obtained. Similarity matrices available in netCoin are:

Matching (m), Rogers & Tanimoto (t), Gower (g), Sneath (s) and Anderberg (and).
Jaccard (j), Dice (d), antiDice (a), Ochiai (o) and Kulczynski (k).
Hamann (ham), Yule (y), Pearson (p), odds ratio (od) and Russell (r).
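
To make two of these measures concrete, the following base R sketch derives the matching and Jaccard coefficients from coincidence counts using their textbook definitions. It is an illustration under the assumption that the example has three scenarios and four events. It does not call the package, and the object names are illustrative.

```r
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

n    <- nrow(inc)       # number of scenarios
co   <- crossprod(inc)  # coincidence counts (both events occur)
freq <- diag(co)        # frequency of each event

# freq_i + freq_j for every pair of events.
freq_sum <- outer(freq, freq, "+")

# Jaccard: coincidences over scenarios where at least one event occurs.
jaccard <- co / (freq_sum - co)

# Matching: scenarios where both occur or both are absent, over all scenarios.
matching <- (n - freq_sum + 2 * co) / n

round(jaccard, 2)
round(matching, 2)
```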

Other measures that can be obtained from coin are:

Relative frequencies (x), conditional frequencies (i), coincidence degree (cc) and probable degree of coincidence (cp).
Haberman (h) and Z value of Haberman (z).

The function sim computes the requested similarity and other measure matrices and returns them as a list.

Similarity matrix

	X_1	X_2	X_3	...	X_J
X_1	1.73	-.87	.87	...	-.87
X_2	-.87	1.73	-1.73	...	1.73
X_3	.87	-1.73	1.73	...	-1.73
...	...	...	...	...	...
X_J	-.87	1.73	-1.73	...	1.73
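
The values in this table agree with Haberman's adjusted residuals computed over three scenarios. The base R sketch below reproduces them from the standard formula (observed minus expected coincidences, divided by the estimated standard error). It is a check on the displayed numbers rather than the code used by sim, and it assumes the three displayed scenarios are the whole example.

```r
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

n  <- nrow(inc)       # number of scenarios
co <- crossprod(inc)  # observed coincidences
p  <- diag(co) / n    # relative frequency of each event

# Coincidences expected if the two events were independent.
expected <- n * outer(p, p)

# Adjusted residual: (observed - expected) / standard error.
haberman <- (co - expected) / sqrt(expected * outer(1 - p, 1 - p))

round(haberman, 2)
```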

edgeList builds a collection of edges, each described by a list of similarity measures, for every pair of events that meets a criterion (generally p(Z) < .50).

Edge list

	source	target	Haberman	P(z)
1	X1	X3	0.8660254	0.22509243
2	X2	X4	1.7320508	0.09084506
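
The two rows above can be reproduced in base R by keeping each pair of events once and filtering on the probability. One loud assumption: the P(z) values shown match an upper-tail Student t probability with as many degrees of freedom as scenarios (3 here). That is inferred from the displayed numbers and is not a statement about how edgeList is implemented. Object names are illustrative.

```r
inc <- matrix(c(0, 1, 0, 1,
                1, 0, 1, 0,
                1, 1, 0, 1),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("S1", "S2", "S3"),
                              c("X1", "X2", "X3", "X4")))

n  <- nrow(inc)
co <- crossprod(inc)
p  <- diag(co) / n
expected <- n * outer(p, p)
haberman <- (co - expected) / sqrt(expected * outer(1 - p, 1 - p))

# Assumption (inferred from the table): upper-tail t probability, df = n.
p_z <- pt(haberman, df = n, lower.tail = FALSE)

# Keep each pair once (upper triangle) when the criterion p(Z) < .50 holds.
keep <- upper.tri(haberman) & p_z < 0.5
idx  <- which(keep, arr.ind = TRUE)

edges <- data.frame(
  source   = rownames(haberman)[idx[, "row"]],
  target   = colnames(haberman)[idx[, "col"]],
  Haberman = haberman[keep],
  `P(z)`   = p_z[keep],
  check.names = FALSE,
  row.names = NULL
)
edges
```

Only the pairs X1-X3 and X2-X4 have positive residuals in this example, so they are the only ones that pass the filter.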

In order to make a graph, two data frames are needed: a nodes data frame with names and other node attributes (see asNodes) and an edge data frame (see edgeList). For more information go to netCoin.

Author

Modesto Escobar, Department of Sociology and Communication, University of Salamanca. See https://sociocav.usal.es/blog/modesto-escobar/

References

Escobar, M. (2009): "Redes Semanticas en Textos Periodisticos: Propuestas Tecnicas para su Representacion", in Empiria, 17, 13-39.

Escobar, M. (2015): "Studying Coincidences with Network Analysis and Other Multivariate Tools", in The Stata Journal, 15(4), 1118-1156.

Escobar, M. and J. Gomez Isla (2015): "The Expression of Identity through the Image: The Photographic Archives of Miguel de Unamuno and Joaquin Turina", in Revista Espanola de Investigaciones Sociologicas, 152, 23-46.

[Package netCoin version 2.0.48 Index]