
relatedness {EconGeo}

R Documentation

Compute the relatedness between entities (industries, technologies, ...) from their co-occurrence matrix

Description

This function computes the relatedness between entities (industries, technologies, ...) from their co-occurrence (adjacency) matrix. Four normalization procedures are available, following van Eck and Waltman (2009): association strength, cosine, Jaccard, and an adapted version of the association strength that we refer to as the probability index.
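To make the normalizations named above concrete, here is a minimal base-R sketch of the association strength, cosine, and Jaccard measures as defined in van Eck and Waltman (2009). The helper `normalize_cooc()` is hypothetical and is not part of EconGeo. It assumes that the total for each entity is the row sum of the co-occurrence matrix after the diagonal is set to zero, so its values may differ from what `relatedness()` returns. The probability index is not reproduced here.

```r
## Hypothetical helper, NOT part of EconGeo: textbook normalizations from
## van Eck and Waltman (2009). Assumption: s_i is the row sum of the
## co-occurrence matrix once the diagonal has been set to zero.
normalize_cooc <- function(mat, method = c("association", "cosine", "jaccard")) {
  method <- match.arg(method)
  cij <- mat
  diag(cij) <- 0                                  # ignore self co-occurrences
  s <- rowSums(cij)                               # total co-occurrences per entity
  si <- matrix(s, nrow = length(s), ncol = length(s))  # s_i is constant along each row
  sj <- t(si)                                     # s_j is constant along each column
  out <- switch(method,
    association = cij / (si * sj),
    cosine      = cij / sqrt(si * sj),
    jaccard     = cij / (si + sj - cij)
  )
  diag(out) <- 0
  out
}

## small symmetric toy matrix, invented for illustration
toy <- matrix(c(0, 2, 1,
                2, 0, 3,
                1, 3, 0), nrow = 3, byrow = TRUE,
              dimnames = list(c("I1", "I2", "I3"), c("I1", "I2", "I3")))

normalize_cooc(toy, "association")
normalize_cooc(toy, "cosine")
normalize_cooc(toy, "jaccard")
```

With non-negative counts, the cosine and Jaccard values in this sketch fall between 0 and 1. An entity with no co-occurrences at all gives a zero denominator and therefore `NaN` entries, which the sketch does not handle.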

Usage

relatedness(mat, method = "prob")

Arguments

`mat`	An adjacency matrix of co-occurrences between entities (industries, technologies, cities...)
`method`	Which normalization method should be used to compute relatedness? Defaults to "prob", but it can be "association", "cosine" or "jaccard"

Value

A matrix representing the relatedness between entities (industries, technologies, etc.) based on their co-occurrence matrix. The specific method of normalization used is determined by the 'method' parameter, which can be "prob" (probability index), "association" (association strength), "cosine" (cosine similarity), or "jaccard" (Jaccard index).

Author(s)

Pierre-Alexandre Balland p.balland@uu.nl
Joan Crespo J.Crespo@uu.nl
Mathieu Steijn M.P.A.Steijn@uu.nl

References

van Eck, N.J. and Waltman, L. (2009) How to normalize cooccurrence data? An analysis of some well-known similarity measures, Journal of the American Society for Information Science and Technology 60 (8): 1635-1651

Boschma, R., Heimeriks, G. and Balland, P.A. (2014) Scientific Knowledge Dynamics and Relatedness in Bio-Tech Cities, Research Policy 43 (1): 107-114

Hidalgo, C.A., Klinger, B., Barabasi, A. and Hausmann, R. (2007) The product space conditions the development of nations, Science 317: 482-487

Balland, P.A. (2016) Relatedness and the Geography of Innovation, in: R. Shearmur, C. Carrincazeaux and D. Doloreux (eds) Handbook on the Geographies of Innovation. Northampton, MA: Edward Elgar

Steijn, M.P.A. (2017) Improvement on the association strength: implementing probability measures based on combinations without repetition, Working Paper, Utrecht University

Examples

## generate an industry - industry matrix in which cells give the number of co-occurrences
## between two industries
set.seed(31)
mat <- matrix(sample(0:10, 36, replace = TRUE), ncol = 6)
## copy the upper triangle onto the lower triangle so that the matrix is symmetric
mat[lower.tri(mat, diag = TRUE)] <- t(mat)[lower.tri(t(mat), diag = TRUE)]
rownames(mat) <- c("I1", "I2", "I3", "I4", "I5", "I6")
colnames(mat) <- c("I1", "I2", "I3", "I4", "I5", "I6")

## run the function
relatedness(mat)
relatedness(mat, method = "association")
relatedness(mat, method = "cosine")
relatedness(mat, method = "jaccard")
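A common follow-up to the calls above is to list the most related pairs. The sketch below uses a small invented matrix `rel` as a stand-in for the output of `relatedness()`, so it runs without EconGeo installed. It assumes only that the result is a symmetric matrix with row and column names. The names `rel`, `idx` and `pairs` are illustrative.

```r
## Stand-in for a relatedness() result; the values are invented for illustration.
rel <- matrix(c(0.0, 0.8, 0.1,
                0.8, 0.0, 0.5,
                0.1, 0.5, 0.0), nrow = 3, byrow = TRUE,
              dimnames = list(c("I1", "I2", "I3"), c("I1", "I2", "I3")))

## keep each unordered pair once by taking the upper triangle only
idx <- which(upper.tri(rel), arr.ind = TRUE)

## build an edge list and sort it from most to least related
pairs <- data.frame(
  from = rownames(rel)[idx[, "row"]],
  to = colnames(rel)[idx[, "col"]],
  relatedness = rel[idx],
  stringsAsFactors = FALSE
)
pairs <- pairs[order(-pairs$relatedness), ]
pairs
```

Taking the upper triangle avoids reporting each pair twice. For an asymmetric matrix, both triangles would have to be kept.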

[Package EconGeo version 2.0 Index]