conceptualStructure {bibliometrix}

R Documentation

Creating and plotting conceptual structure map of a scientific field

Description

The function conceptualStructure creates a conceptual structure map of a scientific field by performing Correspondence Analysis (CA), Multiple Correspondence Analysis (MCA) or Metric Multidimensional Scaling (MDS), followed by clustering, on a bipartite network of terms extracted from keyword, title or abstract fields.

Usage

conceptualStructure(
  M,
  field = "ID",
  ngrams = 1,
  method = "MCA",
  quali.supp = NULL,
  quanti.supp = NULL,
  minDegree = 2,
  clust = "auto",
  k.max = 5,
  stemming = FALSE,
  labelsize = 10,
  documents = 2,
  graph = TRUE,
  remove.terms = NULL,
  synonyms = NULL
)

Arguments

M

is a data frame obtained with the conversion function convert2df. It is a data matrix in which cases correspond to articles and variables to the field tags of the original ISI or SCOPUS file.

field

is a character object. It indicates one of the field tags of the standard ISI WoS field tag coding. field can be equal to one of these tags:

`ID`		Keywords Plus assigned by the ISI or SCOPUS database
`DE`		Author's keywords
`ID_TM`		Keywords Plus stemmed with Porter's stemming algorithm
`DE_TM`		Author's keywords stemmed with Porter's stemming algorithm
`TI`		Terms extracted from titles
`AB`		Terms extracted from abstracts

ngrams

is an integer between 1 and 4. It indicates the type of n-gram to extract from texts. An n-gram is a contiguous sequence of n terms. The function can extract n-grams composed of 1, 2, 3 or 4 terms. The default value is ngrams=1.
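
To make the n-gram definition concrete, the following base R sketch lists the contiguous sequences of n terms in a short text. The helper ngrams_of is hypothetical and purely illustrative; it is not the tokenizer used by bibliometrix, and it removes no stop words.

```r
# Hypothetical helper (not part of bibliometrix): return every contiguous
# sequence of n terms found in a text.
ngrams_of <- function(text, n) {
  # Lower-case the text and split it on anything that is not a letter or digit.
  terms <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  terms <- terms[nzchar(terms)]
  # A text with fewer than n terms contains no n-grams.
  if (length(terms) < n) return(character(0))
  starts <- seq_len(length(terms) - n + 1)
  vapply(starts,
         function(i) paste(terms[i:(i + n - 1)], collapse = " "),
         character(1))
}

title <- "Mapping the conceptual structure of science"
print(ngrams_of(title, 1))  # single terms
print(ngrams_of(title, 2))  # pairs of adjacent terms
```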

method

is a character object. It indicates the factorial method used to create the factorial map. Use method="CA" for Correspondence Analysis, method="MCA" for Multiple Correspondence Analysis or method="MDS" for Metric Multidimensional Scaling. The default is method="MCA".

quali.supp

is a vector indicating the indexes of the categorical supplementary variables. It is used only for CA and MCA.

quanti.supp

is a vector indicating the indexes of the quantitative supplementary variables. It is used only for CA and MCA.

minDegree

is an integer. It indicates the minimum number of occurrences a term must have to be analyzed and plotted. The default value is 2.

clust

is an integer or a character. If clust="auto", the number of clusters is chosen automatically; otherwise clust can be an integer between 2 and 8.

k.max

is an integer. It indicates the maximum number of clusters to keep. The default value is 5. The maximum value is 20.

stemming

is logical. If TRUE, Porter's stemming algorithm is applied to all extracted terms. The default is stemming = FALSE.

labelsize

is an integer. It indicates the label size in the plot. The default is labelsize=10.

documents

is an integer. It indicates the number of documents per cluster to plot in the factorial map. The default value is 2. It is used only for CA and MCA.

graph

is logical. If TRUE, the function plots the maps; otherwise they are saved in the output object. The default value is TRUE.

remove.terms

is a character vector. It contains a list of additional terms to delete from the documents before term extraction. The default is remove.terms = NULL.

synonyms

is a character vector. Each element contains a list of synonyms, separated by ";", that will be merged into a single term (the first word contained in the vector element). The default is synonyms = NULL.
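
As a minimal sketch of the expected input format, the vectors below show one way remove.terms and synonyms could be written. The terms themselves are made-up examples, and the splitting step only illustrates how each element is read; it is not the package's internal code.

```r
# Each element is one synonym group; its terms are separated by ";".
# The first term of a group is the label the other terms are merged into.
synonym_groups <- c(
  "citation;citations;cited",
  "bibliometrics;bibliometric;scientometrics"
)

# Additional terms to delete before term extraction (made-up examples).
extra_stopwords <- c("article", "study")

# Illustration only (not bibliometrix internals): split each group to see
# which label it collapses to and which variants it absorbs.
parts <- strsplit(synonym_groups, ";", fixed = TRUE)
labels <- vapply(parts, function(p) p[1], character(1))
variants <- lapply(parts, function(p) p[-1])
names(variants) <- labels
print(variants)

# The vectors would then be passed to the function, for example:
# conceptualStructure(M, field = "DE",
#                     remove.terms = extra_stopwords,
#                     synonyms = synonym_groups)
```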

Value

An object of class list containing the following components:

net		bipartite network
res		Results of CA, MCA or MDS method
km.res		Results of cluster analysis
graph_terms		Conceptual structure map (class "ggplot2")
graph_documents_Contrib		Factorial map of the documents with the highest contributions (class "ggplot2")
graph_documents_TC		Factorial map of the most cited documents (class "ggplot2")

Examples

# EXAMPLE Conceptual Structure using Keywords Plus

data(scientometrics, package = "bibliometrixData")

CS <- conceptualStructure(scientometrics, field = "ID", method = "CA",
                          stemming = FALSE, minDegree = 3, k.max = 5)
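
The same analysis can also be run without immediate plotting and the maps drawn later from the returned list. The sketch below assumes the bibliometrix and bibliometrixData packages are installed and simply skips the analysis when they are not; the component name graph_terms is taken from the Value section above.

```r
# EXAMPLE Saving the maps in the output object and plotting them later

# Run only when both packages are available (assumption: they are installed).
if (requireNamespace("bibliometrix", quietly = TRUE) &&
    requireNamespace("bibliometrixData", quietly = TRUE)) {
  library(bibliometrix)
  data(scientometrics, package = "bibliometrixData")

  # graph = FALSE: the maps are not plotted, only stored in the result.
  CS <- conceptualStructure(scientometrics, field = "ID", method = "CA",
                            stemming = FALSE, minDegree = 3, k.max = 5,
                            graph = FALSE)

  # List the components of the result, then draw the conceptual structure map.
  print(names(CS))
  print(CS$graph_terms)
}
```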