plot_perplexity {miRetrieve}    R Documentation

Plot perplexity score of various LDA models

Description

Fit LDA models over a range of topic numbers and plot the perplexity score of each model.

Usage

plot_perplexity(
  df,
  start = 2,
  end = 5,
  stopwords = stopwords_miretrieve,
  method = "gibbs",
  control = NULL,
  col.abstract = Abstract,
  col.pmid = PMID,
  title = NULL
)

Arguments

df

Data frame containing abstracts and PubMed-IDs.

start

Integer. Minimum number of topics k to fit an LDA model for. Must be >= 2.

end

Integer. Maximum number of topics k to fit an LDA model for.

stopwords

Data frame containing stop words.

method

String. Method used to fit the LDA models. Either "gibbs" or "VEM".

control

Control parameters for LDA modeling. For more information, see the documentation of the LDAcontrol class in the topicmodels package.

col.abstract

Column containing abstracts.

col.pmid

Column containing PubMed-IDs.

title

String. Plot title.

Details

plot_perplexity() fits one LDA model for each number of topics k in the range from start to end and plots each model's perplexity score against its value of k. Perplexity measures how well a model predicts the documents, with lower values indicating a better fit, so the plot can help in identifying a suitable number of topics for an LDA model. plot_perplexity() is based on LDA() from the package topicmodels.
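
The procedure described here can be approximated directly with topicmodels. The following is a minimal sketch, not the actual implementation of plot_perplexity(): it assumes the topicmodels package is installed, uses the AssociatedPress document-term matrix shipped with that package instead of PubMed abstracts, and scores each model on the same documents it was fitted to. The names dtm, k_values and perplexities are illustrative only.

```r
# Sketch only: assumes the 'topicmodels' package is installed.
library(topicmodels)

# Example document-term matrix bundled with topicmodels;
# a small subset keeps the run time short.
data("AssociatedPress", package = "topicmodels")
dtm <- AssociatedPress[1:50, ]

# Candidate numbers of topics, mirroring the 'start' and 'end' arguments
start <- 2
end <- 5
k_values <- seq(start, end)

# Fit one LDA model per k and record its perplexity on the same documents
perplexities <- vapply(k_values, function(k) {
  model <- LDA(dtm, k = k, method = "Gibbs", control = list(seed = 42))
  perplexity(model, newdata = dtm)
}, numeric(1))

# Elbow plot: perplexity against the number of topics
plot(k_values, perplexities, type = "b",
     xlab = "Number of topics (k)", ylab = "Perplexity")
```

When reading such a plot, a common heuristic is to pick the value of k after which the perplexity stops decreasing noticeably.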

Value

Elbow plot displaying perplexity scores of different LDA models.

See Also

fit_lda()

Other LDA functions: assign_topic_lda(), fit_lda(), plot_lda_term()
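
Examples

A possible call is sketched below. The data frame toy_df, its PMIDs and its abstracts are invented placeholders, not real PubMed records, and the example assumes that such a small input is sufficient for fitting. The column names match the defaults of col.pmid and col.abstract, so they do not need to be passed explicitly.

```r
# Hypothetical toy data: PMIDs and abstracts are invented placeholders
toy_df <- data.frame(
  PMID = c(1, 2, 3, 4),
  Abstract = c(
    "miR-21 promotes proliferation and invasion of colorectal cancer cells.",
    "miR-21 expression is elevated in serum of colorectal cancer patients.",
    "miR-155 regulates inflammation and cytokine signalling in macrophages.",
    "miR-155 modulates the immune response during bacterial infection."
  ),
  stringsAsFactors = FALSE
)

# Only run if miRetrieve is available
if (requireNamespace("miRetrieve", quietly = TRUE)) {
  library(miRetrieve)

  # Fit LDA models for the range start = 2 to end = 3 and plot
  # the perplexity score of each model
  plot_perplexity(toy_df,
                  start = 2,
                  end = 3,
                  title = "Perplexity of LDA models")
}
```

With real data, df would typically hold many more abstracts, and a wider range between start and end makes the elbow easier to spot.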


[Package miRetrieve version 1.3.4 Index]