Plot Highest Word Probabilities for Each Topic — plot_word

This function visualizes the top terms for each topic, ordered by their per-topic word probability (beta).

Usage

plot_word_probabilities(
  dfm_object,
  topic_n,
  max.em.its = 75,
  categorical_var = NULL,
  continuous_var = NULL,
  top_term_n = 10,
  ncol = 3,
  topic_names = NULL,
  height = 1200,
  width = 800,
  verbose = TRUE,
  ...
)

Arguments

dfm_object: A quanteda document-feature matrix (dfm).
topic_n: The number of topics to display.
max.em.its: Maximum number of EM iterations (default: 75).
categorical_var: An optional character string for a categorical variable in the metadata.
continuous_var: An optional character string for a continuous variable in the metadata.
top_term_n: The number of top terms to display for each topic (default: 10).
ncol: The number of columns in the facet plot (default: 3).
topic_names: An optional character vector for labeling topics. If provided, its length must equal the number of topics (default: NULL).
height: The height of the resulting Plotly plot, in pixels (default: 1200).
width: The width of the resulting Plotly plot, in pixels (default: 800).
verbose: Logical; if TRUE, prints progress information (default: TRUE).
...: Further arguments passed to stm::searchK.

Value

A Plotly object showing a facet-wrapped chart of top terms for each topic, ordered by their per-topic probability (beta). Each facet represents a topic.
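
To illustrate what "ordered by their per-topic probability (beta)" means, here is a minimal base R sketch. It is not the package's internal implementation: the beta matrix holds made-up values, and top_terms_by_beta is a hypothetical helper introduced only for this example. It picks the top_term_n most probable terms in each topic, which is roughly the data each facet would display.

```r
# Toy per-topic word probabilities (rows = topics, columns = terms).
# The values are made up for illustration only.
beta <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.10, 0.20, 0.30, 0.40),
  nrow = 2, byrow = TRUE,
  dimnames = list(
    c("Topic 1", "Topic 2"),
    c("student", "technology", "learning", "disability")
  )
)

# Hypothetical helper (not part of TextAnalysisR): return the
# top_term_n terms of each topic, sorted by decreasing probability.
top_terms_by_beta <- function(beta, top_term_n = 10) {
  n <- min(top_term_n, ncol(beta))
  rows <- lapply(seq_len(nrow(beta)), function(k) {
    idx <- order(beta[k, ], decreasing = TRUE)[seq_len(n)]
    data.frame(
      topic = rownames(beta)[k],
      term = colnames(beta)[idx],
      beta = unname(beta[k, idx]),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

top_terms <- top_terms_by_beta(beta, top_term_n = 2)
print(top_terms)
```

Each topic contributes its own ranked block of rows, so a term can appear under more than one topic with a different probability in each.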

Details

If topic_names is provided, it replaces the default "Topic {n}" labels with custom names.
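
The labeling rule can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: label_topics is a hypothetical helper, the custom names are invented, and stopping with an error on a length mismatch is an assumption, since this page only says the lengths must match.

```r
# Hypothetical helper (not part of TextAnalysisR): build one facet
# label per topic, using custom names when they are supplied.
label_topics <- function(topic_n, topic_names = NULL) {
  default_labels <- paste("Topic", seq_len(topic_n))
  if (is.null(topic_names)) {
    return(default_labels)
  }
  # Assumption: a length mismatch is treated as an error.
  if (length(topic_names) != topic_n) {
    stop("topic_names must have one entry per topic (", topic_n, ").")
  }
  as.character(topic_names)
}

default_labels <- label_topics(3)
custom_labels <- label_topics(
  3,
  topic_names = c("Assistive technology", "Math instruction", "Reading")
)
print(default_labels)
print(custom_labels)
```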

Examples

if (interactive()) {
  df <- TextAnalysisR::SpecialEduTech
  united_tbl <- TextAnalysisR::unite_text_cols(df, listed_vars = c("title", "keyword", "abstract"))
  tokens <- TextAnalysisR::preprocess_texts(united_tbl, text_field = "united_texts")
  dfm_object <- quanteda::dfm(tokens)
  TextAnalysisR::plot_word_probabilities(
    dfm_object = dfm_object,
    topic_n = 15,
    max.em.its = 75,
    categorical_var = "reference_type",
    continuous_var = "year",
    top_term_n = 10,
    ncol = 3,
    height = 1200,
    width = 800,
    verbose = TRUE
  )
}