Plot Highest Word Probabilities for Each Topic
Source: R/text_mining_functions.R
This function visualizes the top terms for each topic, ranked by their per-topic word probability (beta).
Usage
plot_word_probabilities(
  dfm_object,
  topic_n,
  max.em.its = 75,
  categorical_var = NULL,
  continuous_var = NULL,
  top_term_n = 10,
  ncol = 3,
  topic_names = NULL,
  height = 1200,
  width = 800,
  verbose = TRUE,
  ...
)
Arguments
- dfm_object
A quanteda document-feature matrix (dfm).
- topic_n
The number of topics to display.
- max.em.its
Maximum number of EM iterations (default: 75).
- categorical_var
An optional character string for a categorical variable in the metadata.
- continuous_var
An optional character string for a continuous variable in the metadata.
- top_term_n
The number of top terms to display for each topic (default: 10).
- ncol
The number of columns in the facet plot (default: 3).
- topic_names
An optional character vector for labeling topics. If provided, must be the same length as the number of topics.
- height
The height of the resulting Plotly plot, in pixels. Defaults to 1200.
- width
The width of the resulting Plotly plot, in pixels. Defaults to 800.
- verbose
Logical; if TRUE, prints progress information.
- ...
Further arguments passed to stm::searchK.
Value
A Plotly object showing a facet-wrapped chart of the top terms for each topic, ordered by their per-topic probability (beta). Each facet represents a topic.
Examples
if (interactive()) {
  # Load the example dataset shipped with the package
  df <- TextAnalysisR::SpecialEduTech

  # Combine the text columns into a single field, then tokenize
  united_tbl <- TextAnalysisR::unite_text_cols(
    df,
    listed_vars = c("title", "keyword", "abstract")
  )
  tokens <- TextAnalysisR::preprocess_texts(united_tbl, text_field = "united_texts")

  # Build the document-feature matrix
  dfm_object <- quanteda::dfm(tokens)

  # Plot the top terms per topic
  TextAnalysisR::plot_word_probabilities(
    dfm_object = dfm_object,
    topic_n = 15,
    max.em.its = 75,
    categorical_var = "reference_type",
    continuous_var = "year",
    top_term_n = 10,
    ncol = 3,
    height = 1200,
    width = 800,
    verbose = TRUE
  )
}
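To make the idea of "top terms ordered by beta" concrete, the following is a minimal base-R sketch on a toy table. It does not reproduce the function's internals, which are not shown here. The data frame beta_tbl, its column names, and the probability values are assumptions made purely for illustration; only the name top_term_n mirrors the argument documented above.

```r
# Toy per-topic word probabilities (beta).
# Values are illustrative assumptions, not output from the package.
beta_tbl <- data.frame(
  topic = rep(c(1, 2), each = 4),
  term  = c("student", "technology", "math", "reading",
            "autism", "video", "student", "intervention"),
  beta  = c(0.40, 0.30, 0.20, 0.10,
            0.40, 0.05, 0.25, 0.30),
  stringsAsFactors = FALSE
)

# Number of terms to keep per topic (mirrors the top_term_n argument).
top_term_n <- 2

# Split by topic, sort each group by descending beta, keep the first rows.
top_terms <- do.call(rbind, lapply(split(beta_tbl, beta_tbl$topic), function(d) {
  d <- d[order(-d$beta), ]
  head(d, top_term_n)
}))
rownames(top_terms) <- NULL

print(top_terms)
```

Each topic contributes its own ranked subset, which is why the same term (here "student") can appear under more than one topic in a faceted chart.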