Given a document-feature matrix (dfm), this function identifies the n most frequent terms and returns a ggplot-based visualization of their frequencies.

Usage

plot_word_frequency(dfm_object, n = 20, ...)

Arguments

dfm_object

A quanteda dfm object.

n

The number of top terms to display (default: 20).

...

Further arguments passed to quanteda.textstats::textstat_frequency.

Value

A ggplot object visualizing the top n terms by frequency. Terms are shown on one axis and frequency on the other, with a point marking each term's observed frequency.

Examples

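if (interactive()) {
  # Load the example dataset shipped with the package
  df <- TextAnalysisR::SpecialEduTech
  # Combine the selected text columns into a single text field
  united_tbl <- TextAnalysisR::unite_text_cols(df, listed_vars = c("title", "keyword", "abstract"))
  # Preprocess the combined text into tokens
  tokens <- TextAnalysisR::preprocess_texts(united_tbl, text_field = "united_texts")
  # Build the document-feature matrix from the tokens
  dfm_object <- quanteda::dfm(tokens)
  # Plot the 20 most frequent terms
  word_frequency_plot <- TextAnalysisR::plot_word_frequency(dfm_object, n = 20)
  word_frequency_plot
}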
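
For readers who want to see roughly what such a plot involves, the following is a minimal sketch that builds a comparable point plot directly with quanteda.textstats::textstat_frequency and ggplot2. It is an illustration under assumptions, not the package's actual implementation: the toy texts, the names top_terms and sketch_plot, and the axis layout are all made up for this example.

```r
library(quanteda)
library(quanteda.textstats)
library(ggplot2)

# Toy corpus (illustrative data only, not part of TextAnalysisR)
texts <- c(
  d1 = "assistive technology supports students with disabilities",
  d2 = "students use technology for learning mathematics",
  d3 = "technology and learning outcomes for students"
)

# Tokenize and build a document-feature matrix
toks <- tokens(texts, remove_punct = TRUE)
dfm_object <- dfm(toks)

# Most frequent features; the result is a data frame with columns
# such as `feature` and `frequency`
top_terms <- textstat_frequency(dfm_object, n = 5)

# Point plot with terms ordered by frequency (assumed layout:
# terms on the vertical axis, frequency on the horizontal axis)
sketch_plot <- ggplot(top_terms, aes(x = reorder(feature, frequency), y = frequency)) +
  geom_point() +
  coord_flip() +
  labs(x = "Term", y = "Frequency")

sketch_plot
```

Because the result is an ordinary ggplot object, further layers such as labs() or a theme can be added with `+` in the usual way.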