Convert Topic Terms to Text Strings

Concatenates the top terms of each topic into one text string per topic, suitable for embedding generation. Useful for building topic representations for semantic similarity analysis.

Usage

get_topic_texts(
  top_terms_df,
  topic_var = "topic",
  term_var = "term",
  weight_var = NULL,
  sep = " ",
  top_n = NULL
)

Arguments

top_terms_df: A data frame containing top terms for topics, typically output from get_topic_terms.
topic_var: Name of the column containing topic identifiers (default: "topic").
term_var: Name of the column containing terms (default: "term").
weight_var: Optional name of column with term weights (e.g., "beta"). If provided, terms are ordered by weight before concatenation.
sep: Separator between terms (default: " ").
top_n: Optional number of top terms to include per topic (default: NULL, uses all).

Value

A character vector of topic text strings, one per topic, ordered by topic number.
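
The behaviour described above can be illustrated with a minimal base R sketch. This is not the package's implementation: sketch_topic_texts and the toy data frame are hypothetical, and sorting terms by descending weight is an assumption, since the documentation only says that terms are ordered by weight.

```r
# Hypothetical stand-in for the documented behaviour; not the package's code.
sketch_topic_texts <- function(df, topic_var = "topic", term_var = "term",
                               weight_var = NULL, sep = " ", top_n = NULL) {
  # Group row indices by topic; split() returns groups in sorted topic order.
  groups <- split(seq_len(nrow(df)), df[[topic_var]])
  texts <- vapply(groups, function(idx) {
    if (!is.null(weight_var)) {
      # Assumption: a higher weight marks a more important term.
      idx <- idx[order(df[[weight_var]][idx], decreasing = TRUE)]
    }
    # Keep only the first top_n terms when a limit is given.
    if (!is.null(top_n)) idx <- head(idx, top_n)
    paste(df[[term_var]][idx], collapse = sep)
  }, character(1))
  unname(texts)
}

# Toy input with the default column names plus a "beta" weight column.
toy <- data.frame(
  topic = c(2, 1, 1, 2),
  term  = c("school", "tax", "budget", "teacher"),
  beta  = c(0.10, 0.05, 0.20, 0.30),
  stringsAsFactors = FALSE
)

weighted <- sketch_topic_texts(toy, weight_var = "beta")
limited  <- sketch_topic_texts(toy, weight_var = "beta", top_n = 1)
print(weighted)
print(limited)
```

Without weight_var, the sketch keeps terms in their original row order, so the input is expected to be sorted already in that case.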

Examples

if (FALSE) { # \dontrun{
# Get topic terms from STM model
top_terms <- TextAnalysisR::get_topic_terms(stm_model, top_term_n = 10)

# Convert to text strings for embedding
topic_texts <- TextAnalysisR::get_topic_texts(top_terms)

# Generate embeddings
topic_embeddings <- TextAnalysisR::generate_embeddings(topic_texts)
} # }
