Concatenates top terms for each topic into text strings suitable for embedding generation. Useful for creating topic representations for semantic similarity analysis.
Usage
get_topic_texts(
  top_terms_df,
  topic_var = "topic",
  term_var = "term",
  weight_var = NULL,
  sep = " ",
  top_n = NULL
)
Arguments
- top_terms_df
A data frame containing top terms for topics, typically output from get_topic_terms().
- topic_var
Name of the column containing topic identifiers (default: "topic").
- term_var
Name of the column containing terms (default: "term").
- weight_var
Optional name of column with term weights (e.g., "beta"). If provided, terms are ordered by weight before concatenation.
- sep
Separator between terms (default: " ").
- top_n
Optional number of top terms to include per topic (default: NULL, uses all).
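To make the interaction of the arguments concrete, the following is a minimal base R sketch of the behavior described above. It is not the TextAnalysisR source. The helper name sketch_topic_texts and the toy data frame are hypothetical, and two details are assumptions that this page does not confirm: that weight_var sorts terms in descending order, and that the result is a character vector with one element per topic.

```r
# Hypothetical stand-in for illustration only; not the package implementation.
sketch_topic_texts <- function(top_terms_df,
                               topic_var = "topic",
                               term_var = "term",
                               weight_var = NULL,
                               sep = " ",
                               top_n = NULL) {
  # One data frame per topic identifier
  per_topic <- split(top_terms_df, top_terms_df[[topic_var]])

  vapply(per_topic, function(d) {
    # Assumption: a higher weight means a more important term, so sort descending
    if (!is.null(weight_var)) {
      d <- d[order(d[[weight_var]], decreasing = TRUE), , drop = FALSE]
    }
    # Keep only the first top_n rows when a limit is given
    if (!is.null(top_n)) {
      d <- utils::head(d, top_n)
    }
    # Join the remaining terms into a single string
    paste(d[[term_var]], collapse = sep)
  }, character(1))
}

# Toy input with made-up terms and weights
toy_terms <- data.frame(
  topic = c(1, 1, 1, 2, 2, 2),
  term = c("school", "teacher", "student", "market", "price", "trade"),
  beta = c(0.20, 0.30, 0.10, 0.05, 0.25, 0.15),
  stringsAsFactors = FALSE
)

toy_texts <- sketch_topic_texts(toy_terms, weight_var = "beta", top_n = 2)
print(toy_texts)
```

In this sketch, ordering happens before truncation, so top_n keeps the highest-weighted terms when weight_var is supplied and the first rows in their existing order otherwise.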
See also
Other topic-modeling:
analyze_semantic_evolution(),
assess_hybrid_stability(),
calculate_assignment_consistency(),
calculate_eval_metrics_internal(),
calculate_keyword_stability(),
calculate_semantic_drift(),
calculate_topic_probability(),
calculate_topic_stability(),
find_optimal_k(),
find_topic_matches(),
fit_embedding_topics(),
fit_hybrid_model(),
fit_temporal_model(),
generate_topic_labels(),
get_topic_prevalence(),
get_topic_terms(),
identify_topic_trends(),
plot_model_comparison(),
plot_quality_metrics(),
run_contrastive_topics_internal(),
run_neural_topics_internal(),
run_temporal_topics_internal(),
validate_semantic_coherence()
Examples
if (FALSE) { # \dontrun{
# Get topic terms from STM model
top_terms <- TextAnalysisR::get_topic_terms(stm_model, top_term_n = 10)
# Convert to text strings for embedding
topic_texts <- get_topic_texts(top_terms)
# Generate embeddings
topic_embeddings <- TextAnalysisR::generate_embeddings(topic_texts)
} # }
