Evaluates the stability of a hybrid topic model via bootstrap resampling. This helps identify which topics are robust and which may be artifacts of the specific sample, following common recommendations for topic model validation.
Usage
assess_hybrid_stability(
texts,
n_topics = 10,
n_bootstrap = 5,
sample_proportion = 0.8,
embedding_model = "all-MiniLM-L6-v2",
seed = 123,
verbose = TRUE
)
Arguments
- texts
A character vector of texts to analyze.
- n_topics
Number of topics (default: 10).
- n_bootstrap
Number of bootstrap iterations (default: 5).
- sample_proportion
Proportion of documents sampled in each bootstrap iteration (default: 0.8).
- embedding_model
Embedding model name (default: "all-MiniLM-L6-v2").
- seed
Random seed for reproducibility (default: 123).
- verbose
Logical; if TRUE, progress messages are printed (default: TRUE).
Value
A list containing stability metrics:
- topic_stability: Per-topic stability scores (0-1)
- mean_stability: Overall stability score
- keyword_stability: Stability of top keywords per topic
- alignment_stability: Stability of STM-embedding alignment
- bootstrap_results: Detailed results from each bootstrap run
See also
Other topic-modeling:
analyze_semantic_evolution(),
calculate_assignment_consistency(),
calculate_eval_metrics_internal(),
calculate_keyword_stability(),
calculate_semantic_drift(),
calculate_topic_probability(),
calculate_topic_stability(),
find_optimal_k(),
find_topic_matches(),
fit_embedding_topics(),
fit_hybrid_model(),
fit_temporal_model(),
generate_topic_labels(),
get_topic_prevalence(),
get_topic_terms(),
get_topic_texts(),
identify_topic_trends(),
plot_model_comparison(),
plot_quality_metrics(),
run_contrastive_topics_internal(),
run_neural_topics_internal(),
run_temporal_topics_internal(),
validate_semantic_coherence()
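Examples
A minimal sketch of a typical call. Here my_texts is a placeholder for your own corpus, the 0.6 cutoff for flagging unstable topics is an illustrative assumption rather than a package default, and topic_stability is assumed to be returned as a numeric vector.

# my_texts: a character vector of documents; bootstrap stability
# estimates are only meaningful for a reasonably large corpus
result <- assess_hybrid_stability(
  my_texts,
  n_topics = 10,
  n_bootstrap = 5,
  sample_proportion = 0.8,
  seed = 123
)

# Overall stability: values near 1 indicate robust topics
result$mean_stability

# Flag topics whose per-topic stability falls below an illustrative cutoff
unstable_topics <- which(result$topic_stability < 0.6)

# Inspect keyword stability for the flagged topics
result$keyword_stability[unstable_topics]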
