Searches for the optimal number of topics (K) using stm::searchK. Produces diagnostic plots to help select the best K value.
Usage
find_optimal_k(
dfm_object,
topic_range,
max.em.its = 75,
emtol = 1e-04,
cores = 1,
categorical_var = NULL,
continuous_var = NULL,
height = 600,
width = 800,
verbose = TRUE,
...
)
Arguments
- dfm_object
A quanteda dfm object to be used for topic modeling.
- topic_range
A vector of K values to test (e.g., 2:10).
- max.em.its
Maximum number of EM iterations (default: 75).
- emtol
Convergence tolerance for the EM algorithm (default: 1e-04). Larger values (e.g., 1e-03) speed up fitting but may reduce precision.
- cores
Number of CPU cores to use for parallel processing (default: 1). Values above 1 let stm::searchK run faster on multi-core systems.
- categorical_var
Optional categorical variable(s) to use as topic prevalence covariates.
- continuous_var
Optional continuous variable(s) to use as topic prevalence covariates.
- height
Plot height in pixels (default: 600).
- width
Plot width in pixels (default: 800).
- verbose
Logical indicating whether to print progress (default: TRUE).
- ...
Additional arguments passed to stm::searchK.
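Examples
The arguments above map closely onto stm::searchK. The following is a minimal sketch of the underlying search, not the internals of find_optimal_k: it assumes the dfm is converted with quanteda::convert(to = "stm") and that max.em.its, emtol, and verbose are forwarded to the fitted models. The corpus (data_corpus_inaugural), the preprocessing, and the object names stm_input and k_search are illustrative choices only.

```r
library(quanteda)
library(stm)

# Build a small dfm from a corpus bundled with quanteda (illustrative data)
toks <- tokens(data_corpus_inaugural, remove_punct = TRUE)
toks <- tokens_remove(toks, stopwords("en"))
dfm_object <- dfm_trim(dfm(toks), min_docfreq = 2)

# Convert the dfm to the list format stm expects: documents, vocab, meta
stm_input <- convert(dfm_object, to = "stm")

# Fit one model per candidate K and collect diagnostics for each
k_search <- searchK(
  documents = stm_input$documents,
  vocab = stm_input$vocab,
  K = 2:4,            # plays the role of topic_range
  max.em.its = 75,    # forwarded to stm::stm through ...
  emtol = 1e-04,      # forwarded to stm::stm through ...
  cores = 1,
  verbose = FALSE
)

# One row of diagnostics per K (e.g., held-out likelihood, semantic coherence)
k_search$results

# Base-graphics diagnostic plots provided by stm
plot(k_search)
```

With the package that provides find_optimal_k loaded, the corresponding call following the Usage signature would be find_optimal_k(dfm_object, topic_range = 2:4). A common heuristic is to prefer a K where held-out likelihood and semantic coherence are both relatively high and residuals are low, then inspect the topics themselves before settling on a value.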
See also
Other topic-modeling:
analyze_semantic_evolution(),
assess_embedding_stability(),
assess_hybrid_stability(),
auto_tune_embedding_topics(),
calculate_assignment_consistency(),
calculate_eval_metrics_internal(),
calculate_keyword_stability(),
calculate_semantic_drift(),
calculate_topic_probability(),
calculate_topic_stability(),
find_topic_matches(),
fit_embedding_model(),
fit_hybrid_model(),
fit_temporal_model(),
generate_topic_labels(),
get_topic_prevalence(),
get_topic_terms(),
get_topic_texts(),
identify_topic_trends(),
plot_model_comparison(),
plot_quality_metrics(),
run_contrastive_topics_internal(),
run_neural_topics_internal(),
run_temporal_topics_internal(),
validate_semantic_coherence()
