Assess Hybrid Model Stability via Bootstrap

Evaluates the stability of a hybrid topic model by refitting it on repeated resamples of the input documents. This helps identify which topics are robust and which may be artifacts of the particular sample. The approach follows research recommendations for topic model validation.
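
One common way to compare a topic across resamples is the overlap between its top keywords. The sketch below illustrates that idea with a Jaccard index. It is an assumption made for illustration, not necessarily the metric this function computes, and the `jaccard()` helper and the keyword vectors are hypothetical.

```r
# Hypothetical helper (not part of the package): Jaccard overlap
# between two sets of top keywords.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# Made-up keywords for one topic in the full fit and in one resample
full_keywords     <- c("tax", "budget", "deficit", "spending")
resample_keywords <- c("tax", "budget", "spending", "growth")

# 3 shared keywords out of 5 distinct keywords
overlap <- jaccard(full_keywords, resample_keywords)
overlap
```

A value near 1 means the topic keeps nearly the same keywords under resampling, while a value near 0 suggests the topic depends heavily on which documents were drawn.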

Usage

assess_hybrid_stability(
  texts,
  n_topics = 10,
  n_bootstrap = 5,
  sample_proportion = 0.8,
  embedding_model = "all-MiniLM-L6-v2",
  seed = 123,
  verbose = TRUE
)

Arguments

texts: A character vector of texts to analyze.
n_topics: Number of topics (default: 10).
n_bootstrap: Number of bootstrap iterations (default: 5).
sample_proportion: Proportion of documents to sample in each bootstrap iteration (default: 0.8).
embedding_model: Embedding model name (default: "all-MiniLM-L6-v2").
seed: Random seed for reproducibility (default: 123).
verbose: Logical. If TRUE (the default), prints progress messages.
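
To make the interaction of `n_bootstrap`, `sample_proportion`, and `seed` concrete, here is a minimal sketch of how document indices could be drawn for each iteration. The helper `draw_bootstrap_indices()` is hypothetical, and sampling without replacement is an assumption; the function itself may draw its samples differently.

```r
# Hypothetical helper (not part of the package): draw the document
# indices used in each bootstrap iteration.
draw_bootstrap_indices <- function(n_docs,
                                   n_bootstrap = 5,
                                   sample_proportion = 0.8,
                                   seed = 123) {
  set.seed(seed)  # makes the draws reproducible
  sample_size <- floor(n_docs * sample_proportion)
  # Assumption: each iteration is a subsample drawn without replacement.
  lapply(seq_len(n_bootstrap), function(i) {
    sort(sample.int(n_docs, size = sample_size, replace = FALSE))
  })
}

idx <- draw_bootstrap_indices(n_docs = 100)
length(idx)   # one index vector per iteration
lengths(idx)  # number of documents drawn in each iteration
```

With the defaults, 100 documents yield 5 index vectors of 80 documents each, and rerunning with the same `seed` reproduces the same draws.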

Value

A list containing stability metrics:

topic_stability: Per-topic stability scores, each between 0 and 1
mean_stability: Overall stability score
keyword_stability: Stability of top keywords per topic
alignment_stability: Stability of STM-embedding alignment
bootstrap_results: Detailed results from each bootstrap run
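
The returned list can be used to flag topics that deserve a closer look. The sketch below uses a mock list with made-up values in place of a real result. It assumes that `topic_stability` is a numeric vector with one entry per topic and that higher values mean more stable topics; the 0.5 cutoff is an arbitrary illustration, not a recommended threshold.

```r
# Mock result using the documented element names; the values are
# made up for illustration and do not come from a real run.
stability <- list(
  topic_stability = c(0.91, 0.42, 0.77, 0.30),
  mean_stability  = mean(c(0.91, 0.42, 0.77, 0.30))
)

# Hypothetical cutoff below which a topic is treated as unstable
threshold <- 0.5

# Indices of topics whose stability falls below the cutoff
unstable_topics <- which(stability$topic_stability < threshold)
unstable_topics
```

Topics flagged this way are candidates for merging, relabeling, or refitting with a different `n_topics`.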

Examples

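if (FALSE) { # \dontrun{
  # my_texts is a user-supplied character vector of documents
  stability <- assess_hybrid_stability(
    texts = my_texts,
    n_topics = 10,
    n_bootstrap = 5,
    verbose = TRUE
  )

  # View per-topic stability scores
  stability$topic_stability

  # View the overall stability score
  stability$mean_stability
} # }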
