Runs one or more semantic analyses (similarity, dimensionality reduction, and clustering) on a set of texts in a single call. This is a high-level wrapper function.
Usage
fit_semantic_model(
  texts,
  analysis_types = c("similarity", "dimensionality_reduction", "clustering"),
  document_feature_type = "embeddings",
  similarity_method = "cosine",
  use_embeddings = TRUE,
  embedding_model = "all-MiniLM-L6-v2",
  dimred_method = "UMAP",
  clustering_method = "umap_dbscan",
  n_components = 2,
  n_clusters = 5,
  seed = 123,
  verbose = TRUE
)
Arguments
- texts
A character vector of texts to analyze.
- analysis_types
Types of analysis to perform: any of "similarity", "dimensionality_reduction", "clustering" (default: all three).
- document_feature_type
Feature extraction type (default: "embeddings").
- similarity_method
Similarity calculation method (default: "cosine").
- use_embeddings
Logical, use embedding-based approaches (default: TRUE).
- embedding_model
Sentence transformer model name (default: "all-MiniLM-L6-v2").
- dimred_method
Dimensionality reduction method: "PCA", "t-SNE", "UMAP" (default: "UMAP").
- clustering_method
Clustering method: "kmeans", "hierarchical", "umap_dbscan" (default: "umap_dbscan").
- n_components
Number of dimensions for reduction (default: 2).
- n_clusters
Number of clusters (default: 5).
- seed
Random seed for reproducibility (default: 123).
- verbose
Logical; if TRUE, prints progress messages (default: TRUE).
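To make the three analysis types concrete, the following is a minimal base R sketch of what each step computes, assuming a small numeric matrix stands in for document embeddings. It is not the implementation behind fit_semantic_model(): the toy matrix `emb` and its values are hypothetical, and PCA with k-means is used here only because both ship with base R (the function's defaults are "UMAP" and "umap_dbscan").

```r
# Toy stand-in for document embeddings: 6 documents x 4 dimensions.
# Hypothetical values; real embeddings would come from an embedding model.
set.seed(123)
emb <- rbind(
  matrix(rnorm(12, mean = 0, sd = 0.1), nrow = 3) + 1,  # documents 1-3
  matrix(rnorm(12, mean = 0, sd = 0.1), nrow = 3) - 1   # documents 4-6
)
rownames(emb) <- paste0("doc", 1:6)

# 1. Similarity: cosine similarity is the dot product of unit-length rows.
unit <- emb / sqrt(rowSums(emb^2))
sim <- unit %*% t(unit)

# 2. Dimensionality reduction: PCA, keeping 2 components
#    (the role played by n_components).
pca <- prcomp(emb, center = TRUE, scale. = FALSE)
coords <- pca$x[, 1:2]

# 3. Clustering: k-means on the reduced coordinates
#    (the role played by n_clusters).
km <- kmeans(coords, centers = 2, nstart = 10)

round(sim, 2)
km$cluster
```

Setting the seed before the random steps plays the same role as the seed argument: repeated runs produce the same coordinates and cluster assignments.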
See also
Other semantic:
analyze_document_clustering(),
analyze_similarity_gaps(),
calculate_clustering_metrics(),
calculate_cross_similarity(),
calculate_document_similarity(),
calculate_similarity_robust(),
cluster_embeddings(),
cross_analysis_validation(),
export_document_clustering(),
extract_cross_category_similarities(),
generate_cluster_labels(),
generate_cluster_labels_auto(),
generate_embeddings(),
reduce_dimensions(),
semantic_document_clustering(),
semantic_similarity_analysis(),
temporal_semantic_analysis(),
validate_cross_models(),
word_co_occurrence_network(),
word_correlation_network()
Examples
if (interactive()) {
  data(SpecialEduTech)
  texts <- SpecialEduTech$abstract[1:10]
  results <- fit_semantic_model(
    texts = texts,
    analysis_types = c("similarity", "clustering")
  )
  print(results)
}
