Creates an interactive heatmap visualization of document similarity matrices with support for document metadata, feature-specific colorscales, and rich tooltips. Supports both symmetric (all-vs-all) and cross-category comparison modes.
Usage
plot_similarity_heatmap(
  similarity_matrix,
  docs_data = NULL,
  feature_type = "words",
  method_name = "Cosine",
  title = NULL,
  category_filter = NULL,
  doc_id_var = NULL,
  colorscale = NULL,
  height = 600,
  width = NULL,
  row_category = NULL,
  col_categories = NULL,
  category_var = "category_display",
  show_values = FALSE,
  facet = NULL,
  row_label = NULL,
  output_type = "plotly"
)
Arguments
- similarity_matrix
A square numeric matrix of similarity scores
- docs_data
Optional data frame with document metadata containing:
document_number: Document identifiers for axis labels
document_id_display: Document IDs for hover text
category_display: Category labels for hover text
- feature_type
Feature space type: "words", "topics", "ngrams", or "embeddings" (determines colorscale and display name)
- method_name
Similarity method name for display (default: "Cosine")
- title
Plot title (default: NULL, auto-generated from feature_type)
- category_filter
Optional category filter label for title (default: NULL)
- doc_id_var
Name of document ID variable (affects label text, default: NULL)
- colorscale
Plotly colorscale override (default: NULL, uses feature_type default)
- height
Plot height in pixels (default: 600)
- width
Plot width in pixels (default: NULL for auto)
- row_category
Category for row documents in cross-category mode (default: NULL)
- col_categories
Character vector of categories for column documents (default: NULL)
- category_var
Name of category variable in docs_data (default: "category_display")
- show_values
Logical; show similarity values as text on tiles (default: FALSE)
- facet
Logical; facet by column categories (default: NULL, treated as TRUE when col_categories is specified)
- row_label
Label for row axis (default: NULL, uses row_category)
- output_type
Output type: "plotly" or "ggplot" (default: "plotly", auto-switches to "ggplot" for faceting)
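The column requirements listed under docs_data can be checked before plotting. The following is a minimal sketch only: check_docs_data() is a hypothetical helper, not part of the package, and the rule that docs_data has one row per matrix row is an assumption rather than documented behavior.

```r
# Hypothetical helper (not part of the package): verifies that a metadata
# data frame carries the columns listed under docs_data.
check_docs_data <- function(docs_data, similarity_matrix) {
  required <- c("document_number", "document_id_display", "category_display")
  missing_cols <- setdiff(required, names(docs_data))
  if (length(missing_cols) > 0) {
    stop("docs_data is missing: ", paste(missing_cols, collapse = ", "))
  }
  # Assumption: one metadata row per matrix row, in the same order
  if (nrow(docs_data) != nrow(similarity_matrix)) {
    stop("docs_data must have one row per document in similarity_matrix")
  }
  invisible(TRUE)
}

# Usage with a 2 x 2 matrix and matching metadata
sim <- diag(2)
docs <- data.frame(
  document_number = paste("Doc", 1:2),
  document_id_display = c("Paper A", "Paper B"),
  category_display = c("Science", "Tech")
)
check_docs_data(docs, sim)
```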
See also
Other visualization:
apply_standard_plotly_layout(),
create_empty_plot_message(),
create_message_table(),
create_standard_ggplot_theme(),
get_dt_options(),
get_plotly_hover_config(),
get_sentiment_color(),
get_sentiment_colors(),
plot_cluster_terms(),
plot_cross_category_heatmap(),
plot_entity_frequencies(),
plot_mwe_frequency(),
plot_ngram_frequency(),
plot_pos_frequencies(),
plot_semantic_viz(),
plot_term_trends_continuous(),
plot_word_frequency()
Examples
if (FALSE) { # \dontrun{
# Simple usage with matrix only
sim_matrix <- matrix(runif(25), nrow = 5)
plot_similarity_heatmap(sim_matrix)

# With document metadata
docs <- data.frame(
  document_number = paste("Doc", 1:5),
  document_id_display = c("Paper A", "Paper B", "Paper C", "Paper D", "Paper E"),
  category_display = c("Science", "Science", "Tech", "Tech", "Health")
)
plot_similarity_heatmap(sim_matrix, docs_data = docs, feature_type = "embeddings")

# Cross-category comparison with faceting
plot_similarity_heatmap(
  sim_matrix,
  docs_data = docs,
  row_category = "Science",
  col_categories = c("Tech", "Health"),
  show_values = TRUE,
  facet = TRUE
)
} # }
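The examples above fill the matrix with random values. As a minimal sketch of how a more realistic input might be prepared, the following base R code builds a square cosine similarity matrix from a small document-term matrix. The dtm object and its counts are made up for illustration, and the final call runs only when plot_similarity_heatmap() is available in the session.

```r
# Hypothetical document-term counts: 3 documents x 4 terms
dtm <- matrix(
  c(2, 0, 1, 0,
    1, 1, 0, 0,
    0, 3, 0, 1),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste("Doc", 1:3), c("alpha", "beta", "gamma", "delta"))
)

# Cosine similarity: pairwise dot products divided by the product of row norms
row_norms <- sqrt(rowSums(dtm^2))
sim_matrix <- tcrossprod(dtm) / outer(row_norms, row_norms)

# Plot only if the function is available (i.e. its package is attached)
if (exists("plot_similarity_heatmap")) {
  plot_similarity_heatmap(sim_matrix, feature_type = "words", method_name = "Cosine")
}
```

The result is symmetric with ones on the diagonal, which matches the square all-vs-all input described under similarity_matrix.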
