Alias for analyze_similarity_gaps. Identifies unique items, missing content, and cross-category opportunities based on similarity thresholds.

Usage

analyze_contrastive_similarity(
  similarity_data,
  ref_var = "ref_id",
  other_var = "other_id",
  similarity_var = "similarity",
  category_var = "other_category",
  ref_label_var = NULL,
  other_label_var = NULL,
  unique_threshold = 0.6,
  cross_policy_min = 0.6,
  cross_policy_max = 0.8
)

Arguments

similarity_data

A data frame with cross-category similarities, containing the following columns:

  ref_var: Reference item identifier
  other_var: Comparison item identifier
  similarity_var: Similarity score
  category_var: Category of comparison item
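
As an illustration of the expected input shape, the sketch below builds a small data frame using the default column names from the Usage section. The item IDs, categories, and similarity values are made-up toy data, and the column check is only a suggested precaution, not something the function is documented to require.

```r
# Hypothetical toy data using the default column names
# ("ref_id", "other_id", "similarity", "other_category").
similarity_data <- data.frame(
  ref_id         = c("r1", "r1", "r2", "r2", "r3"),
  other_id       = c("o1", "o2", "o1", "o3", "o2"),
  similarity     = c(0.85, 0.40, 0.65, 0.72, 0.30),
  other_category = c("A", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Suggested sanity checks before passing the data on:
# all four columns are present and the scores are numeric.
required <- c("ref_id", "other_id", "similarity", "other_category")
missing_cols <- setdiff(required, names(similarity_data))
stopifnot(
  length(missing_cols) == 0,
  is.numeric(similarity_data$similarity)
)
```

With the package loaded, a data frame shaped like this could be passed as analyze_contrastive_similarity(similarity_data), relying on the default column-name arguments.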

ref_var

Name of column with reference item IDs (default: "ref_id").

other_var

Name of column with comparison item IDs (default: "other_id").

similarity_var

Name of column with similarity values (default: "similarity").

category_var

Name of column with category information (default: "other_category").

ref_label_var

Optional column with reference item labels (for output).

other_label_var

Optional column with comparison item labels (for output).

unique_threshold

Similarity threshold below which a reference item is considered unique (default: 0.6).

cross_policy_min

Minimum similarity for cross-policy opportunities (default: 0.6).

cross_policy_max

Maximum similarity for cross-policy opportunities (default: 0.8).
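
To make the three thresholds concrete, here is a base-R sketch of one plausible reading of them. It is not the package's implementation: it assumes a reference item counts as unique when its highest similarity to any comparison item is below unique_threshold, and that the cross-policy band includes both endpoints. The toy data are invented for illustration.

```r
# Hypothetical toy data using the default column names.
similarity_data <- data.frame(
  ref_id         = c("r1", "r1", "r2", "r2", "r3"),
  other_id       = c("o1", "o2", "o1", "o3", "o2"),
  similarity     = c(0.85, 0.40, 0.65, 0.72, 0.30),
  other_category = c("A", "B", "A", "B", "B"),
  stringsAsFactors = FALSE
)

unique_threshold <- 0.6
cross_policy_min <- 0.6
cross_policy_max <- 0.8

# Assumption: "unique" means the best match for a reference item
# is still below unique_threshold.
max_sim <- tapply(similarity_data$similarity, similarity_data$ref_id, max)
unique_refs <- names(max_sim)[max_sim < unique_threshold]

# Assumption: the cross-policy band is inclusive at both ends.
in_band <- similarity_data$similarity >= cross_policy_min &
  similarity_data$similarity <= cross_policy_max
cross_pairs <- similarity_data[in_band, c("ref_id", "other_id", "similarity")]
```

Under these assumptions, pairs scoring above cross_policy_max are treated as already well matched, pairs inside the band as partial overlaps worth reviewing, and reference items with no match at or above unique_threshold as unique.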

Value

A list containing unique_items, missing_items, cross_policy, and summary_stats.