Computes lexical dispersion data for specified terms across a corpus: for each document, it records where each term occurs, which is useful for understanding how terms are distributed within and across texts.

Usage

calculate_lexical_dispersion(
  tokens_object,
  terms,
  scale = c("relative", "absolute")
)

Arguments

tokens_object

A quanteda tokens object, as created by tokens()

terms

Character vector of terms to analyze

scale

Character; either "relative" (positions normalized to the 0-1 range) or "absolute" (raw token positions). Defaults to "relative"

Value

Data frame with columns:

  • doc_id: Document identifier

  • term: The search term

  • position: Token position of the match within the document (relative or absolute, depending on scale)

  • doc_length: Total tokens in document

Examples

if (FALSE) { # \dontrun{
library(quanteda)
toks <- tokens(c("The cat sat on the mat", "The dog ran in the park"))
dispersion <- calculate_lexical_dispersion(toks, c("the", "cat", "dog"))
} # }
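
The returned positions can be reproduced with base quanteda calls: find each matching token's index in the document, then divide by the document length when scale is "relative". The sketch below is an illustrative reimplementation, not the package's actual internals; the helper name sketch_dispersion and the case-insensitive matching are assumptions for demonstration.

```r
library(quanteda)

# Illustrative sketch only -- calculate_lexical_dispersion() may differ
# internally (e.g. in how it matches terms or handles case).
sketch_dispersion <- function(tokens_object, terms, scale = "relative") {
  rows <- lapply(docnames(tokens_object), function(doc) {
    toks <- as.character(tokens_object[[doc]])  # tokens of one document
    n <- length(toks)                           # doc_length
    hits <- lapply(terms, function(term) {
      pos <- which(tolower(toks) == tolower(term))  # assumed case-insensitive
      if (length(pos) == 0) return(NULL)
      data.frame(
        doc_id = doc,
        term = term,
        position = if (scale == "relative") pos / n else pos,
        doc_length = n
      )
    })
    do.call(rbind, hits)
  })
  do.call(rbind, rows)
}

toks <- tokens(c("The cat sat on the mat", "The dog ran in the park"))
sketch_dispersion(toks, c("the", "cat", "dog"))
```

With scale = "relative", dividing by doc_length maps every match into 0-1, so documents of different lengths can be compared on a common axis, which is the usual input for a dispersion plot.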