Unified PDF processing with two extraction paths:
Multimodal extraction (R-native pdftools + Vision LLM), used when enabled (use_multimodal = TRUE)
Text extraction with R pdftools, used as the fallback
Usage
process_pdf_unified(
  file_path,
  use_multimodal = FALSE,
  vision_provider = "ollama",
  vision_model = NULL,
  api_key = NULL,
  describe_images = TRUE
)

See also
Other preprocessing:
get_available_dfm(),
get_available_tokens(),
import_files(),
lemmatize_tokens(),
prep_texts(),
unite_cols()
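
Examples
The calls below are a minimal sketch of how the two paths in the description might be invoked, based only on the Usage signature above. The file name "report.pdf" is hypothetical, the package that exports process_pdf_unified() is assumed to be attached already, and the structure of the returned object is not documented here, so the sketch only inspects it with str() rather than relying on specific fields.

```r
# Hypothetical input file; replace with the path to a real PDF.
pdf_path <- "report.pdf"

# Assumption: the package exporting process_pdf_unified() is already attached.
# The guard keeps this sketch harmless when the function or file is missing.
if (exists("process_pdf_unified") && file.exists(pdf_path)) {
  # Text-only path: use_multimodal defaults to FALSE in the Usage signature.
  text_result <- process_pdf_unified(pdf_path)

  # Multimodal path: pdftools plus a vision LLM served by a local Ollama instance.
  # vision_model and api_key are left at their NULL defaults.
  multimodal_result <- process_pdf_unified(
    pdf_path,
    use_multimodal = TRUE,
    vision_provider = "ollama",
    describe_images = TRUE
  )

  # The return structure is not specified on this page, so just inspect it.
  str(text_result, max.level = 1)
  str(multimodal_result, max.level = 1)
}
```

Leaving use_multimodal at its default keeps the call independent of any vision backend, which is a reasonable starting point before enabling the multimodal path.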
