Converts tokens to their lemmatized forms using spaCy, processing documents in batches so that large collections can be handled without timeouts.
Details
Uses spaCy to perform linguistic lemmatization, reducing tokens to their proper dictionary forms (e.g., "studies" -> "study", "better" -> "good"). Documents are processed in batches, which prevents timeout errors on large collections.
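The batching strategy can be sketched in Python (spaCy's host language). This is a minimal illustration, not the package's implementation: the `batch_size` default is hypothetical, and a toy lookup table stands in for the spaCy model so the sketch runs on its own; the commented lines show where the real `nlp.pipe()` call would go.

```python
from itertools import islice
from typing import Iterable, Iterator, List

def batched(texts: Iterable[str], batch_size: int = 500) -> Iterator[List[str]]:
    """Yield successive batches so a large collection is processed piecewise."""
    it = iter(texts)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch

def lemmatize_batches(texts: Iterable[str], batch_size: int = 500) -> Iterator[List[str]]:
    """Lemmatize documents batch by batch instead of all at once."""
    # With spaCy this would be roughly:
    #   nlp = spacy.load("en_core_web_sm")
    #   for doc in nlp.pipe(texts, batch_size=batch_size):
    #       yield [tok.lemma_ for tok in doc]
    # A toy lookup (illustrative, not a real model) keeps the sketch self-contained.
    lookup = {"studies": "study", "better": "good"}
    for batch in batched(texts, batch_size):
        for text in batch:
            yield [lookup.get(tok, tok) for tok in text.split()]
```

Because each batch is materialized and lemmatized independently, memory use and per-call processing time stay bounded regardless of how many documents the collection contains.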
See also
Other preprocessing:
get_available_dfm(),
get_available_tokens(),
import_files(),
prep_texts(),
process_pdf_unified(),
unite_cols()
