Main function for processing PDF files. Extracts text content using the pdftools package.
For table extraction, use process_pdf_file_py instead.
Value
A list with the following elements:
data: Data frame with extracted content
type: Character string indicating content type ("text" or "error")
success: Logical indicating success
message: Character string with status message
Details
This function extracts text content from PDFs using the pdftools package. It works best with text-based PDFs; scanned, image-only PDFs contain no text layer to extract.
For PDFs containing tables or complex layouts, use the Python-based
process_pdf_file_py, which provides better table extraction.
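
The following is a minimal sketch of how a result with the structure described under Value could be built and consumed. It is not this function's actual implementation: the helper name extract_pdf_text_sketch, the page and text columns of the data frame, and the message wording are all assumptions made for illustration. Only pdftools::pdf_text(), which returns one character string per page, is an existing pdftools call.

```r
# Hypothetical helper (not part of the package): wraps pdftools::pdf_text()
# and returns a list shaped like the one described under Value.
extract_pdf_text_sketch <- function(path) {
  tryCatch(
    {
      # pdf_text() returns a character vector with one element per page
      pages <- pdftools::pdf_text(path)
      list(
        # Column names are assumed; the real data frame layout may differ
        data = data.frame(
          page = seq_along(pages),
          text = pages,
          stringsAsFactors = FALSE
        ),
        type = "text",
        success = TRUE,
        message = sprintf("Extracted text from %d page(s)", length(pages))
      )
    },
    error = function(e) {
      # Any failure (missing file, unreadable PDF) becomes an "error" result
      list(
        data = data.frame(),
        type = "error",
        success = FALSE,
        message = conditionMessage(e)
      )
    }
  )
}

# Usage: check `success` before touching `data`
res <- extract_pdf_text_sketch("no_such_file.pdf")
if (res$success) {
  print(head(res$data))
} else {
  message("Extraction failed: ", res$message)
}
```

Checking success (or type) before reading data lets the caller handle a missing or unreadable file without its own tryCatch.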
