Hacker News
by throwaw33333434 940 days ago
How did you get it to read PDFs? I tried to create a tool that would extract tables from a PDF and convert them into CSV (https://chat.openai.com/g/g-e3jhasATL-pdf-ninja), but it fails.

It works a bit better if I extract the text in Python and do some cleanup before sending it:

import fitz  # PyMuPDF
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

pdf_document = fitz.open("foo.pdf")
page_number = 1
page = pdf_document.load_page(page_number - 1)  # load_page is 0-based
text = page.get_text("text")

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": f""" ..... {text} .... """},
    ],
)
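The cleanup step itself isn't shown above. As a rough sketch of what it might look like (the helper name clean_page_text and the specific rules are assumptions, not what the poster actually ran), something like this collapses stray whitespace and drops empty lines before the text goes into the prompt:

```python
import re


def clean_page_text(text: str) -> str:
    """Normalize text extracted from a PDF page (hypothetical helper)."""
    cleaned_lines = []
    for line in text.splitlines():
        # Collapse any run of whitespace inside the line into one space,
        # then trim the ends.
        line = re.sub(r"\s+", " ", line).strip()
        # Keep only lines that still have content.
        if line:
            cleaned_lines.append(line)
    return "\n".join(cleaned_lines)
```

You would call it as text = clean_page_text(page.get_text("text")) before building the messages. Real table extraction probably needs more than this, since column alignment is lost once whitespace is collapsed.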

1 comment

It's done via ChatGPT's internal PDF parsing. The 'Code Interpreter' feature is off, so the chat assistant is NOT writing any code. It simply follows the instructions given in the (quite voluminous) prompt.