Hi all
I’m new to the OpenAI API. I’ve written a (back-office) application that uploads documents (mainly PDFs) to OpenAI to extract data.
Everything works well, but I’m struggling with scanned PDFs. What’s the best practice?
- I can run OCR before sending the file to the API and produce a searchable PDF.
- I can render each PDF page as an image and upload those via the vision API calls. I tried this, but the model then asks me to upload the document, so I guess I’m doing something wrong here.
- I can extract the text from the PDF and send that to the API instead of the file, but I’m worried about the results if I do that (positional information gets lost, …).
- I’ve read that converting the PDF to an HTML file does the trick. Can anyone verify?
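For context on the second option, here’s roughly how I’m building the request — a minimal sketch, assuming `pdf2image` (which needs poppler) for page rendering and the `image_url` content type from the chat completions vision docs. The file name and prompt are just placeholders:

```python
import base64


def build_vision_messages(image_b64_list, prompt):
    """Build a chat.completions messages payload with page images sent inline
    as base64 data URLs, instead of referencing an uploaded file."""
    content = [{"type": "text", "text": prompt}]
    for b64 in image_b64_list:
        content.append({
            "type": "image_url",
            "image_url": {"url": f"data:image/png;base64,{b64}"},
        })
    # A single user message whose content mixes text and images.
    return [{"role": "user", "content": content}]


# Rendering the pages and calling the API (sketch, not run here):
#
#   import io
#   from pdf2image import convert_from_path
#   from openai import OpenAI
#
#   pages = convert_from_path("scan.pdf", dpi=200)
#   b64_pages = []
#   for page in pages:
#       buf = io.BytesIO()
#       page.save(buf, format="PNG")
#       b64_pages.append(base64.b64encode(buf.getvalue()).decode())
#
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4o-mini",
#       messages=build_vision_messages(b64_pages, "Extract the invoice data."),
#   )
#   print(resp.choices[0].message.content)
```

Is embedding the images as data URLs like this the right approach, or should I be using file uploads for this?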
Additional questions:
- Does anyone know how data extraction works on secured PDF files? For example, PDFs whose security settings prevent you from extracting a page.
- What’s the best model to use? I’m currently using gpt-4o-mini and the results are fine, but I’ve read that gpt-4o is cheaper for vision calls?
A lot of questions, and hopefully a lot of answers too! I’ve read a lot already, but the API seems to have changed a lot recently, so it’s hard to find the right answers online. Community to the rescue?
Thanks!