by constantinum 509 days ago
For instance, LlamaParse (https://docs.llamaindex.ai/en/stable/llama_cloud/llama_parse...) uses LLMs for PDF text extraction, but the problem is hallucination, e.g. https://github.com/run-llama/llama_parse/issues/420

There is also LLMWhisperer, which preserves the layout (tables, checkboxes, forms) and hence the context: https://pg.llmwhisperer.unstract.com/

1 comment

Is this open source? Is it slow Python? That's where I'm stuck.
It is not open source. It has high accuracy and it is fast too. All you need to do is send your documents to the API.