| HN Mirror



	by constantinum 509 days ago
	For instance, LlamaParse (https://docs.llamaindex.ai/en/stable/llama_cloud/llama_parse...) uses LLMs for PDF text extraction, but the problem is hallucination (see https://github.com/run-llama/llama_parse/issues/420 for an example). There is also LLMWhisperer, which preserves the layout (tables, checkboxes, forms) and hence the context: https://pg.llmwhisperer.unstract.com/

1 comment

Is this open source? Is it slow Python? That's where I'm stuck.

No, it is not open source. It has high accuracy and it is faster too. All you need to do is send your documents to the API.
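
For anyone wondering what "send your documents to the API" might look like in practice, here is a rough sketch using only the Python standard library. The endpoint URL, the `X-Api-Key` header name, and the plain-text response are all assumptions made for illustration; they are not taken from the LLMWhisperer docs, so check the real API reference before using anything like this.

```python
import urllib.request

# Hypothetical endpoint and header name: placeholders for illustration,
# not the real LLMWhisperer API.
API_URL = "https://api.example.com/v1/extract"
API_KEY_HEADER = "X-Api-Key"


def build_extract_request(pdf_bytes: bytes, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the raw PDF bytes."""
    return urllib.request.Request(
        API_URL,
        data=pdf_bytes,
        method="POST",
        headers={
            "Content-Type": "application/pdf",
            API_KEY_HEADER: api_key,
        },
    )


def extract_text(pdf_bytes: bytes, api_key: str, timeout: float = 60.0) -> str:
    """Send the document and return the response body decoded as UTF-8.

    Assumes the (hypothetical) service replies with plain text.
    """
    request = build_extract_request(pdf_bytes, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Building the request is kept separate from sending it, so the payload and headers can be inspected without touching the network. Since the client only makes an HTTP call, the "slow Python" concern would mostly come down to the service's latency rather than the client language.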