


	by AaronNewcomer 1054 days ago
	I’ve been doing something similar recently, and the results impressed me. I have been taking handwritten manuscripts from the early 1800s, feeding them into AWS Textract, and then feeding the raw OCR results into Claude2 or GPT4 to have it make sense of the horrible OCR of the handwriting. I was even more impressed when feeding it handwritten French documents, like patents from the same time period. AWS Textract only works with English, so its ML OCR tries to make English words out of the French handwriting. Even so, the output was still workable once I told the LLM that I was feeding it French text that had been OCR’d, even though it all kind of seemed like gibberish when looking at it.
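
For anyone who wants to try the workflow described above, here is a rough sketch of how it could be wired together. It is not the commenter's actual code. It assumes boto3's Textract client and its `detect_document_text` call for the OCR step, and it leaves the LLM call as a plain function you pass in (`ask_llm` is a hypothetical placeholder, as are the helper names and the prompt wording), since the comment does not say how Claude2 or GPT4 was invoked.

```python
from typing import Callable


def ocr_lines(textract_response: dict) -> list[str]:
    """Collect the text of every LINE block from a Textract-style response."""
    return [
        block["Text"]
        for block in textract_response.get("Blocks", [])
        if block.get("BlockType") == "LINE" and "Text" in block
    ]


def build_cleanup_prompt(raw_lines: list[str], language: str) -> str:
    """Tell the LLM what language the source is in, then append the raw OCR."""
    # The prompt wording is an assumption; adjust it for your documents.
    header = (
        f"The text below is raw OCR output from a handwritten {language} "
        "document from the early 1800s. The OCR engine may have forced "
        "words into English. Reconstruct the most likely original text "
        "and mark anything you are unsure of.\n\n"
    )
    return header + "\n".join(raw_lines)


def run_textract(image_bytes: bytes) -> dict:
    """OCR one page image with AWS Textract (needs AWS credentials)."""
    import boto3  # imported lazily so the helpers above work without AWS

    client = boto3.client("textract")
    return client.detect_document_text(Document={"Bytes": image_bytes})


def clean_page(
    image_bytes: bytes,
    language: str,
    ask_llm: Callable[[str], str],
    ocr: Callable[[bytes], dict] = run_textract,
) -> str:
    """OCR a page, then hand the messy text to an LLM for cleanup."""
    prompt = build_cleanup_prompt(ocr_lines(ocr(image_bytes)), language)
    return ask_llm(prompt)
```

To use it, pass the bytes of a scanned page, the language the document is actually written in, and whatever function wraps your LLM of choice. The `ocr` parameter exists only so the OCR step can be swapped out or stubbed during testing.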

1 comment

eigenvalue 1054 days ago

Cool. I'm sure that would give better results, but then you have to pay per request, and I'm sure it could get pretty expensive if you're talking about long documents. It's nice not having to think about the price of anything and also having full control over how it all works.
