by AaronNewcomer
1054 days ago
I’ve been doing something similar recently, and the results impressed me. I have been taking handwritten manuscripts from the early 1800s, running them through AWS Textract, and then feeding the raw OCR output into Claude 2 or GPT-4 to have it make sense of the horrible OCR of the handwriting. I was even more impressed when I fed it handwritten French documents, such as patents from the same period. AWS Textract only works with English, so its ML OCR was trying to make English words out of the French handwriting. Even so, the output was still workable once I told the LLM that it was OCR’d French text, even though it all kind of seemed like gibberish when I looked at it.
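For anyone who wants to try something along these lines, here is a rough sketch of the glue step between the OCR and the LLM. It is not the exact code behind the workflow above. It assumes a Textract-style response dict with a "Blocks" list whose LINE entries carry "Text" and "Confidence" fields. The function names, the prompt wording, and the sample data are hypothetical, and the actual Textract and LLM calls are left out.

```python
# Hypothetical helpers: turn a Textract-style response into a cleanup prompt.
# The OCR call and the LLM call themselves are intentionally not shown.

def extract_lines(textract_response, min_confidence=0.0):
    """Collect the text of LINE blocks, in the order they appear."""
    lines = []
    for block in textract_response.get("Blocks", []):
        if block.get("BlockType") != "LINE":
            continue  # skip PAGE and WORD blocks
        if block.get("Confidence", 0.0) < min_confidence:
            continue  # optionally drop very low-confidence lines
        lines.append(block.get("Text", ""))
    return lines


def build_cleanup_prompt(ocr_lines, language="French", period="early 1800s"):
    """Tell the model what the garbled text really is before showing it."""
    raw = "\n".join(ocr_lines)
    return (
        f"The text below is raw OCR output from a handwritten {language} "
        f"document from the {period}. The OCR engine assumed English, so "
        "many words are garbled. Reconstruct the most likely original "
        f"{language} text and mark anything unrecoverable as [illegible].\n\n"
        f"{raw}"
    )


# Made-up sample shaped like an OCR response, for illustration only.
sample_response = {
    "Blocks": [
        {"BlockType": "PAGE"},
        {"BlockType": "LINE", "Text": "Brevet d'invention de cing ans", "Confidence": 61.2},
        {"BlockType": "WORD", "Text": "Brevet", "Confidence": 70.0},
        {"BlockType": "LINE", "Text": "pour une machine a filer", "Confidence": 48.9},
    ]
}

prompt = build_cleanup_prompt(extract_lines(sample_response))
```

The resulting prompt string is what would be sent to the model. Stating the language and the period up front is the part that seemed to matter, since the raw OCR on its own reads like gibberish.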