by michaelt
362 days ago
Sometimes. I just fed the huggingface demo an image containing some rather improbable details [1] and it OCRed "Page 1000000000000" with one extra trailing zero. Honestly I was expecting the opposite - a repetition penalty to kick in having repeated zero too many times, resulting in too few zeros - but apparently not. So you might want to steer clear of this model if your document has a trillion pages. Other than that, it did a solid job - I've certainly seen worse attempts to OCR a table.

[1] https://imgur.com/a/8rJeHf8