Hacker News
by zerojames 682 days ago
I have seen excellent performance with Florence-2 for OCR. I wrote https://blog.roboflow.com/florence-2-ocr/ that shows a few examples.

Florence-2 is < 2GB so it fits into RAM well, and it is MIT licensed!

On a T4 in Colab, you can run inference in < 1s per image.
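
For anyone who wants to try this, here is a minimal sketch of running the Florence-2 OCR task through Hugging Face transformers and timing one image. The checkpoint name `microsoft/Florence-2-base`, the helper names `load_florence2` and `run_ocr`, and the generation settings are assumptions for illustration, not taken from the blog post above. Latency will depend on your hardware and image size.

```python
import time

# Assumption: the base checkpoint; the comment above does not say which one was used.
MODEL_ID = "microsoft/Florence-2-base"


def load_florence2(model_id=MODEL_ID, device="cpu"):
    """Load the model and processor (hypothetical helper)."""
    # Imported lazily so the heavy dependency is only needed when loading.
    from transformers import AutoModelForCausalLM, AutoProcessor

    # Florence-2 ships custom modeling code, hence trust_remote_code=True.
    model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
    model = model.to(device).eval()
    processor = AutoProcessor.from_pretrained(model_id, trust_remote_code=True)
    return model, processor


def run_ocr(image, model, processor, device="cpu", task="<OCR>"):
    """Return (recognized_text, generation_seconds) for one PIL image."""
    inputs = processor(text=task, images=image, return_tensors="pt").to(device)

    start = time.perf_counter()
    generated_ids = model.generate(
        input_ids=inputs["input_ids"],
        pixel_values=inputs["pixel_values"],
        max_new_tokens=1024,  # assumed cap, adjust for long documents
        num_beams=3,
    )
    elapsed = time.perf_counter() - start

    raw = processor.batch_decode(generated_ids, skip_special_tokens=False)[0]
    # post_process_generation returns a dict keyed by the task prompt.
    parsed = processor.post_process_generation(
        raw, task=task, image_size=(image.width, image.height)
    )
    return parsed[task], elapsed


# Usage (needs torch, transformers, pillow, and a model download):
#   from PIL import Image
#   model, processor = load_florence2(device="cuda")
#   image = Image.open("page.png").convert("RGB")  # "page.png" is a placeholder
#   text, seconds = run_ocr(image, model, processor, device="cuda")
```

Note that the timer covers only the `generate` call, so preprocessing and model loading are not counted.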

3 comments

This looks good, I will investigate integrating it into my project. Thanks!
I couldn't find any comparisons with Microsoft's TrOCR model. I guess they are for different purposes. But since you used Florence-2 for OCR, did you compare the two?
This is pretty cool. When I checked how Microsoft's models stacked up against Donut at the time, I chose Donut. I didn't know they had published more models!