Update: Just ran our benchmark on the Mistral model and results are.. surprisingly bad?
Mistral OCR:
- 72.2% accuracy
- $1/1000 pages
- 5.42s / page
Which is pretty far cry from the 95% accuracy they were advertising from their private benchmark. The biggest thing I noticed is how it skips anything it classifies as an image/figure. So charts, infographics, some tables, etc. all get lifted out and returned as [image](image_002). Compared to the other VLMs that are able to interpret those images into a text representation.