Hacker News
by infecto 692 days ago
In practice, no multi-modal model is ready for that. The accuracy of other tools for extracting tables and text is far superior.