by breadislove 488 days ago
The systems they tested against the LLMs are mostly used as part of a larger system. A fairer comparison would use something like MinerU [1] and proper benchmarks like OHR-Bench [2] and Reducto's table bench [3]. This paper is really bad...

[1]: https://github.com/opendatalab/MinerU
[2]: https://github.com/opendatalab/OHR-Bench
[3]: https://github.com/reductoai/rd-tablebench