by pierre 59 days ago
Built a benchmark to evaluate how well document parsers work, using a dataset of 2000 manually annotated PDFs. It evaluates across multiple dimensions: charts, tables, text styling, text correctness, and attribution.

The benchmark evaluates performance on full pages (not selected parts of pages) and compares different OSS, frontier-model, and commercial approaches.

For transparency, it is available as a HF leaderboard.

Paper: https://arxiv.org/abs/2604.08538