by pierre 108 days ago

Built a benchmark to evaluate how well document parsers work on a dataset of 2,000 manually annotated PDFs. It scores them across multiple dimensions: charts, tables, text styling, text correctness, and attribution.

The benchmark evaluates performance on full pages (not selected parts of pages), and compares different OSS, frontier-model, and commercial approaches.

For transparency, it is available as an HF leaderboard.

Paper: https://arxiv.org/abs/2604.08538