by Thaxll
467 days ago
Are you benchmarking the right thing, though? It seems to focus a lot on images, charts, etc. The 95% comes from their benchmark: "we evaluate them on our internal “text-only” test-set containing various publication papers, and PDFs from the web; below:" Text only.