Hacker News
DatBench fixes VLM evals: 70% blindly solvable, 42% mislabeled, 35% prod gap (datologyai.com)
5 points by hurrycane 165 days ago