Hacker News
by demosthanos 374 days ago
To be fair, once it does generalize the pattern, the benchmark is actually measuring something useful for deciding whether the model will be able to produce a subject-verb-object SVG.