Hacker News
by NiloCK 27 days ago
This is wonderful.

Having models attempt an SVG letter S remains one of my personal/informal LLM benchmarks. They are still pretty bad at it.