Hacker News
by simonw 322 days ago
I'll believe they are doing that when one of the models draws me an SVG that actually looks like a pelican.

Someone needs to craft a beautiful bike ridden by a pelican, throw in some SEO, and see how long it takes a model to replicate it.

Simon probably wouldn't be happy about killing his multi-year evaluation metric though...

I would be delighted.

My pelican on a bicycle benchmark is a long con. The goal is to finally get a good SVG of a pelican riding a bicycle, and if I can trick AI labs into investing significant effort in cheating on my benchmark then fine, that gets me my pelican!