Hacker News new | ask | show | jobs
by simonw 195 days ago
I wish I knew why. I didn't think it would be a useful indicator of model skills at all when I started doing it, but over time the pattern has held that performance on pelican riding a bicycle is a good indicator of performance on other tasks.