Hacker News
by mnk47 257 days ago
In my experience, a model's performance on silly tasks like these usually (not always) correlates with its performance in other areas, with the exception of tool use and agentic tasks.