Hacker News
by mnk47, 257 days ago
In my experience, the model's performance on silly tasks like these is usually (not always) correlated with its performance in other areas, except for tool use and agent stuff.