Hacker News new | ask | show | jobs
by RyanHamilton 4 hours ago
Could it be possible, these firms are optimizing for two things: a) Better performance. b) Gathering data from you to further improve performance later. I've also found the huge amount of planning rather than iteration frustrating. I've felt like I'm teaching a junior!
2 comments

I think they simply optimize around E2E benchmarks, none of those benchmarks is designed as multi turn assistance to the user, but going from a prompt straight to the final solution.
more thinking == more tokens === more money LOLL
I think they are optimizing for one-shot performance because that will drive usage. They can’t afford to look bad in the benchmarks. And if that means consuming an order of magnitude more tokens, well, that’s good for business, too.
Os there a cost benchmark out there? I wonder how frontier models are doing over time for cost per problem solved.