| HN Mirror



	by RyanHamilton 4 hours ago
	Could it be that these firms are optimizing for two things: a) better performance, and b) gathering data from you to further improve performance later? I've also found the huge amount of up-front planning, rather than iteration, frustrating. It feels like I'm teaching a junior!

2 comments

epolanski 4 hours ago

I think they simply optimize for end-to-end (E2E) benchmarks. None of those benchmarks is designed around multi-turn assistance to the user; they go from a prompt straight to the final solution.


happyPersonR 2 hours ago

more thinking == more tokens === more money LOLL


drob518 22 minutes ago

I think they are optimizing for one-shot performance because that will drive usage. They can’t afford to look bad in the benchmarks. And if that means consuming an order of magnitude more tokens, well, that’s good for business, too.


overfeed 48 minutes ago

Is there a cost benchmark out there? I wonder how frontier models are doing over time on cost per problem solved.
