That’s just wrong. File reads, searches, compiler output, are the top input token consumers in my workflow. None of them can be removed. And they are the majority of my input tokens. That’s also why labs are trying to make 1M input work, and why compaction is so important to get right.
Regarding output - yes, but that wasn’t the topic in this thread. It’s just easier to argue with input tokens that price has gone up. I have a hunch the price for output will go up similarly, but can’t prove it. The jury’s out IMO: https://news.ycombinator.com/item?id=47816960
This has no bearing on my comment. The point is that a better model avoids dozens of prompts and tool calls by making fewer CORRECT tool calls, with the user needing no more prompts.
I’m surprised this is even a question; obviously a better prompter has the same properties and it’s not in dispute?
Though, from my limited testing, the new model is far more token hungry overall