by charcircuit
73 days ago
You are making the false assumption that every token costs the same to serve. It doesn't. Yes, in the limit the price to serve the model and generate a response is O(tokens), but generating a new token can be cheaper when the context is short than when it is long. If other harnesses prompt with more tokens than Claude Code, they can be more expensive to serve.
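To make that concrete, here is a toy sketch, not a real pricing model. It assumes each generated token pays a fixed cost plus an attention cost that grows linearly with the number of tokens already in context. The function name `decode_cost` and the constants `base` and `attn` are made up for illustration, in arbitrary units.

```python
def decode_cost(prompt_tokens, new_tokens, base=1.0, attn=0.001):
    """Toy cost (arbitrary units) of generating new_tokens after a prompt.

    Assumption: each new token costs a fixed amount (base) plus an
    amount proportional to the current context length (attn * context).
    Both constants are hypothetical, chosen only to show the shape.
    """
    total = 0.0
    for i in range(new_tokens):
        context = prompt_tokens + i  # tokens the new token must attend to
        total += base + attn * context
    return total


# Same number of output tokens, different prompt sizes.
small_prompt = decode_cost(prompt_tokens=2_000, new_tokens=500)
large_prompt = decode_cost(prompt_tokens=20_000, new_tokens=500)

print(f"cost per output token, small prompt: {small_prompt / 500:.3f}")
print(f"cost per output token, large prompt: {large_prompt / 500:.3f}")
```

Under these assumptions the same 500 output tokens cost more after the larger prompt, which is the point: two harnesses can produce the same amount of output and still differ in what they cost to serve.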