by fallmonkey
301 days ago
The estimate for output tokens is too low, since a single reasoning-enabled response can burn through thousands of output tokens. It's also low for input tokens, since in actual use a lot of context (memory, agents.md, rules, etc.) gets included nowadays.

What reasoning affects is the ratio of input to output tokens, and since input tokens are cheaper, that may well affect the economics in the end.
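
To make the arithmetic concrete, here is a rough back-of-the-envelope sketch. Everything in it is an assumption for illustration: the per-million-token prices, the token counts, and the names `Pricing`, `request_cost`, and `output_share` are all made up and not taken from any real provider's price list. It only shows how the input/output mix drives the cost of a request.

```python
from dataclasses import dataclass


@dataclass
class Pricing:
    # Hypothetical prices in USD per 1M tokens, not from any real provider.
    input_per_million: float
    output_per_million: float


def request_cost(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Cost in USD of one request under simple per-token pricing."""
    total = (input_tokens * pricing.input_per_million
             + output_tokens * pricing.output_per_million)
    return total / 1_000_000


def output_share(input_tokens: int, output_tokens: int, pricing: Pricing) -> float:
    """Fraction of the request's cost that comes from output tokens."""
    output_cost = output_tokens * pricing.output_per_million / 1_000_000
    return output_cost / request_cost(input_tokens, output_tokens, pricing)


# Assumed prices: output tokens cost 4x as much as input tokens.
pricing = Pricing(input_per_million=1.0, output_per_million=4.0)

# Assumed workloads: a short chat turn vs. a reasoning-enabled turn
# that also carries a large context (memory, agents.md, rules, ...).
short_turn = (2_000, 300)        # (input tokens, output tokens)
reasoning_turn = (20_000, 4_000)

for name, (inp, out) in [("short", short_turn), ("reasoning", reasoning_turn)]:
    cost = request_cost(inp, out, pricing)
    share = output_share(inp, out, pricing)
    print(f"{name}: ${cost:.4f} per request, {share:.0%} of it from output")
```

With these made-up numbers the per-request cost grows by roughly an order of magnitude, while the output share of the bill moves much less, because the extra context inflates the cheaper input side at the same time.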