|
|
|
|
|
by jsnell
432 days ago
|
|
> This implies it's not a hybrid model that can just skip reasoning steps if requested. It clearly is, since most of the post is dedicated to the tunability (both manual and automatic) of the reasoning budget. I don't know what they're doing with this pricing, and the blog post does not do a good job explaining. Could it be that they're not counting thinking tokens as output tokens (since you don't get access to the full thinking trace anyway), and this is the basically amortizing the thinking tokens spend over the actual output tokens? Doesn't make sense either, because then the user has no incentive to use anything except 0/max thinking budgets. |
|