by estimator7292
119 days ago
I've always wondered about that. LLM providers could easily decimate the cost of inference if they got the models to just stop emitting so much hot air. I don't understand why OpenAI wants to pay 3x the cost to generate a response when two thirds of those tokens are meaningless noise.

They basically only started doing this because someone noticed you got better performance from the early models by straight up writing "think step by step" in your prompt.