by danenania
321 days ago
The problem is that input token cost dominates output token cost for the majority of tasks. Once you've given the model your prompt and are reading the first output token for classification, you've already paid most of what it would cost to just prompt it directly. That said, there could definitely be exceptions: with short prompts, output costs can dominate input costs. But those aren't usually the interesting use cases.
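To put rough numbers on this, here is a minimal sketch. The prices are made-up relative units (output assumed to cost 4x input per token), and the token counts and function names are hypothetical, not taken from any real provider or workload.

```python
def request_cost(input_tokens, output_tokens, input_price, output_price):
    """Total cost of one request, given per-token prices."""
    return input_tokens * input_price + output_tokens * output_price


def input_share(input_tokens, output_tokens, input_price, output_price):
    """Fraction of a request's cost that comes from input tokens."""
    total = request_cost(input_tokens, output_tokens, input_price, output_price)
    return input_tokens * input_price / total


# Assumed relative prices: one output token costs 4x one input token.
INPUT_PRICE = 1.0
OUTPUT_PRICE = 4.0

# Long prompt, modest answer: input is most of the bill.
long_prompt_share = input_share(8000, 300, INPUT_PRICE, OUTPUT_PRICE)

# Short prompt, same answer length: output is most of the bill.
short_prompt_share = input_share(50, 300, INPUT_PRICE, OUTPUT_PRICE)

# Cost of stopping after one classification token, relative to
# letting the model produce the full 300-token answer.
classify_only = request_cost(8000, 1, INPUT_PRICE, OUTPUT_PRICE)
full_answer = request_cost(8000, 300, INPUT_PRICE, OUTPUT_PRICE)
classify_fraction = classify_only / full_answer

print(f"long prompt input share:  {long_prompt_share:.2f}")
print(f"short prompt input share: {short_prompt_share:.2f}")
print(f"classify-only vs full:    {classify_fraction:.2f}")
```

Under these assumptions, reading just the first token of the long-prompt request already costs close to what the full answer would, while the short-prompt case flips the ratio.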