by george_ciobanu
49 days ago
The proxy uses a cheap, small model (gpt-5.4-mini by default) behind the scenes to save tokens on the expensive main model. Because the proxy adds a small amount of overhead per turn, the break-even point depends on session length.

Short sessions (e.g., 2 rounds): the proxy's overhead can cost you more than it saves.

Long sessions (e.g., 69 to 190 rounds): the token savings on the main model far outweigh the small model's overhead.

So it's not a universal win for quick, one-off queries, but the math becomes favorable on long, complex debugging sessions.
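To make the break-even idea concrete, here is a rough sketch of the arithmetic in Python. It is a toy model, not the proxy's actual accounting: I'm assuming a flat overhead per round for the small model and a per-round saving on the main model that grows linearly with the round number (on the theory that there is more accumulated context to trim later in a session). The function names and the numbers `overhead_per_round=10` and `saving_growth=1` are made up for illustration, in arbitrary cost units.

```python
def net_savings(rounds: int, overhead_per_round: int, saving_growth: int) -> int:
    """Net cost saved over a session, in arbitrary cost units (toy model).

    Assumption: round k saves saving_growth * k on the main model,
    while the small model costs a flat overhead_per_round every round.
    """
    total_saving = saving_growth * rounds * (rounds + 1) // 2  # sum of k for k = 1..rounds
    total_overhead = overhead_per_round * rounds
    return total_saving - total_overhead


def break_even_round(overhead_per_round: int, saving_growth: int, max_rounds: int = 10_000):
    """First session length at which the proxy comes out ahead, or None if never."""
    for rounds in range(1, max_rounds + 1):
        if net_savings(rounds, overhead_per_round, saving_growth) > 0:
            return rounds
    return None


# Hypothetical numbers, chosen only to show the shape of the curve.
for rounds in (2, 69, 190):
    print(rounds, net_savings(rounds, overhead_per_round=10, saving_growth=1))
print("break-even at round", break_even_round(overhead_per_round=10, saving_growth=1))
```

With these made-up inputs the 2-round session comes out negative and the long sessions come out strongly positive, which matches the pattern described above. Real break-even points would depend on actual model prices and on how much context the proxy trims per turn.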