by omrimaya
108 days ago
The "infrastructure instead of a library" pattern is everywhere in the LLM tooling space right now. We hit the same decision point: stood up LiteLLM, ran it for a week, then ripped it out because the operational surface wasn't worth it for something that's really just a retry loop with state. The cooldown tiers make sense. The one thing I'd watch is key exhaustion behavior when all profiles are in backoff simultaneously: does it block, or surface an error immediately?
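
To make the exhaustion question concrete, here's a rough sketch of the two behaviors I mean. This is not the project's code or LiteLLM's API; `KeyPool`, `AllKeysCoolingDown`, and the tier values are names and numbers I made up for illustration. With `block=False` the caller gets an error right away along with how long to wait; with `block=True` it sleeps until the earliest cooldown expires.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class AllKeysCoolingDown(Exception):
    """Raised when every key is in backoff and the caller asked to fail fast."""

    def __init__(self, retry_after: float):
        super().__init__(f"all keys cooling down; retry in {retry_after:.1f}s")
        self.retry_after = retry_after


@dataclass
class KeyPool:
    keys: list[str]
    # Hypothetical cooldown tiers in seconds; repeated failures escalate.
    tiers: tuple[float, ...] = (1.0, 5.0, 30.0)
    # Clock and sleep are injectable so the behavior can be tested without waiting.
    clock: Callable[[], float] = time.monotonic
    sleep: Callable[[float], None] = time.sleep
    _ready_at: dict = field(default_factory=dict)
    _strikes: dict = field(default_factory=dict)

    def mark_failed(self, key: str) -> None:
        # Pick the tier for this key's failure count, capped at the last tier.
        strikes = self._strikes.get(key, 0)
        cooldown = self.tiers[min(strikes, len(self.tiers) - 1)]
        self._strikes[key] = strikes + 1
        self._ready_at[key] = self.clock() + cooldown

    def mark_ok(self, key: str) -> None:
        # A success resets the key's failure count and clears its cooldown.
        self._strikes.pop(key, None)
        self._ready_at.pop(key, None)

    def acquire(self, block: bool = False) -> str:
        while True:
            now = self.clock()
            for key in self.keys:
                if self._ready_at.get(key, 0.0) <= now:
                    return key
            # Every key is in backoff: this is the exhaustion case.
            wait = min(self._ready_at[k] for k in self.keys) - now
            if not block:
                raise AllKeysCoolingDown(wait)
            self.sleep(wait)
```

Failing fast with a `retry_after` hint pushes the decision up to the caller, which I'd usually prefer inside a request handler; blocking is simpler for batch jobs but can silently stall if the top tier is long.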