Hacker News
by omrimaya 108 days ago
The "infrastructure instead of a library" pattern is everywhere in the LLM tooling space right now. We went through the same decision point: stood up LiteLLM, ran it for a week, then ripped it out because the operational surface wasn't worth it for something that's really just a retry loop with state. The cooldown tiers make sense. The one thing I'd watch is key exhaustion behavior when all profiles are in backoff simultaneously: does it block, or does it surface an error immediately?
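
To make the exhaustion question concrete, here's a rough sketch of what I mean by "a retry loop with state". Everything in it is made up for illustration (the `KeyPool` name, the tier values, the `block` flag); it is not how the posted project or LiteLLM does it, just the two behaviors side by side.

```python
import time


class AllKeysInBackoff(Exception):
    """Raised when every key is cooling down and the caller asked not to wait."""


class KeyPool:
    # Hypothetical cooldown tiers in seconds; real values are not given in the thread.
    TIERS = (1.0, 5.0, 30.0)

    def __init__(self, keys, clock=time.monotonic, sleep=time.sleep):
        # clock and sleep are injectable so the pool can be tested without waiting.
        self._clock = clock
        self._sleep = sleep
        # Per-key state: consecutive failures and the time the key is usable again.
        self._state = {k: {"failures": 0, "ready_at": 0.0} for k in keys}

    def report_failure(self, key):
        # Each consecutive failure moves the key up one tier, capped at the last.
        s = self._state[key]
        tier = min(s["failures"], len(self.TIERS) - 1)
        s["failures"] += 1
        s["ready_at"] = self._clock() + self.TIERS[tier]

    def report_success(self, key):
        self._state[key] = {"failures": 0, "ready_at": 0.0}

    def acquire(self, block=False):
        now = self._clock()
        ready = [k for k, s in self._state.items() if s["ready_at"] <= now]
        if ready:
            return ready[0]
        # Every key is in backoff: find the one that recovers first.
        soonest = min(self._state, key=lambda k: self._state[k]["ready_at"])
        wait = self._state[soonest]["ready_at"] - now
        if not block:
            # Fail fast and tell the caller how long until a key frees up.
            raise AllKeysInBackoff(f"all keys cooling down, retry in {wait:.1f}s")
        # Blocking mode: wait out the shortest cooldown, then hand back that key.
        self._sleep(wait)
        return soonest
```

With `block=False` the caller gets an exception right away and can decide what to do; with `block=True` the call stalls for the shortest remaining cooldown. Whichever one is the default is the thing I'd want documented.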