

by akarshc 144 days ago

While building AI features that rely on real-time streaming responses, I kept running into failures that were hard to reason about once things went async.

Requests would partially stream, providers would throttle or fail mid-stream, and retry logic ended up scattered across background jobs, webhooks, and request handlers.

I built ModelRiver as a thin API layer that sits between an app and AI providers and centralizes streaming, retries, failover, and request-level debugging in one place.
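To make the pattern concrete, here is a minimal sketch of retry and failover wrapped around a streaming call in one place. It is not ModelRiver's implementation or API. The `Provider` type, `stream_with_failover`, `collect`, and the `"chunk"`/`"reset"` event names are all hypothetical, and the sketch assumes a provider is just a function that yields text chunks and may raise partway through.

```python
import time
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical provider type: takes a prompt, yields text chunks,
# and may raise at any point (including mid-stream).
Provider = Callable[[str], Iterator[str]]
Event = Tuple[str, str]


class AllProvidersFailed(Exception):
    """Raised when every provider has exhausted its attempts."""


def stream_with_failover(
    providers: List[Tuple[str, Provider]],
    prompt: str,
    max_retries: int = 2,
    base_delay: float = 0.0,
) -> Iterator[Event]:
    """Yield ("chunk", text) events. When a partially delivered stream is
    abandoned, yield ("reset", reason) so the caller can discard it."""
    errors: List[str] = []
    for name, provider in providers:
        for attempt in range(max_retries + 1):
            emitted = False
            try:
                for chunk in provider(prompt):
                    emitted = True
                    yield ("chunk", chunk)
                return  # stream finished cleanly
            except Exception as exc:
                errors.append(f"{name} attempt {attempt + 1}: {exc}")
                if emitted:
                    # The caller already saw partial output; tell it to drop it.
                    yield ("reset", errors[-1])
                # Exponential backoff before the next attempt or provider.
                time.sleep(base_delay * (2 ** attempt))
    raise AllProvidersFailed("; ".join(errors))


def collect(events: Iterable[Event]) -> str:
    """Rebuild the final text, clearing the buffer on every reset."""
    parts: List[str] = []
    for kind, payload in events:
        if kind == "reset":
            parts.clear()
        else:
            parts.append(payload)
    return "".join(parts)
```

The explicit reset event is one possible design choice: the consumer never has to guess whether text it already rendered came from a stream that later failed. A real layer would also need timeouts, cancellation, and per-request logging, which this sketch leaves out.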

It’s early and opinionated, and there are tradeoffs. Happy to answer technical questions or hear how others are handling streaming reliability in production AI apps.