Show HN: Tokio-prompt-orchestrator – LLM pipeline orchestration in Rust



2 points by Shmungus 111 days ago

I built this after getting frustrated with "multi-agent" frameworks that claim parallelism but are really just one fat async task with no resource bounds.

tokio-prompt-orchestrator breaks LLM inference into 5 physical stages (RAG → Assemble → Inference → Post-Process → Stream), each running in its own Tokio task with bounded channels between them. When a stage falls behind, backpressure builds locally instead of blowing up the whole pipeline.

Some things that might be interesting to folks here:

- Circuit breakers per provider (OpenAI, Anthropic, local llama.cpp) so one failing API doesn't cascade
- Request deduplication that saved 60-80% on inference costs in my testing
- Prometheus metrics + a TUI dashboard for watching the pipeline in real time
- MCP server integration so you can use it as a Claude Desktop tool
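
To make the staged layout concrete, here is a minimal sketch of five stages joined by bounded channels. It is not code from the repo: it swaps in std threads and `std::sync::mpsc::sync_channel` for Tokio tasks and Tokio's bounded channels so it runs without dependencies, and `spawn_stage`, `run_pipeline`, and the string-wrapping stage bodies are hypothetical placeholders.

```rust
use std::sync::mpsc::{sync_channel, Receiver, SyncSender};
use std::thread::{self, JoinHandle};

/// Spawn one stage: read from `rx`, apply `f`, write to `tx`.
/// `tx.send` blocks while the downstream queue is full, so a slow
/// stage only stalls its immediate upstream neighbour.
fn spawn_stage<F>(rx: Receiver<String>, tx: SyncSender<String>, f: F) -> JoinHandle<()>
where
    F: Fn(String) -> String + Send + 'static,
{
    thread::spawn(move || {
        for item in rx {
            if tx.send(f(item)).is_err() {
                break; // downstream receiver is gone, stop this stage
            }
        }
        // `tx` is dropped here, which lets the next stage finish too.
    })
}

/// Push `prompts` through five stages; `capacity` bounds every channel.
/// (Hypothetical helper; the stage bodies just wrap the string.)
fn run_pipeline(prompts: Vec<String>, capacity: usize) -> Vec<String> {
    let (input_tx, rag_rx) = sync_channel::<String>(capacity);
    let (rag_tx, assemble_rx) = sync_channel::<String>(capacity);
    let (assemble_tx, infer_rx) = sync_channel::<String>(capacity);
    let (infer_tx, post_rx) = sync_channel::<String>(capacity);
    let (post_tx, stream_rx) = sync_channel::<String>(capacity);
    let (stream_tx, output_rx) = sync_channel::<String>(capacity);

    let handles = vec![
        spawn_stage(rag_rx, rag_tx, |p| format!("rag({})", p)),
        spawn_stage(assemble_rx, assemble_tx, |p| format!("assemble({})", p)),
        spawn_stage(infer_rx, infer_tx, |p| format!("infer({})", p)),
        spawn_stage(post_rx, post_tx, |p| format!("post({})", p)),
        spawn_stage(stream_rx, stream_tx, |p| format!("stream({})", p)),
    ];

    // Feed from a separate thread so this thread can drain the output
    // while the bounded input channel is still filling.
    let feeder = thread::spawn(move || {
        for p in prompts {
            if input_tx.send(p).is_err() {
                break;
            }
        }
    });

    // Ends once the last stage drops its sender.
    let results: Vec<String> = output_rx.iter().collect();
    feeder.join().unwrap();
    for h in handles {
        h.join().unwrap();
    }
    results
}
```

Calling `run_pipeline(vec!["a".to_string()], 1)` returns the prompt wrapped by each stage in order. With a capacity of 1, a stalled stage blocks only the sender directly upstream of it, which is the local backpressure described above.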

It's 58k lines of Rust, MIT licensed, no unsafe. Been running it in production for my own projects for a few months now. Would love feedback on the channel sizing heuristics and the retry/backoff strategy; those were the hardest parts to get right. Happy to answer questions about the architecture.
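
For anyone who wants a baseline to compare the retry/backoff strategy against, here is a minimal sketch of capped exponential backoff. It is an assumption about the general shape, not the strategy the repo uses; `backoff_delay` and `retry_with_backoff` are hypothetical names, and the sketch is synchronous and std-only rather than Tokio-based.

```rust
use std::thread;
use std::time::Duration;

/// Delay before retry number `attempt` (0-based): base * 2^attempt,
/// capped at `max`. Overflow saturates to `max`.
fn backoff_delay(attempt: u32, base: Duration, max: Duration) -> Duration {
    let factor = 2u32.checked_pow(attempt).unwrap_or(u32::MAX);
    base.checked_mul(factor).unwrap_or(max).min(max)
}

/// Call `op` until it succeeds or `max_attempts` calls have failed,
/// sleeping between failures. `op` is always called at least once.
fn retry_with_backoff<T, E, F>(
    max_attempts: u32,
    base: Duration,
    max: Duration,
    mut op: F,
) -> Result<T, E>
where
    F: FnMut() -> Result<T, E>,
{
    let mut attempt = 0u32;
    loop {
        match op() {
            Ok(value) => return Ok(value),
            Err(err) => {
                attempt += 1;
                if attempt >= max_attempts {
                    return Err(err); // out of attempts, surface the last error
                }
                thread::sleep(backoff_delay(attempt - 1, base, max));
            }
        }
    }
}
```

With a 100 ms base and a 2 s cap, the delays run 100 ms, 200 ms, 400 ms, 800 ms, 1.6 s, and then stay at 2 s. Production versions usually add random jitter so that many clients retrying at once don't hit the provider in lockstep; that is left out here to keep the sketch dependency-free.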

GitHub: https://github.com/Mattbusel/tokio-prompt-orchestrator