Hacker News
by yawnxyz, 27 days ago
self-hosting is fine, but even if you had a $100k god box running Opus-level LLMs, you'd still grind it to a halt if you tried running 5-10 parallel inference streams