Hacker News
by pavpanchekha 140 days ago
Frontier models are now much bigger than any individual query, hence batching, MoE, etc. So this idea, while very plausible, runs into economic constraints: you'd need vast amounts of memory.