by robmay 5 days ago
Definitely. When you consider how varied inference workloads will be, and the different ways to minimize costs (better prompting, SLMs, different chips, batching, etc.), there will be tons of opportunity.