Hacker News
by robmay, 5 days ago
Definitely. When you consider how varied inference workloads will be, and how many different ways there are to minimize costs (better prompting, SLMs, different chips, batching, etc.), there will be tons of opportunity.