Hacker News
by zozbot234, 145 days ago
Prompt processing/prefill can most likely even get some speedup from using a local NPU: when you're ultimately limited by thermal/power throttling, having more efficient compute available means more headroom.
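To make the headroom argument concrete, here is a rough back-of-the-envelope sketch. It assumes a deliberately simple model that is not from any vendor spec: sustained throughput under a fixed power cap is just watts times operations per joule, and prefill work can be split freely between units. The function names and every number below are hypothetical, picked only to show the shape of the effect.

```python
# Toy model (assumption): under a thermal/power cap, sustained throughput
# is power budget (W) multiplied by efficiency (ops per joule).

def sustained_throughput(power_budget_w, ops_per_joule):
    """Ops per second a unit can sustain when limited only by its power budget."""
    return power_budget_w * ops_per_joule


def split_throughput(power_budget_w, gpu_ops_per_joule, npu_ops_per_joule, npu_power_share):
    """Total ops per second when a fraction of the budget goes to the NPU."""
    npu_w = power_budget_w * npu_power_share
    gpu_w = power_budget_w - npu_w
    return (sustained_throughput(gpu_w, gpu_ops_per_joule)
            + sustained_throughput(npu_w, npu_ops_per_joule))


# Hypothetical figures, not measurements of any real device.
BUDGET_W = 20.0
GPU_OPS_PER_J = 1e11
NPU_OPS_PER_J = 3e11  # assumed to be 3x as efficient as the GPU

gpu_only = sustained_throughput(BUDGET_W, GPU_OPS_PER_J)
with_npu = split_throughput(BUDGET_W, GPU_OPS_PER_J, NPU_OPS_PER_J, npu_power_share=0.25)

print(f"GPU only: {gpu_only:.3g} ops/s")
print(f"GPU + NPU: {with_npu:.3g} ops/s ({with_npu / gpu_only:.2f}x)")
```

In this model, any watts moved to the more efficient unit buy more work per second, so the total goes up even though the power cap is unchanged. Real hardware adds limits this ignores, such as the NPU's own peak throughput, memory bandwidth, and scheduling overhead.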