Hacker News
by bilsbie 1200 days ago
What’s the rough idea of how this is possible? I thought you needed the parallelism of a GPU.
1 comment

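Inference is under much less pressure to parallelize than training. It only runs the forward pass, usually on one input or a small batch, with no gradients to compute.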
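
To put a rough number on that, here is a back-of-the-envelope sketch in Python. It assumes a plain stack of dense layers and the usual rule of thumb that the backward pass costs about twice the forward pass. The layer widths and batch size are hypothetical, picked only for illustration, and say nothing about the model being discussed.

```python
def forward_macs(layer_sizes):
    """Multiply-accumulates for one forward pass of one input through dense layers."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


def training_step_macs(layer_sizes, batch_size):
    """Rough cost of one training step, assuming backward is about 2x forward."""
    return 3 * batch_size * forward_macs(layer_sizes)


# Hypothetical layer widths and batch size, for illustration only.
layers = [1024, 4096, 1024]
inference = forward_macs(layers)             # one input, forward pass only
training = training_step_macs(layers, 256)   # one batch, forward + backward

print(inference, training, training // inference)
```

Under these assumptions a single training step does several hundred times the arithmetic of a single inference call, which is the kind of gap that makes a GPU close to mandatory for training but optional for inference.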