Y
Hacker News
new
|
ask
|
show
|
jobs
by
entropicdrifter
942 days ago
And potentially for distributing a model across several devices at inference time. You could devote a cluster of smaller/weaker machines to inference.
1 comments
sroussey
942 days ago
You can do that today, the only advantage today though is being able to fix the model in memory. It’s sequential and slower due to communication costs, though batching might be faster?
link