Hacker News new | ask | show | jobs
by pests 190 days ago
It's more than just the model weights. During inference there would be a lot of cross-talk as each node broadcasts its results and gathers up what it needs from the others for the next step.