Hacker News
by
DesaiAshu
85 days ago
Data bandwidth limits distributed training under current architectures. There are really interesting implications if we can make progress on that.
2 comments
dogcomplex
85 days ago
Limits, but doesn't prohibit. See https://www.primeintellect.ai/blog/intellect-3 - still useful and can scale enormously. It takes a particular shape and relies heavily on RL, but it's still big.
andoando
84 days ago
What bandwidth limits? I'm assuming the forward and backward passes have to be done sequentially?
DesaiAshu
83 days ago
Yes, and also passing data within each layer.
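For a rough sense of scale on the bandwidth point, here is a back-of-the-envelope sketch. It is not taken from the linked post. It assumes plain data parallelism where every worker synchronizes the full gradient each step with a ring all-reduce, and it ignores latency, compression, and any overlap with compute. The function name, parameter count, precision, worker count, and link speeds are all hypothetical values chosen for illustration.

```python
def ring_allreduce_seconds(num_params, bytes_per_param, num_workers, link_bytes_per_sec):
    """Bandwidth-only estimate for one ring all-reduce of the gradients.

    Assumption: in a ring all-reduce, each worker sends (and receives)
    2 * (N - 1) / N times the gradient payload. Latency is ignored.
    """
    if num_workers < 2:
        return 0.0  # a single worker has nothing to synchronize
    payload_bytes = num_params * bytes_per_param
    traffic_bytes = 2 * (num_workers - 1) / num_workers * payload_bytes
    return traffic_bytes / link_bytes_per_sec


if __name__ == "__main__":
    # Hypothetical setup: 1B parameters, 2-byte gradients, 8 workers.
    links = [
        ("1 Gbit/s internet link", 1e9 / 8),
        ("400 Gbit/s datacenter link", 400e9 / 8),
    ]
    for label, bytes_per_sec in links:
        seconds = ring_allreduce_seconds(1_000_000_000, 2, 8, bytes_per_sec)
        print(f"{label}: {seconds:.3f} s per gradient sync")
```

Under these assumptions the per-step sync time scales inversely with link speed, which is why approaches that sync less often or send less data per sync matter so much outside a datacenter.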