by p1esk
1440 days ago
The problem with NCCL is that it reports combined bandwidth: NVLink (intra-node) plus network. I want to see the network traffic on its own, for example to identify a network link bottleneck when changing model or pipeline parallelism configuration. P.S. If you solve this, I'll become a paying customer.
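
For what it's worth, a rough way to get a network-only number today is to sample the NIC's own port counters, which never see NVLink traffic. The sketch below assumes Linux with InfiniBand or RoCE adapters that expose `port_xmit_data` and `port_rcv_data` under sysfs; the device name `mlx5_0` and the helper names are hypothetical, and the sleep interval is used as an approximation of elapsed time.

```python
import time
from pathlib import Path

# Assumption: the RDMA NIC exposes its port counters at this sysfs location.
COUNTER_DIR = "/sys/class/infiniband/{dev}/ports/{port}/counters"


def read_port_words(dev, port=1):
    """Return (tx, rx) raw counter values for one NIC port."""
    base = Path(COUNTER_DIR.format(dev=dev, port=port))
    tx = int((base / "port_xmit_data").read_text())
    rx = int((base / "port_rcv_data").read_text())
    return tx, rx


def words_to_gbps(delta_words, seconds):
    """Convert a counter delta to Gbit/s; these counters tick in 4-byte words."""
    return delta_words * 4 * 8 / seconds / 1e9


def sample_bandwidth(dev, port=1, interval=1.0, reader=read_port_words):
    """Sample the counters twice and return (tx_gbps, rx_gbps) for the interval."""
    tx0, rx0 = reader(dev, port)
    time.sleep(interval)
    tx1, rx1 = reader(dev, port)
    return words_to_gbps(tx1 - tx0, interval), words_to_gbps(rx1 - rx0, interval)


if __name__ == "__main__":
    # "mlx5_0" is a placeholder; list /sys/class/infiniband to find real devices.
    tx, rx = sample_bandwidth("mlx5_0")
    print(f"tx {tx:.2f} Gbit/s, rx {rx:.2f} Gbit/s")
```

Running this per NIC while a collective is in flight gives per-link numbers that can be compared across parallelism configurations, though it does not attribute traffic to a specific NCCL operation.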