Hacker News
by ivalm, 2170 days ago
But batch size is probably the least of the problems, since you can use data parallelism (send half the batch to each GPU, then combine the results on the CPU).
I think a model bigger than GPU memory is the only case where you really wish for NVLink on V100s.
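
A rough sketch of the half-batch idea, with plain Python standing in for the two GPUs. The 1-D linear model and the names `grad_mse` and `data_parallel_grad` are made up for illustration and are not any framework's API; a real setup would run each shard on its own device.

```python
# Simulated data parallelism: each "device" handles one shard of the batch,
# and the per-shard gradients are combined on the host (the "CPU" step).

def grad_mse(w, b, xs, ys):
    """Gradient of mean squared error for y = w*x + b over one shard."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def data_parallel_grad(w, b, xs, ys, num_devices=2):
    """Split the batch into shards, compute per-shard gradients, combine them."""
    shard = (len(xs) + num_devices - 1) // num_devices  # ceil division
    total_dw = 0.0
    total_db = 0.0
    for d in range(num_devices):
        sx = xs[d * shard:(d + 1) * shard]
        sy = ys[d * shard:(d + 1) * shard]
        if not sx:
            continue  # more devices than samples
        dw, db = grad_mse(w, b, sx, sy)  # hypothetically runs on device d
        # Weight by shard size so uneven shards still match the full batch.
        total_dw += dw * len(sx)
        total_db += db * len(sx)
    n = len(xs)
    return total_dw / n, total_db / n


xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]
full = grad_mse(0.5, 0.0, xs, ys)             # whole batch on one device
split = data_parallel_grad(0.5, 0.0, xs, ys)  # half the batch per device
```

The combined gradient matches the full-batch one, which is why batch size alone rarely forces you onto a faster interconnect. A model that does not fit in one GPU's memory is a different story, since the weights themselves have to be split and activations have to cross devices on every step.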