Hacker News
by rahimnathwani 513 days ago
> If you place experts in different GPUs
Right, this is described in the DeepSeek-V3 paper (section 3.4 on pages 18-20).