Hacker News
by boroboro4 545 days ago
They discuss it in the paper and recommend 32 GPUs (H800 in their case) for the prefill stage and 320 GPUs for decoding.
=)