Hacker News
by
embedding-shape
60 days ago
Yes, that's why I'm asking you what exactly four 3090s get in prompt processing and generation. Sorry if I was unclear.
mips_avatar
60 days ago
Maxes out around 4K tok/s output. Each pair of 3090s runs its own instance of the model, with parallelism across the NVLink bridge. Though NVLink is only 2x over PCIe 5.