Hacker News
by ioedward 1189 days ago
Normally people split up the model across multiple GPUs, i.e. model/tensor parallelism.
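To make that concrete, here is a rough sketch of the tensor-parallel idea, simulated on the CPU in plain Python. Nothing here touches a real GPU: each "device" is just a list entry, and the helper names (`split_columns`, `tensor_parallel_matmul`) are made up for illustration. It assumes a column-wise split of a single weight matrix, which is only one of several ways to shard a layer.

```python
# Sketch only: column-wise tensor parallelism, simulated without real GPUs.
# All function names are hypothetical and exist just for this example.

def matmul(x, w):
    """Multiply an (n x k) matrix by a (k x m) matrix, both given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*w)] for row in x]

def split_columns(w, num_devices):
    """Split w column-wise into num_devices equally sized shards."""
    cols = len(w[0])
    assert cols % num_devices == 0, "columns must divide evenly across devices"
    step = cols // num_devices
    return [[row[i * step:(i + 1) * step] for row in w] for i in range(num_devices)]

def tensor_parallel_matmul(x, w, num_devices):
    """Compute x @ w by giving each simulated device one column shard of w."""
    shards = split_columns(w, num_devices)
    # In a real setup each of these products would run on its own GPU.
    partials = [matmul(x, shard) for shard in shards]
    # Concatenate the partial outputs along the column axis
    # (the role an all-gather plays across real devices).
    return [sum((p[r] for p in partials), []) for r in range(len(x))]

x = [[1.0, 2.0], [3.0, 4.0]]
w = [[1.0, 0.0, 2.0, 1.0], [0.0, 1.0, 1.0, 3.0]]
result = tensor_parallel_matmul(x, w, num_devices=2)
```

The point is that every device holds only a slice of the weights, yet the stitched-together result matches the unsharded product `matmul(x, w)`.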