by fnbr
743 days ago
The rule of thumb is roughly 44 GB for the weights: most models are trained in bf16, which takes 16 bits (2 bytes) per parameter. You need a bit more for activations, so maybe 50 GB? You need enough RAM and enough HBM (GPU RAM), so it’s a constraint on both.
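To make the arithmetic concrete, here's a rough back-of-envelope sketch in Python. The 22B parameter count is my assumption, worked backwards from 44 GB at 2 bytes per parameter, and the 15% activation overhead is just a guess picked to land near the "maybe 50 GB" figure. The function names are made up for illustration.

```python
def weights_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Memory for the weights alone, in GB (bf16 = 2 bytes per parameter)."""
    return n_params * bytes_per_param / 1e9


def estimate_gb(n_params: float, bytes_per_param: int = 2,
                overhead: float = 0.15) -> float:
    """Weights plus a rough allowance for activations.

    `overhead` is a hypothetical fudge factor, not a measured value.
    """
    return weights_gb(n_params, bytes_per_param) * (1 + overhead)


if __name__ == "__main__":
    n_params = 22e9  # assumption: 44 GB / 2 bytes per parameter
    print(f"weights only:     {weights_gb(n_params):.1f} GB")
    print(f"with activations: {estimate_gb(n_params):.1f} GB")
```

Real activation memory depends on batch size, context length, and the runtime, so treat the second number as a ballpark only.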