|
|
|
|
|
by borzunov
1260 days ago
|
|
clarification: You can also use offloading on Colab, but inference with offloading is at least 10x slower (see other comment threads). So it can't really be used for interactive inference, but may be used for fine-tuning with large batches/sequence lengths. |
|