| HN Mirror

	by borzunov 1307 days ago
	Clarification: you can also use offloading on Colab, but inference with offloading is at least 10x slower (see other comment threads). So it can't really be used for interactive inference, but it may be usable for fine-tuning with large batches or sequence lengths.
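
	A rough way to see why batch size matters here is the back-of-the-envelope sketch below. It assumes a deliberately simple cost model that is not from the comment: every forward pass streams all offloaded weights over the host-to-GPU link once, and the transfer is not overlapped with compute. The function name and all numbers (100 GB of weights, a 10 GB/s link, 10 ms of compute per token) are hypothetical and only illustrate the amortization effect. They do not reproduce the "10x" figure.

```python
def seconds_per_token(weight_bytes, link_bytes_per_s, tokens_per_pass, compute_s_per_token):
    """Estimate time per token under a simplified offloading cost model.

    Assumption (hypothetical model): each forward pass transfers all
    offloaded weights once, and that transfer cost is shared evenly by
    every token processed in the pass.
    """
    transfer_s = weight_bytes / link_bytes_per_s
    return transfer_s / tokens_per_pass + compute_s_per_token


# Illustrative, made-up numbers (not measurements).
WEIGHTS = 100e9   # bytes of offloaded weights
LINK = 10e9       # bytes per second over the host-to-GPU link
COMPUTE = 0.01    # seconds of GPU compute per token

# Interactive generation: one new token per forward pass.
interactive = seconds_per_token(WEIGHTS, LINK, 1, COMPUTE)

# Fine-tuning style pass: many tokens share one weight transfer.
batched = seconds_per_token(WEIGHTS, LINK, 2048, COMPUTE)

print(f"interactive: {interactive:.4f} s/token")
print(f"batched:     {batched:.4f} s/token")
```

	Under these assumptions the transfer cost dominates when a pass handles a single token and mostly disappears when it is spread over thousands of tokens, which matches the point that offloading hurts interactive inference far more than large-batch fine-tuning.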