| HN Mirror



	by ttt3ts 903 days ago
	For large models that don't fit in a single GPU's VRAM, you have to pass the context between GPUs, which often ends up slower. Also, tooling around AMD GPUs is still poor in comparison.
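
	To make the slowdown concrete, here is a rough back-of-envelope sketch. It assumes the model is split layer-wise across GPUs, a single request is processed at a time, and transfers do not overlap with compute. The function names and all parameters are hypothetical, chosen for illustration only, and nothing here is a measurement.

```python
def transfer_seconds(hidden_size, bytes_per_value, link_bytes_per_s, link_latency_s):
    """Estimated time to move one token's activation vector across one GPU boundary."""
    return link_latency_s + hidden_size * bytes_per_value / link_bytes_per_s


def per_token_seconds(compute_s, num_gpus, hidden_size, bytes_per_value,
                      link_bytes_per_s, link_latency_s):
    """Estimated per-token latency for a layer-wise split across num_gpus.

    Assumption: for a single token the stages run one after another, so
    splitting does not reduce compute time; it only adds one transfer per
    GPU boundary.
    """
    hops = num_gpus - 1
    hop_cost = transfer_seconds(hidden_size, bytes_per_value,
                                link_bytes_per_s, link_latency_s)
    return compute_s + hops * hop_cost


if __name__ == "__main__":
    # Placeholder values, not benchmarks: substitute numbers for your own hardware.
    for gpus in (1, 2, 4):
        t = per_token_seconds(compute_s=0.020, num_gpus=gpus, hidden_size=8192,
                              bytes_per_value=2, link_bytes_per_s=16e9,
                              link_latency_s=20e-6)
        print(f"{gpus} GPU(s): {t * 1e3:.3f} ms per token (estimated)")
```

	Under these assumptions the extra GPUs add capacity, not speed: each boundary only adds transfer time on top of the same compute. Real setups can hide part of this with batching or overlap, which this sketch ignores.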