HN Mirror



	by ofirpress 1398 days ago
	Cool new efficient inference method for large language models that halves memory use without degrading performance! More from the author at: https://twitter.com/Tim_Dettmers/status/1559892888326049792