


	by casercaramel144 890 days ago
	It's camel. How do you do matrix-vector attention without keeping the full matrix in cache? Surely you don't just load and unload it a million times.