| HN Mirror


	by MacsHeadroom 915 days ago
	That's a great question, and I would like to know too. It looks like the answer is: substantially faster than an equally sized Transformer, and the end result will score better than a Transformer on basically every benchmark. It will also run inference 3-5x faster in half the RAM.