	by zuzun 1173 days ago
	If I understand correctly, the model in your paper only attends to preceding tokens. Can the constant bias matrix be made symmetric for unmasked tasks?
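
	To make the question concrete, here is a minimal sketch. It assumes the constant bias is a linear penalty on query-key distance added to the attention scores; the thread does not spell out the paper's exact form, so that is a guess. The function names and the `slope` parameter are hypothetical, chosen only for illustration. The causal variant masks future positions, while the symmetric variant penalizes distance in both directions.

```python
def causal_bias(n, slope):
    """Bias for masked attention: query i sees keys j <= i only.

    Visible keys get a penalty proportional to their distance from the
    query; future keys get -inf so softmax assigns them zero weight.
    """
    return [
        [-slope * (i - j) if j <= i else float("-inf") for j in range(n)]
        for i in range(n)
    ]


def symmetric_bias(n, slope):
    """Hypothetical unmasked variant: penalize |i - j| in both directions."""
    return [[-slope * abs(i - j) for j in range(n)] for i in range(n)]


def is_symmetric(m):
    """Return True if the square matrix m equals its transpose."""
    n = len(m)
    return all(m[i][j] == m[j][i] for i in range(n) for j in range(n))
```

	One thing the symmetric form gives up: a key two positions to the left and one two positions to the right receive the same bias, so the bias alone no longer tells the model which direction a token lies in.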