


	by GaggiX 225 days ago
	> In general encoder+decoder models are much more efficient at inference than decoder-only models because they run over the entire input all at once (which leverages parallel compute more effectively).

	Decoder-only models also run over the entire input all at once; the only difference is that they use masked (causal) attention.
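
To make the masked-attention point concrete, here is a minimal sketch in plain Python (no libraries). It computes single-head attention over a whole toy sequence in one pass, with and without a causal mask. The function names, the toy vectors, and the choice to reuse the same vectors as queries, keys, and values are all assumptions for illustration; real models use learned projections and batched matrix operations.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(q, k, v, causal):
    """Single-head attention over a full sequence in one pass.

    q, k, v are lists of n vectors, each of dimension d (hypothetical toy setup).
    """
    n, d = len(q), len(q[0])
    out = []
    for i in range(n):
        # Causal mask: position i may only attend to positions 0..i.
        # Without the mask, every position attends to the whole input.
        visible = range(i + 1) if causal else range(n)
        scores = [
            sum(a * b for a, b in zip(q[i], k[j])) / math.sqrt(d)
            for j in visible
        ]
        weights = softmax(scores)
        out.append(
            [sum(w * v[j][c] for w, j in zip(weights, visible)) for c in range(d)]
        )
    return out

# Toy "prompt" of three token vectors (made-up values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

masked = attention(x, x, x, causal=True)           # decoder-style
bidirectional = attention(x, x, x, causal=False)   # encoder-style

# Running the masked version on a shorter prefix gives the same outputs
# for those positions, because no position ever looks at later tokens.
masked_prefix = attention(x[:2], x[:2], x[:2], causal=True)
```

Both calls process every position of the input in the same pass; the mask only changes which positions each one is allowed to see. Here `masked_prefix` matches the first two rows of `masked`, while the first row of `bidirectional` differs because it also attends to later tokens.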