| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by Jorge1o1 190 days ago
	Well, pretty much all of the LLMs are based on the decode-only version of the Transformer architecture (in fact it’s the T in GPT). And in the Transformer architecture you’re working with embeddings, which are exactly what this article is about, the vector representation of words.