


	by snyhlxde 769 days ago
	A bit more intuition on how training works: in natural language, some phrases and collocations, for example "remind ... of ...", "make a decision", "learn a skill", are commonly used together. We can ask LLMs to learn such collocations and frequently appearing n-grams. After training, the model can use parallel decoding to predict, in one forward pass, many tokens that frequently appear together.
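	To make "frequently appearing n-grams" concrete, here is a minimal sketch that counts trigrams in a tiny corpus. It only illustrates the co-occurrence statistics the comment refers to; it is not the training or decoding procedure itself. The corpus, the function name `ngram_counts`, and the frequency threshold of 2 are all made up for illustration.

```python
from collections import Counter


def ngram_counts(sentences, n):
    """Count every contiguous n-token window in whitespace-tokenized sentences."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        # Slide a window of width n over the tokens of this sentence.
        for i in range(len(tokens) - n + 1):
            counts[tuple(tokens[i:i + n])] += 1
    return counts


# Hypothetical toy corpus, invented for this example only.
corpus = [
    "we need to make a decision today",
    "she had to make a decision quickly",
    "it is hard to learn a skill",
    "you can learn a skill online",
]

trigrams = ngram_counts(corpus, 3)

# Keep trigrams seen at least twice (the threshold is an arbitrary assumption).
frequent = {gram for gram, count in trigrams.items() if count >= 2}

for gram in sorted(frequent):
    print(" ".join(gram), trigrams[gram])
```

	Spans like "make a decision" and "learn a skill" show up as repeated trigrams here. Those are the kind of token groups a model could plausibly learn to emit together rather than one at a time.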