| HN Mirror


	by cl42 1162 days ago
	Those hacks are literally how a large language model works: a transformer architecture predicts the next token in a sequence. They take advantage of how a function that picks the token with the highest probability of appearing next behaves.
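
	A minimal sketch of that last point, assuming plain greedy decoding (always take the argmax). The token names and scores are made up for illustration and `greedy_next_token` is a hypothetical helper, not any library's API. Real models work over large vocabularies and often sample instead of taking the argmax.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    m = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(score - m) for tok, score in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def greedy_next_token(logits):
    """Pick the single most probable token (greedy decoding)."""
    probs = softmax(logits)
    return max(probs, key=probs.get)

# Hypothetical scores a model might assign to candidate next tokens.
plain_prompt = {"yes": 1.0, "no": 2.0, "maybe": 0.5}

# Same candidates after a prompt "hack" that nudges the score of "yes" upward.
hacked_prompt = {"yes": 3.0, "no": 2.0, "maybe": 0.5}

print(greedy_next_token(plain_prompt))
print(greedy_next_token(hacked_prompt))
```

	Since the argmax only cares about which score is largest, a prompt only has to shift the scores enough to change the winner, and the output flips.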