HN Mirror


	by HappMacDonald 598 days ago
	I wonder what would happen if the token input included the logprob of each selected token (or n/a for tokens that came from outside the LLM), and the network were trained with that extra information, especially during the human feedback training at the end.
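To make the idea concrete, here is a minimal sketch of what such an augmented input might look like. It is only an illustration under stated assumptions, not how any existing model is trained: the helper names (`log_softmax`, `augment`), the toy logits, and the choice to encode "n/a" as a zero logprob plus a separate 0/1 flag are all hypothetical.

```python
import math
from typing import List, Optional, Tuple


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def augment(
    token_ids: List[int], logprobs: List[Optional[float]]
) -> List[Tuple[int, float, float]]:
    """Pair each token with (logprob, has_logprob).

    Assumption: "n/a" (a token that came from outside the model) is
    encoded as logprob 0.0 with the flag set to 0.0, so the network can
    tell "certain" apart from "unknown".
    """
    if len(token_ids) != len(logprobs):
        raise ValueError("token_ids and logprobs must have the same length")
    rows = []
    for tok, lp in zip(token_ids, logprobs):
        if lp is None:
            rows.append((tok, 0.0, 0.0))  # external token: no logprob
        else:
            rows.append((tok, lp, 1.0))   # model-selected token
    return rows


# Toy example: two prompt tokens from the user, then one token picked
# greedily from a made-up three-entry logit vector.
prompt = [5, 9]
logits = [2.0, 0.5, -1.0]
lps = log_softmax(logits)
chosen = max(range(len(lps)), key=lps.__getitem__)

features = augment(prompt + [chosen], [None, None, lps[chosen]])
for row in features:
    print(row)
```

In a real model the two extra values would presumably be projected and added to (or concatenated with) the token embedding; that step is omitted here because it depends on the architecture.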