


	by omeze 780 days ago
	This is a really cool paper; it reminds me of the simple exercise Karpathy goes through in his NN video series with a bigram predictor. It looks like in practice there are still some grounding issues when attempting to use them for instruction-tuned applications, but it's a clever direction to explore!