| HN Mirror


	by samsullivan 315 days ago
	Answering correctly depends entirely on the attention blocks somehow capturing single-letter nuance, given the constraints of word tokenization. Does the attention block in Kimi have an architecture that is more receptive to this?
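
To make the tokenization constraint concrete, here is a minimal sketch of why letter-level questions are hard when a model only sees token IDs. Everything in it is an assumption for illustration: the vocabulary, the greedy longest-match scheme, and the example word are hypothetical, and none of it reflects Kimi's actual tokenizer or architecture.

```python
# Hypothetical toy vocabulary; not taken from any real model.
VOCAB = {"straw": 0, "berry": 1, "s": 2, "t": 3, "r": 4,
         "a": 5, "w": 6, "b": 7, "e": 8, "y": 9}


def tokenize(word, vocab=VOCAB):
    """Greedy longest-match split of `word` into token IDs."""
    ids = []
    i = 0
    while i < len(word):
        # Try the longest remaining substring first, then shrink.
        for j in range(len(word), i, -1):
            piece = word[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            raise ValueError(f"cannot tokenize {word[i:]!r}")
    return ids


def count_letter_from_ids(ids, letter, vocab=VOCAB):
    """Count a letter by mapping each ID back to its string.

    A model has no such lookup table; it would need to have learned
    the spelling of each token from training data.
    """
    id_to_piece = {v: k for k, v in vocab.items()}
    return sum(id_to_piece[i].count(letter) for i in ids)


ids = tokenize("strawberry")
print(ids)                              # two opaque IDs, no letters visible
print(count_letter_from_ids(ids, "r"))  # needs the spelling of each token
```

The point of the sketch is that the input to the attention layers is the ID sequence, so any letter count has to come from what the model has memorized about each token's spelling rather than from characters it can inspect directly.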