| HN Mirror



	by overfeed 103 days ago
	> I wonder what the underlying cause is

	It responds with the statistically most probable text based on its training data, which happens to be different with the errors than without them. I suspect high-fidelity diagramming requires a different attention architecture from the common ones used in sentence-optimized models.