HN Mirror



	by visarga 793 days ago
	This shows the power of synthetic content - 3.3 trillion tokens! With this approach a model can be even smaller and more efficient than one trained on organic text, and it won't be able to regurgitate NYT articles because it hasn't seen any of them. This is how copyright infringement claims can be avoided.