


	by ethbr1 357 days ago
	Wouldn't a model that can recite training data verbatim be larger than necessary? Exact text isn't coming from nowhere, no matter how efficiently the bits are encoded, and the same effectiveness should be achievable by compressing those portions of the model.

1 comment

zeven7 357 days ago

Maybe we are all just LLMs. If the books were written by a language-producing algorithm in a human mind, maybe there’s not as much raw data there as it seems, and the total information can in fact be stored in a surprisingly small set of weights.
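
A rough way to see the "less raw data than it seems" point is the sketch below. It uses zlib as a stand-in compressor, which is only an assumption for illustration and says nothing about how an LLM actually stores text. The sample strings and the `compressed_ratio` helper are made up for this example.

```python
import os
import zlib


def compressed_ratio(data: bytes) -> float:
    """Return compressed size divided by raw size (lower means more redundancy)."""
    return len(zlib.compress(data, 9)) / len(data)


# Text produced by a trivial rule (repeat one sentence): very predictable,
# so it carries far less information than its raw byte count suggests.
structured = b"the quick brown fox jumps over the lazy dog. " * 200

# Random bytes of the same length: no structure for the compressor to exploit.
random_bytes = os.urandom(len(structured))

structured_ratio = compressed_ratio(structured)
random_ratio = compressed_ratio(random_bytes)

print(f"structured text ratio: {structured_ratio:.3f}")
print(f"random bytes ratio:    {random_ratio:.3f}")
```

The rule-generated text shrinks to a small fraction of its size while the random bytes barely shrink at all. Real books sit somewhere in between, and how close they sit to the structured end is the open question here.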


ethbr1 357 days ago

I imagine it's not inconceivable that at very high dimensions and with the right architectures, stochastic compression can be unexpectedly efficient. It would be strange if the end result of AI research is realizing we're solving a compression problem (and that our brains are too).
