| HN Mirror



	by floobertoober 1109 days ago
	> Of course, large language models are (by definition) currently going the other direction ...

	How so? Aren't the networks' weights orders of magnitude smaller than the training data?
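
To put rough numbers on the weights-versus-training-data question, here is a back-of-the-envelope sketch. The parameter count, bytes per weight, token count, and bytes per token are all hypothetical values picked for illustration, not figures for any particular model, and `size_ratio` is just a helper name made up for this example.

```python
# Back-of-the-envelope comparison of weight size vs. training-text size.
# All four inputs below are hypothetical values chosen for illustration only.

def size_ratio(n_params, bytes_per_param, n_tokens, bytes_per_token):
    """Return (weight_bytes, data_bytes, data_bytes / weight_bytes)."""
    weight_bytes = n_params * bytes_per_param
    data_bytes = n_tokens * bytes_per_token
    return weight_bytes, data_bytes, data_bytes / weight_bytes

weights, data, ratio = size_ratio(
    n_params=7_000_000_000,      # assumed: 7e9 parameters
    bytes_per_param=2,           # assumed: 16-bit weights
    n_tokens=1_000_000_000_000,  # assumed: 1e12 training tokens
    bytes_per_token=4,           # assumed: ~4 bytes of raw text per token
)
print(f"weights: {weights / 1e9:.0f} GB, data: {data / 1e12:.0f} TB, ratio: {ratio:.0f}x")
# prints "weights: 14 GB, data: 4 TB, ratio: 286x"
```

Under these assumed inputs the weights come out a few hundred times smaller than the raw text, i.e. roughly two orders of magnitude. Different assumptions will shift the ratio, so treat it as a sanity check rather than a measurement.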

3 comments

govg 1109 days ago

I read that statement as saying that the current practice is to make LLMs larger and larger (so they effectively memorize more and more data) in order to make them more powerful. From an information-theory perspective, though, if models really were "understanding" the data, they could stay the same size and still become more powerful as they got better at compressing the available information. I am not sure that is what was meant, though.


karpierz 1109 days ago

I believe the parent poster's point is: LLMs are more effective when they use more memory, meaning the less they are forced to compress the training data, the better they perform.


MauranKilom 1105 days ago

But they don't losslessly recreate the training data.
