HN Mirror



	by didgeoridoo 1019 days ago
	Really interesting! I wonder how well this syncs up with human intuition and general “information density”. If it’s a close match, maybe you could use this as a tool to help with skimming documents — the red (“hard to predict”) areas might be a good hint to slow down and read more carefully, while the green (“easy to predict”) areas might mean you could skim without losing too much unpredictable information.
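A rough sketch of the red/green idea, for illustration only: it scores each token by its surprisal (negative log2 probability) and labels it "red" or "green" against a cutoff. The add-one-smoothed unigram model here is a toy stand-in for a real language model's per-token probabilities, and the names `surprisals`, `label_tokens`, and `threshold_bits` are hypothetical, not taken from the tool under discussion.

```python
import math
from collections import Counter


def surprisals(tokens, corpus_tokens):
    """Return per-token surprisal in bits under a toy unigram model.

    Assumption: a unigram model with add-one smoothing stands in for a
    real language model; a real tool would use the model's own
    conditional token probabilities instead.
    """
    counts = Counter(corpus_tokens)
    # Vocabulary covers both the reference corpus and the scored tokens,
    # so unseen tokens still receive a non-zero probability.
    vocab = set(corpus_tokens) | set(tokens)
    total = len(corpus_tokens) + len(vocab)
    return [-math.log2((counts[t] + 1) / total) for t in tokens]


def label_tokens(tokens, corpus_tokens, threshold_bits):
    """Label each token "red" (hard to predict) or "green" (easy to predict)."""
    scores = surprisals(tokens, corpus_tokens)
    return [
        (tok, "red" if score > threshold_bits else "green")
        for tok, score in zip(tokens, scores)
    ]


if __name__ == "__main__":
    corpus = "the cat sat on the mat the dog sat on the rug".split()
    for tok, colour in label_tokens(["the", "zebra"], corpus, threshold_bits=3.0):
        print(tok, colour)
```

The cutoff is arbitrary; whether any particular threshold lines up with what a human reader finds informative is exactly the open question raised in this thread.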

2 comments

tgv 1019 days ago

If you're set on reading the document as fast as you can, you will start skipping the "green" bits entirely after having done it a few times. But a likely word such as "not" will not stand out, even though it can change the meaning of the sentence. You'd be better off asking a more comprehensive language model for a summary.


thesephist 1019 days ago

This is definitely an interesting idea I've also pondered before. In my experience (just speaking from intuition), what's "easy" for LMs to predict often doesn't line up with our human expectations for what's "obvious". Often LLMs will learn seemingly "low information content" statistical correlations that just help them lower their training loss.
