Hacker News
by didgeoridoo 1019 days ago
Really interesting! I wonder how well this syncs up with human intuition and general “information density”. If it’s a close match, maybe you could use this as a tool to help with skimming documents — the red (“hard to predict”) areas might be a good hint to slow down and read more carefully, while the green (“easy to predict”) areas might mean you could skim without losing too much unpredictable information.
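For what it's worth, the red/green idea above can be sketched in a few lines. This is only a rough illustration, assuming you already have a per-token probability from some language model. The `label_tokens` helper, the 4-bit threshold, and the example probabilities are all made up here, not taken from the tool being discussed.

```python
import math

def surprisal_bits(prob):
    """Surprisal (information content) in bits of an event with probability prob."""
    return -math.log2(prob)

def label_tokens(token_probs, threshold_bits=4.0):
    """Tag each token 'red' (hard to predict) or 'green' (easy to predict).

    token_probs: list of (token, probability) pairs, assumed to come from
    some language model.
    threshold_bits: hypothetical cutoff chosen for illustration only.
    """
    labeled = []
    for token, prob in token_probs:
        bits = surprisal_bits(prob)
        color = "red" if bits > threshold_bits else "green"
        labeled.append((token, round(bits, 2), color))
    return labeled

# Made-up probabilities, purely for demonstration.
example = [("The", 0.5), ("cat", 0.01), ("did", 0.25), ("not", 0.5)]
for token, bits, color in label_tokens(example):
    print(f"{token:>4}  {bits:5.2f} bits  {color}")
```

A skimming aid would then slow the reader down on the "red" tokens. Whether a fixed threshold tracks human intuition about information density is exactly the open question.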
2 comments

If you're set on reading the document as fast as you can, you will skip the "green" bits after having done it a few times. A likely word such as "not" will not stand out. You'd be better off asking a more comprehensive language model for a summary.
This is definitely an interesting idea, and one I've pondered before. In my experience (just speaking from intuition), what's "easy" for LMs to predict often doesn't line up with our human expectations for what's "obvious". LLMs will often learn seemingly "low information content" statistical correlations that just help them lower their training loss.