| HN Mirror

	by nighthawk454 949 days ago
	I think the analogy is something like this: a simple distribution over all words is just word frequency, which is obviously not a good predictor. The 'information' needed to predict the correct next word is simply not there if you're predicting words in a vacuum. To be practically useful and predict the right words _in context_, the model must condition on more of the sentence/document (i.e., more information). So it should not be surprising that a 'glorified autocomplete' has some degree of "understanding", since it could not be any good as an autocompleter otherwise.
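
A rough sketch of the point above, assuming a made-up toy corpus and a bigram model standing in for "conditioning on context" (real language models condition on far more than one previous word; the names `predict_unigram` and `predict_bigram` are just illustrative). It compares the uncertainty about the next word with and without knowing the previous word:

```python
import math
from collections import Counter, defaultdict

# Toy corpus, invented purely for illustration.
corpus = "the cat sat on the mat . the dog sat on the rug .".split()


def entropy(counts):
    """Shannon entropy (in bits) of the empirical distribution in `counts`."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# "Words in a vacuum": plain frequency of each word in next-word position.
unigram = Counter(corpus[1:])

# "Words in context": next-word counts grouped by the previous word.
bigram = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigram[prev][nxt] += 1

n_pairs = len(corpus) - 1

# H(next): uncertainty about the next word with no context.
h_next = entropy(unigram)

# H(next | prev): average uncertainty once the previous word is known,
# weighted by how often each previous word occurs.
h_next_given_prev = sum(
    (sum(ctr.values()) / n_pairs) * entropy(ctr) for ctr in bigram.values()
)

# Mutual information: how many bits the context tells us about the next word.
mutual_information = h_next - h_next_given_prev


def predict_unigram():
    """Always guess the single most frequent word, ignoring context."""
    return unigram.most_common(1)[0][0]


def predict_bigram(prev):
    """Guess the most frequent word seen after `prev` in the corpus."""
    return bigram[prev].most_common(1)[0][0]


print(f"H(next)        = {h_next:.3f} bits")
print(f"H(next | prev) = {h_next_given_prev:.3f} bits")
print(f"I(prev; next)  = {mutual_information:.3f} bits")
print("no context ->", predict_unigram())
print("after 'sat' ->", predict_bigram("sat"))
```

On this corpus the conditional entropy comes out lower than the unconditional one, and the gap is exactly the information the context carries about the next word. The frequency-only predictor can only ever say "the", while the conditional one gets "on" after "sat".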

1 comments

theGnuMe 949 days ago

That's not information theoretic, that's just conditional probability.


tysam_and 941 days ago

You might want to take another look at Shannon's paper, lol, this statement is quite contradictory. Probability _is_ the backbone of information theory, dude! It's quite incredible.


nighthawk454 948 days ago

It is conditional probability, but conditional probability is a fundamental concept in information theory.
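
For reference, here are the standard textbook definitions, which show how directly the information-theoretic quantities are built from conditional probability. The notation is my own choice for this thread: $X$ is the context and $Y$ is the next word.

```latex
\begin{align}
H(Y) &= -\sum_{y} p(y)\,\log p(y) \\
H(Y \mid X) &= -\sum_{x,y} p(x,y)\,\log p(y \mid x) \\
I(X;Y) &= H(Y) - H(Y \mid X) \;\ge\; 0
\end{align}
```

The last line says that conditioning on context never increases the average uncertainty about the next word, with equality only when $X$ and $Y$ are independent.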
