by buruk 3695 days ago
Prior knowledge solves that problem. A human encounters the same sparsity a computer does when learning from text, but prior knowledge lets us connect rare features to a larger model in which they are, in a way, less rare.

If you think about it, there is an iteration happening within machine learning that is essentially building that prior knowledge about the world by reusing previous models as inputs to new ones. For example, spaCy uses word2vec vectors to do parsing and NER, and then sense2vec uses spaCy POS tags to create word vectors.

sense2vec.spacy.io
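To make the "previous model feeds the next one" idea concrete, here is a toy sketch. It is not how sense2vec is implemented: the hand-tagged corpus is made up, the `word|TAG` token format and the helper names are just for illustration, and plain co-occurrence counts stand in for the actual word2vec training step. The only point is that once an upstream tagger has labelled the tokens, "duck" the noun and "duck" the verb stop sharing one sparse entry and each gets its own contexts.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-tagged toy corpus. In a real pipeline the tags would
# come from an upstream model (a POS tagger), not be written by hand.
tagged_corpus = [
    [("the", "DET"), ("duck", "NOUN"), ("swims", "VERB")],
    [("they", "PRON"), ("duck", "VERB"), ("quickly", "ADV")],
]

def sense_tokens(sentence):
    """Merge each word with its tag so different senses become different tokens."""
    return [f"{word}|{tag}" for word, tag in sentence]

def cooccurrence(corpus, window=1):
    """Count neighbours within `window` positions for every sense token.

    This is a stand-in for training word vectors on the tagged tokens.
    """
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sense_tokens(sentence)
        for i, tok in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    counts[tok][tokens[j]] += 1
    return counts

counts = cooccurrence(tagged_corpus)
print(dict(counts["duck|NOUN"]))
print(dict(counts["duck|VERB"]))
```

Without the tags, both sentences would update a single "duck" entry and the two usages would be blurred together.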

1 comments

>> Prior knowledge solves that problem.

Prior knowledge _might_ solve that problem. It's not really solved yet so who knows. Yeah, work is ongoing and word vectors sound cool and all, but in the past people said the same thing about bag-of-words models and look where we are now.

Humans solve sparsity, sure, we learn language from ridiculously few data points, but who knows what it is that we do, exactly? If we knew, we wouldn't be discussing this.

Let's restate the problem to make sure we're talking about the same thing: the number of possible utterances in a given language that are grammatically correct according to some grammar of that language is infinite (or so large that the universe would end before any utterance was repeated).
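A back-of-the-envelope sketch of how fast that number grows, under assumptions I'm making up for illustration: a toy grammar where a sentence is `NP V NP`, and a noun phrase is either `Det N` or `Det N that V NP` (one relative clause, which can nest). The vocabulary sizes and the function names are arbitrary. Real grammars are far richer, so treat this as a lower-bound flavour of the problem, not a model of any actual language.

```python
def count_noun_phrases(depth, n_det, n_noun, n_verb):
    """Distinct noun phrases with at most `depth` nested relative clauses.

    NP -> Det N
    NP -> Det N "that" V NP
    """
    simple = n_det * n_noun
    if depth == 0:
        return simple
    nested = count_noun_phrases(depth - 1, n_det, n_noun, n_verb)
    return simple + simple * n_verb * nested

def count_sentences(depth, n_det=2, n_noun=3, n_verb=2):
    """Distinct sentences of the form NP V NP, nesting limited to `depth`."""
    np_count = count_noun_phrases(depth, n_det, n_noun, n_verb)
    return np_count * n_verb * np_count

for depth in range(6):
    print(depth, count_sentences(depth))
```

With just 2 determiners, 3 nouns and 2 verbs there are 72 sentences at depth 0 and 12168 at depth 1, and the count keeps multiplying with every extra level of nesting. Remove the depth limit and the set is infinite.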

And it's a problem because it's impossible to count infinity given only finite time. I don't see how prior knowledge, or anything else, can solve this.

Which must mean humans do something else entirely, and all our efforts based on the assumption that you can do some clever search and avoid having to face infinity are misguided and doomed to fail.