by PaulHoule 613 days ago
My guess is the problem is words with high probabilities that happen to be part of a wrong answer.

For one thing, the probability of a word is just the probability of that word occurring in a certain sample; it's not an indicator of truth. (Truth is arguably the most problematic concept in philosophy, in that just introducing it undermines the truth; see "9/11 truther".) It's also not sufficient to pick a "true" word, or to always pick "true" words: the truthfulness of a statement needs to be evaluated based on the statement as a whole.

A word might have a low probability because it competes with a large number of equally likely alternatives, which is not a reason to stop generation.
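
To make that last point concrete, here is a rough sketch with made-up distributions (the function name, the numbers, and the "ratio to the top candidate" measure are all my own assumptions, not anything a particular model or paper uses). The same token probability of 0.02 shows up in two very different situations: one where 50 continuations are equally plausible, and one where a single other continuation dominates.

```python
import math

def token_stats(probs, chosen):
    """Return (p of chosen token, p relative to the top candidate, entropy in bits).

    `probs` is a hypothetical next-token distribution (a list summing to 1);
    `chosen` is the index of the token that was actually generated.
    """
    p = probs[chosen]
    relative = p / max(probs)  # 1.0 means the token ties for most likely
    entropy = -sum(q * math.log2(q) for q in probs if q > 0)
    return p, relative, entropy

# Case A: 50 equally plausible continuations (say, picking a first name).
uniform = [1 / 50] * 50

# Case B: one continuation dominates and the generated token is the outlier.
peaked = [0.98, 0.02]

p_a, rel_a, h_a = token_stats(uniform, chosen=0)
p_b, rel_b, h_b = token_stats(peaked, chosen=1)

print(f"A: p={p_a:.2f} relative={rel_a:.2f} entropy={h_a:.2f} bits")
print(f"B: p={p_b:.2f} relative={rel_b:.2f} entropy={h_b:.2f} bits")
```

In case A the chosen token ties for most likely and the entropy is high, so the low raw probability only reflects how many alternatives there were. In case B the raw probability is identical but the token is far below the top candidate. A cutoff on raw probability alone can't tell the two apart, and neither number says anything about whether the resulting statement is true.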