by celerity
2979 days ago
The article doesn't mention it explicitly, but this is a nice example of how using Bayes' theorem lets you ignore the hard-to-compute normalization term over the input space. In the article, this is the P(w) term of P(c|w) = P(c)P(w|c)/P(w), where c is a candidate correction and w is the original word. The author does talk about this implicitly when he explains that P(c|w) conflates the two factors, but it's also not hard to see that getting a handle on P(w) (the distribution over misspellings) is harder than getting hold of P(c) (the distribution over actual words). Since P(w) is the same for every candidate c, it drops out of the argmax over c, so Bayes lets us get rid of it during optimization.
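To make that concrete, here is a minimal sketch with made-up numbers (the candidate words, priors, and likelihoods are all hypothetical, not taken from the article). It just checks that ranking candidates by P(c)P(w|c) picks the same winner as ranking by the fully normalized posterior, because dividing every score by the same positive constant can't change the order.

```python
# Toy, made-up probabilities purely for illustration; not from the article.
# Suppose the typed word w is "thew" and these are the candidate corrections c.
prior = {"the": 0.07, "thaw": 0.0001, "they": 0.005}    # P(c), hypothetical
likelihood = {"the": 0.02, "thaw": 0.3, "they": 0.01}   # P(w|c), hypothetical


def unnormalized(c):
    """Score proportional to the posterior: P(c) * P(w|c)."""
    return prior[c] * likelihood[c]


# Stand-in for P(w): the sum of the scores over this candidate set.
# (An assumption for the sketch; the real P(w) ranges over every possible c.)
evidence = sum(unnormalized(c) for c in prior)


def posterior(c):
    """Normalized P(c|w) over the candidate set."""
    return unnormalized(c) / evidence


best_unnormalized = max(prior, key=unnormalized)
best_posterior = max(prior, key=posterior)

print(best_unnormalized, best_posterior)
```

Both lines of reasoning pick the same candidate, so you never need to estimate P(w) if all you want is the best correction.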