Hacker News
by tgv 2363 days ago
> We need a lot of data to do this well.

Yup. And once you've got a sufficiently large tagged error corpus, spelling correction will be as simple as a lookup: almost all misspellings (in terms of frequency) will be present in the corpus, and you can drop the (rather simplistic) algorithmic part.
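To make the "as simple as a lookup" idea concrete, here is a minimal sketch. The tiny `TAGGED_ERRORS` list and the names `build_lookup` and `correct` are made up for illustration; a real tagged error corpus would be far larger and is not something this thread provides.

```python
from collections import Counter, defaultdict

# Hypothetical tagged error corpus: (observed misspelling, intended word) pairs.
# These entries are invented for illustration only.
TAGGED_ERRORS = [
    ("teh", "the"),
    ("teh", "the"),
    ("teh", "ten"),
    ("recieve", "receive"),
    ("definately", "definitely"),
]


def build_lookup(pairs):
    """Map each misspelling to the correction it was most often tagged with."""
    counts = defaultdict(Counter)
    for wrong, right in pairs:
        counts[wrong][right] += 1
    return {wrong: c.most_common(1)[0][0] for wrong, c in counts.items()}


def correct(word, lookup):
    """Return the most frequent tagged correction, or the word unchanged."""
    return lookup.get(word.lower(), word)


LOOKUP = build_lookup(TAGGED_ERRORS)
print(correct("teh", LOOKUP))
print(correct("recieve", LOOKUP))
```

Anything that isn't in the table passes through unchanged, so this only ever fixes strings that were seen and tagged as errors.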

2 comments

But what about “they’re”, “there”, and “their”? And other homophones people frequently mix up?
By way of word2vec, or something else?