Hacker News
by onion2k 3679 days ago
If you had 100MB of data I strongly suspect it'd return too many false positives: the likelihood of any given string being in there would be too high. A spellchecker isn't very useful if it thinks "Donald Trumo" is a real name.
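
To make that concrete, here's a rough sketch of the failure mode, assuming the checker is just a membership test against whatever words it has seen. The word lists and the `make_checker` helper are made up for illustration; they aren't how hunspell or any real spellchecker works.

```python
def make_checker(known_words):
    """Return a function that reports whether every word in a phrase is known."""
    known = {w.lower() for w in known_words}

    def is_correct(phrase):
        return all(word.lower() in known for word in phrase.split())

    return is_correct


# Hypothetical word lists, for illustration only.
curated = ["donald", "trump"]
# A big scraped corpus also picks up the typos people actually make.
scraped = curated + ["trumo"]

curated_check = make_checker(curated)
scraped_check = make_checker(scraped)

print(curated_check("Donald Trumo"))  # False: the typo is flagged
print(scraped_check("Donald Trumo"))  # True: the typo slips through
```

The more raw text you feed in, the more of those typos end up in the "known" set, so the checker accepts more and more things it should be flagging.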

For reference, the hunspell dictionary file used in apps like LibreOffice and Firefox is about 400KB. That's effectively the whole of the English language.