| HN Mirror



	by bjourne 2270 days ago
	That's an enormous topic and an enormous can of worms. Modern spell checkers all use statistical methods, meaning that they are trained on a corpus. That allows them to understand that the sequence of tokens [what, i, would, like, to, get, into, is, :] is much more probable than [get, what, i, would, like, into, is, :], i.e., the latter is grammatically incorrect. A good start is to learn about Markov models. For more sophisticated stuff, investigate word vectors and language modeling using recurrent neural networks. The Python library NLTK comes with a free book that can teach you the basics.
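
	To make the Markov model idea concrete, here is a minimal sketch of a bigram model that scores the two token sequences above. The three-sentence corpus is made up for illustration, the function names are hypothetical, and add-one smoothing is just one simple choice; a real checker would be trained on a far larger corpus.

```python
import math
from collections import Counter


def train_bigram_model(sentences):
    """Count unigrams and bigrams over tokenized sentences, with boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        unigrams.update(padded)
        bigrams.update(zip(padded, padded[1:]))
    return unigrams, bigrams


def log_prob(tokens, unigrams, bigrams):
    """Log-probability of a token sequence under an add-one-smoothed bigram model."""
    vocab_size = len(unigrams)
    padded = ["<s>"] + tokens + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded, padded[1:]):
        # P(cur | prev) with add-one smoothing so unseen bigrams get nonzero mass
        total += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return total


# Toy corpus, invented for this example (an assumption, not real training data)
corpus = [
    "what i would like to get into is :".split(),
    "what i would like to know is :".split(),
    "i would like to get into this".split(),
]
unigrams, bigrams = train_bigram_model(corpus)

good = "what i would like to get into is :".split()
bad = "get what i would like into is :".split()

print("good:", log_prob(good, unigrams, bigrams))
print("bad: ", log_prob(bad, unigrams, bigrams))
```

	On this toy corpus the first sequence should score higher than the scrambled one, because the scrambled one contains transitions such as (get, what) and (like, into) that never occur in the training data.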