| HN Mirror



	by cwzwarich 2205 days ago
	An even bigger flaw with most academic research into parser error recovery is that the vast majority of syntax errors arise when a valid program is modified into an invalid one, yet the recovery algorithms are oblivious to this.
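To make that point concrete, here is a minimal sketch of how a tool might use the last version that parsed to narrow down where an error was introduced. It is only an illustration, not any particular parser's recovery algorithm: `changed_regions` and `parses` are hypothetical helper names, and it assumes the editor keeps a copy of the last valid source.

```python
import ast
import difflib


def parses(source: str) -> bool:
    """Return True if `source` is syntactically valid Python."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:
        return False


def changed_regions(last_valid: str, current: str):
    """Return 1-based (start, end) line ranges in `current` that differ
    from `last_valid`. These are the places a recovery algorithm could
    look at first, since the edit that broke the program is in there."""
    old = last_valid.splitlines()
    new = current.splitlines()
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    regions = []
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # A pure deletion has j1 == j2; report the line at that position.
            regions.append((j1 + 1, max(j2, j1 + 1)))
    return regions


# Hypothetical edit: the user adds an opening parenthesis and forgets to close it.
last_valid = "def f(x):\n    return x + 1\n\nprint(f(2))\n"
current = "def f(x):\n    return (x + 1\n\nprint(f(2))\n"

if parses(last_valid) and not parses(current):
    print("suspect line ranges:", changed_regions(last_valid, current))
```

A parser that only sees `current` has to guess where the mistake is; one that also sees `last_valid` can start from the edited lines.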

3 comments

ComputerGuru 2205 days ago

Have a look at this writeup by the author of lezer, if you haven't already: https://marijnhaverbeke.nl/blog/lezer.html


ltratt 2204 days ago

tree-sitter is excellent stuff! It's heavily inspired by Tim Wagner's PhD thesis (original site seems to be down, but https://web.archive.org/web/20150919164029/https://www.cs.be... works). IMHO more people should know about that work, and the sequence of work from Susan Graham's lab that led up to it. We have also been heavily inspired by Tim's work and Lukas's thesis extends and updates a number of aspects of that seminal work including, in Chapter 3, error recovery (https://diekmann.co.uk/diekmann_phd.pdf).

All that said, it's surprisingly difficult to compare error recovery in an online parser (i.e. one that's parsing as you type) to a batch parser. In the worst case (e.g. load a file with a syntax error in), online parsers have exactly the same problems as a batch parser; however, once they've built up sufficient context they have different, sometimes more powerful, options available to them (but they also need to be cautious about rewriting the tree too much as that baffles users).


estebank 2205 days ago

Approaching this from the opposite side, language designers should also take into account how code with slight mistakes (typos and confused use of features) could be detected. Sometimes adding small things to the grammar pays large dividends when writing the production-ready compiler. Alternatively, when adding a new feature they should be asking "what common mistakes will produce worse errors if we introduce this feature with this syntax?"


zffr 2205 days ago

I think in some cases they do. Python 3's parser, for example, can detect when people use the old print statement syntax.
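A quick way to see this for yourself, as a small sketch: `diagnose` is a hypothetical helper that compiles a snippet without running it and returns the parser's message. The exact wording of the message differs between Python 3 releases, so none is quoted here.

```python
def diagnose(source: str) -> str:
    """Compile `source` without executing it and return the parser's
    complaint, or "ok" if it is syntactically valid."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:
        return exc.msg
    return "ok"


old_style = diagnose('print "hello"')    # Python 2 print statement
new_style = diagnose('print("hello")')   # Python 3 function call

print(old_style)  # a targeted message about the missing parentheses
print(new_style)  # ok
```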


tux1968 2205 days ago

Which suggests that parsing should be done while editing, supporting many other refactoring tools as well. The key feature enabling this facility is incremental parsing.
