by falcolas 3254 days ago
Depending on your language, proper syntax highlighting (without parsing the entire program) is nearly impossible.

For example, in C, what highlight category do you give to '*c'? A declaration, a dereference, or a multiplication?
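To make the ambiguity concrete, here is a rough sketch in Python of a token-level classifier for the `*` in a C fragment. The names `tokens`, `classify_star`, and `BUILTIN_TYPES` are made up for illustration, and the rules are deliberately simplified (casts, for instance, are ignored). The point is only that `a * c;` cannot be classified without knowing whether `a` was declared as a type somewhere else in the program.

```python
import re

# Assumption: a tiny subset of C's built-in type names, for illustration only.
BUILTIN_TYPES = {"int", "char", "long", "short", "float", "double", "void"}

def tokens(src):
    """Split a C fragment into identifiers and single-character tokens."""
    return re.findall(r"[A-Za-z_]\w*|\S", src)

def classify_star(src, typedef_names):
    """Guess the role of the first '*' in src.

    typedef_names is the set of identifiers known to be types. Building
    that set is exactly the part that requires parsing the whole program.
    """
    toks = tokens(src)
    i = toks.index("*")
    prev = toks[i - 1] if i > 0 else None
    if prev is not None and (prev in typedef_names or prev in BUILTIN_TYPES):
        return "declaration"
    # A '*' that follows an operand (identifier, digit, ')' or ']') is binary.
    follows_operand = prev is not None and (
        re.match(r"[A-Za-z_]\w*|\d", prev) is not None or prev in (")", "]")
    )
    return "multiplication" if follows_operand else "dereference"

# Same text, different answer depending on declarations seen elsewhere.
print(classify_star("a * c;", {"a"}))   # 'a' is a typedef name
print(classify_star("a * c;", set()))   # 'a' is a variable
print(classify_star("x = *c;", set()))  # unary '*'
```

The first two calls pass identical source text and differ only in the typedef set, which is the information a line-by-line highlighter doesn't have.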

In Lisp, is the first element of a list a macro or a function (or a value)? If there's a reader macro, it gets even harder.

2 comments

Syntax highlighting doesn't necessarily have to work at the lowest granularity, though; for some uses, merely distinguishing between comments and non-comments is acceptable, and that's still 'syntax highlighting'. Of course, as you point out, true 100% syntax highlighting needs to fully parse the entire program; why not do that, though? I guess it would be too computationally expensive for certain sizes of program, but it could still update in near-realtime, no?

Syntax highlighting doesn't need to precisely classify every single character according to how the language would parse it. So with your `*c` example, I wouldn't actually expect a syntax highlighter to highlight that at all.

But every classification the highlighter does do must be accurate, or it's a buggy highlighter.

And FWIW, it's certainly possible to write a syntax highlighter that does parse the whole program. You'd normally find this in an IDE rather than a programmer's text editor. For example, writing Swift in Xcode, everything gets precisely highlighted, to the point where references to real types are highlighted whereas references to unknown types (e.g. typos) aren't. It's not practical to do this outside of IDEs, which is why most syntax highlighting only tries to highlight that which it can unambiguously determine.
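As a rough illustration of "only highlight what you can unambiguously determine", here is a minimal Python sketch of a highlighter for C-like source that classifies comments, string literals, and character literals and leaves everything else (including `*c`) unmarked. The names `TOKEN` and `highlight` are hypothetical, and it skips things a real highlighter would handle, such as preprocessor directives and line continuations inside `//` comments.

```python
import re

# Alternatives are tried at each position, so whichever construct starts
# first wins: a '//' inside a string stays part of the string, and a quote
# inside a comment stays part of the comment.
TOKEN = re.compile(r'''
    (?P<comment>//[^\n]*|/\*.*?\*/)
  | (?P<string>"(?:\\.|[^"\\\n])*")
  | (?P<char>'(?:\\.|[^'\\\n])+')
''', re.VERBOSE | re.DOTALL)

def highlight(src):
    """Return (start, end, kind) spans; unclassified text gets no span."""
    return [(m.start(), m.end(), m.lastgroup) for m in TOKEN.finditer(src)]

src = 'int *c = a * b; // "not a string"\nchar *s = "/* not a comment */";'
for start, end, kind in highlight(src):
    print(kind, repr(src[start:end]))
```

Every span it reports is correct under these assumptions, and it makes no claim at all about either `*` in the sample, which is the trade-off described above.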