Hacker News
by sergiogdr 848 days ago
How does TS compare to using semantic highlighting in LSP?
4 comments

A large number of LSP servers and related software use tree-sitter as a backend.
That's not possible, as Treesitter doesn't do, for example, type-checking (unification). It can be used as the parser that generates the syntax tree on which you're actually doing the work.
Agree, though not sure that's what the parent meant. I think they meant:

1. Some LSP implementations use Tree Sitter as the parser in their implementation

2. But it's only part of the implementation. It just generates the parse tree, on which various other LSP features are built, e.g. enhanced syntax highlighting, go to definition, ...

Tree Sitter is pretty good for (1) because it was built specifically to cope with code that is (a) changing rapidly and (b) by default, not valid. So it has good error reporting and recovery.

TS highlighting is slightly worse but a lot faster, and it's better at dealing with spelling/syntax errors. Also, the LSP can render inline error messages and line markers on top of TS highlighting, so not much information is lost by just using TS highlighting everywhere.
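
For comparison, here is a rough sketch of what an editor gets back from LSP semantic highlighting. As far as I recall from the LSP specification, the server replies with a flat integer array, five numbers per token (delta line, delta start character, length, token type index, modifier bitset), with positions encoded relative to the previous token. The function name `decode_semantic_tokens` and the legend below are made up for illustration; the sample data follows the example in the spec.

```python
def decode_semantic_tokens(data, token_types):
    """Turn the relative-encoded LSP token array into absolute positions.

    `token_types` is the legend the server announced (hypothetical here).
    Modifier bits are ignored to keep the sketch short.
    """
    tokens = []
    line = 0
    start = 0
    for i in range(0, len(data), 5):
        delta_line, delta_start, length, type_idx, _modifiers = data[i:i + 5]
        line += delta_line
        # The start offset is relative to the previous token only when
        # both tokens are on the same line.
        start = start + delta_start if delta_line == 0 else delta_start
        tokens.append((line, start, length, token_types[type_idx]))
    return tokens


legend = ["property", "type", "class"]  # assumed legend for this example
data = [2, 5, 3, 0, 3, 0, 5, 4, 1, 0, 3, 2, 7, 2, 0]
for token in decode_semantic_tokens(data, legend):
    print(token)
```

The point is that these token types come from the server's own analysis of the code, which is where the extra precision comes from, and also why it arrives later than a local tree-sitter parse.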
Having tried both with Neovim, I ended up going with just LSPs instead of using tree-sitter.
With Treesitter you have another parser which is redundant at best and inconsistent with the LSP at worst.
Not sure I understand your point.

LSP is a protocol and tree-sitter is a parser generator. They're kind of orthogonal concepts; a tree-sitter parser couldn't ever be used directly in place of an LSP server, but an LSP server may well make use of tree-sitter as a first step for extracting information from the code and keeping it in sync. If it doesn't it'll have to come up with some other way of parsing the code in any case, so I don't see how it could be said to be redundant or inconsistent.
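
To make the "LSP is a protocol" part concrete, here is a minimal sketch of what a client actually puts on the wire: a JSON-RPC message behind a `Content-Length` header. The helper name `frame_lsp_message` and the file URI are made up for illustration; `textDocument/semanticTokens/full` is the request for semantic highlighting as I understand the spec. Nothing here says how the server parses the file, and that is the point.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Wrap a JSON-RPC payload in the LSP base-protocol framing."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    # Content-Length counts the bytes of the body, not characters.
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/semanticTokens/full",
    # Hypothetical document URI, only for the example.
    "params": {"textDocument": {"uri": "file:///tmp/example.py"}},
}

framed = frame_lsp_message(request)
print(framed.decode("utf-8"))
```

Whether the server answers this using tree-sitter, the compiler's own front end, or something else entirely is invisible to the client.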

Of course, tree-sitter's thing is how universal it is. There are parsers for tons of languages, and you can work with them all using the same API, though you're on your own for attributing semantic meaning. Most popular languages have language-specific tools (e.g. `libcst`) which are usually more powerful for that specific language, so they'd probably be better starting points for building a language-specific LSP server, which I imagine is the common case.

> Not sure I understand your point.

The problem is using Treesitter (for syntax highlighting and "semantic movements") and an LSP at the same time. So if your language has an LSP, using Treesitter additionally is redundant at best and introduces inconsistency at worst.

I'm not talking about using Treesitter as the parser for the LSP.

> Most popular languages have language-specific tools

I'd say even less popular languages like Coq^H^H^HRocq, Lean 4, Koka, Idris, Unison, ... have their "own" tools. I do not know of a language that uses a Treesitter parser in its LSP, but I do know about tools like https://semgrep.dev/ (written in OCaml) and GitHub's code search which use Treesitter.

You're forgetting that treesitter is much, much faster than LSPs.