by teacup50 3264 days ago
Libraries solve the problem of said middle end in a compiler, and in an IDE.

If you insist on using IPC, then you've incurred a great deal of friction in a place that it matters.

If you insist on using JSON IPC, then you've incurred a great deal of overhead in a place that it matters.


This times 1000. AN API IS ALREADY A PROTOCOL but it doesn't require external processes, serialization, extra failure modes, etc, etc.

The fact that the HN community seems to have jumped aboard this idea, "yeah let's just require a server to do something simple like format text in your editor", is completely flabbergasting. People just seem to have NO IDEA how much complexity they are adding, and don't care.

Maybe in 5 years our machines will be running 10,000 processes at boot because people will want a server for every operation...

Do you know how these IDE features are currently implemented in editors like vim? Unless there is built-in support (e.g. ctags), most plug-ins that provide language-specific features do so by running an external tool, sometimes going so far as scraping the compiler output.

This means that on every single change, a new heavyweight process is created, communication happens over unspecified textual formats, and everything is likely to break with the next update because there is no stable interface.

JSON IPC with a continuously-running process using a well-specified protocol is a huge step up in comparison.
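For anyone who hasn't looked at what that "well-specified protocol" amounts to on the wire, here is a minimal sketch of the framing LSP uses: a `Content-Length` header, a blank line, then a JSON-RPC body. The helper names `encode_message` and `decode_message` and the file URI are made up for illustration, and a real client would also need to handle partial reads from the pipe.

```python
import json
from typing import Tuple


def encode_message(payload: dict) -> bytes:
    """Frame a JSON-RPC payload with a Content-Length header."""
    body = json.dumps(payload, separators=(",", ":")).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


def decode_message(data: bytes) -> Tuple[dict, bytes]:
    """Parse one framed message; return it plus any leftover bytes."""
    header, sep, rest = data.partition(b"\r\n\r\n")
    if not sep:
        raise ValueError("incomplete header")
    length = None
    for line in header.split(b"\r\n"):
        name, _, value = line.partition(b":")
        if name.strip().lower() == b"content-length":
            length = int(value.strip())
    if length is None:
        raise ValueError("missing Content-Length header")
    if len(rest) < length:
        raise ValueError("incomplete body")
    return json.loads(rest[:length].decode("utf-8")), rest[length:]


# Hypothetical formatting request; the URI is a placeholder.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/formatting",
    "params": {
        "textDocument": {"uri": "file:///tmp/example.rs"},
        "options": {"tabSize": 4, "insertSpaces": True},
    },
}
wire = encode_message(request)
```

The length is counted in bytes of the encoded body, not characters, which is why the sketch encodes to UTF-8 before measuring.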

Just because the current way it's done is terrible, doesn't mean you should "upgrade" to a bad way instead of a good way.
What would a good way be? Why is having the code completion and parsing in a self-contained library a bad way? Or are you just opposed to using TCP instead of an ABI?
A separate server also has some advantages: (1) it's not going to crash your editor, and (2) it opens up the possibility of keeping the language server running outside of the editor and available to other tools.

Yes, serialization and communication have a lot of extra cost compared to a function call, but consider that (1) the request rate is limited by user typing speed, around 10 requests per second at most, which is low enough that it won't matter, especially compared to running a type checker; and (2) every call in LSP can take a very long time to complete, so you couldn't turn this protocol into blocking API calls made from the UI loop anyway.

Even with a plugin you would need to run it in a separate thread (or possibly threads) and cancel requests after a timeout.
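To make the thread-plus-timeout point concrete, here is a rough sketch using Python's standard `concurrent.futures`. The function `slow_completion` is a stand-in for whatever an in-process analysis library would expose, and `complete_with_timeout` is a made-up name. A real editor would attach a callback instead of blocking on the result, but the timeout handling looks about the same.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def slow_completion(prefix: str, delay: float) -> list:
    """Hypothetical analysis call; the sleep simulates type checking."""
    time.sleep(delay)
    return [prefix + suffix for suffix in ("_one", "_two")]


executor = ThreadPoolExecutor(max_workers=2)


def complete_with_timeout(prefix: str, delay: float, timeout: float):
    """Run the analysis off the calling thread and give up after `timeout` seconds."""
    future = executor.submit(slow_completion, prefix, delay)
    try:
        return future.result(timeout=timeout)
    except FutureTimeout:
        # cancel() only stops work that has not started yet; a call that is
        # already running keeps going in the background until it returns.
        future.cancel()
        return None
```

Note the comment on `cancel()`: a thread inside your own process can't be forcibly stopped, so a stuck analysis keeps burning CPU. That is one of the things a separate process gives you for free.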

That said, LSP is a terrible protocol. They chose to represent all offsets in terms of UTF-16 (!) code units, which is truly absurd since most editors won't be reading UTF-16 files nor will they be representing them internally as UTF-16.
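To show what that costs an editor that stores its buffers as UTF-8, here is a small sketch of the conversion it has to do for every position it receives. The function name `utf16_offset_to_utf8` is made up; the only assumption is that the offset counts UTF-16 code units within a single line.

```python
def utf16_offset_to_utf8(line: str, utf16_units: int) -> int:
    """Convert an offset in UTF-16 code units to a UTF-8 byte offset."""
    units = 0
    byte_offset = 0
    for ch in line:
        if units >= utf16_units:
            break
        # Code points above U+FFFF take a surrogate pair (two units) in UTF-16.
        units += 2 if ord(ch) > 0xFFFF else 1
        byte_offset += len(ch.encode("utf-8"))
    if units != utf16_units:
        # The offset is past the end of the line or splits a surrogate pair.
        raise ValueError(f"invalid UTF-16 offset: {utf16_units}")
    return byte_offset


# In "a😀b" the letter "b" sits at UTF-16 offset 3 but UTF-8 byte offset 5.
print(utf16_offset_to_utf8("a😀b", 3))  # prints 5
```

The conversion is linear in the line length, and it has to run in both directions: once for every position the server sends back and once for every position the editor sends out.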

You can also keep state, caching, and the like inside the server, so the client doesn't have to think about those either.
> They chose to represent all offset in terms of UTF-16

Wow, yeah, that's pretty terrible.

I imagine JSON serialisation overhead is pretty small when compared to parsing/typechecking a Rust program, which is probably what the Rust Language Server has to do whenever anything changes.
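If anyone wants a number rather than a guess, something like the following would measure it. This is only a sketch: the payload is a made-up completion response with 500 items, the helper names are hypothetical, and the result depends entirely on your machine and JSON library, so no figure is claimed here.

```python
import json
import time


def make_completion_response(n_items: int) -> dict:
    """Build a made-up completion response with n_items entries."""
    items = [
        {"label": f"item_{i}", "kind": 3, "detail": "fn() -> u32"}
        for i in range(n_items)
    ]
    return {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"isIncomplete": False, "items": items},
    }


def time_round_trip(payload: dict, repeats: int = 200):
    """Return average seconds per serialize+parse cycle and the last decoded copy."""
    start = time.perf_counter()
    for _ in range(repeats):
        decoded = json.loads(json.dumps(payload))
    elapsed = time.perf_counter() - start
    return elapsed / repeats, decoded


payload = make_completion_response(500)
seconds, decoded = time_round_trip(payload)
print(f"average round trip: {seconds * 1e6:.0f} microseconds")
```

Compare whatever it prints against how long your compiler takes to type check a file after an edit.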

Not to mention that a lot of people capable of writing the tooling would struggle to export a C API. I write Scala in my day job and it would take me a while to learn how to do that - and I've done some programming in C/C++ before.

I'm a little confused as to what you propose instead.

Suppose I have vim, and the Rust compiler. I want to add RLS level of support to vim. I download some vimscript plugin, and what? Do you distribute the rust language server as a compiled plugin that you add to the address space of the editor at runtime? And if there's a bug, and it segfaults, then it takes down my entire VIM process?

It seems like there's some complexity in directly calling the code with an API too. It's actually not too bad to just open a pipe and communicate.

Maybe I'm missing something, but wrangling compiled plugins seems like it'd be a bad time.

While I do love vim, its own high level of internal implementation brokenness doesn't really have much bearing on how one implements this sort of thing in a real multi-language IDE.

And as for your question, the answer is ... yes. Sure. Why not? That's what we already do in nearly all IDEs.

I mean, if you want to build a universal system, leaving out vim and emacs is just shooting yourself in the foot.

Are there any plugins which are binary compatible between more than one IDE? That seems hard.

> I mean, if you want to build a universal system, leaving out vim and emacs is just shooting yourself in the foot.

Said no IDE user, ever.

> Are there any plugins which are binary compatible between more than one IDE? That seems hard.

No. And who cares?

An API that can be accessed from heterogeneous languages will involve IPC.

Particularly since the best API will use the compiler's symbol tables (avoiding implementing syntactic and semantic analysis twice, buggily), and compiler implementation languages are even more diverse than editor implementation languages.

> An API that can be accessed from heterogeneous languages will involve IPC.

No. If your language cannot call into a dynamic library using a well-defined C ABI for your platform, then it is already failing to speak a standard protocol. Building all kinds of crazy, complicated, slow infrastructure in order to get it to successfully speak some other protocol, is a symptom of modern-day clueless programming.
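For reference, this is roughly what calling through the C ABI looks like from a scripting language, here Python's standard `ctypes` calling `strlen` from libc. It is a sketch that assumes a POSIX system, where `ctypes.CDLL(None)` gives a handle to the running process and the libc symbols it already has loaded. The wrapper name `c_strlen` is made up.

```python
import ctypes

# POSIX assumption: passing None opens the current process, which exposes libc.
libc = ctypes.CDLL(None)

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t


def c_strlen(text: str) -> int:
    """Marshal a Python str to a C string and call strlen on it."""
    return libc.strlen(text.encode("utf-8"))


print(c_strlen("hello"))  # prints 5
```

The signature declaration is done by hand here. Nothing checks it against the actual library, so a wrong `argtypes` silently corrupts the call instead of failing.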

> Particularly since the best API will use the compiler's symbol tables (avoiding implementing syntactic and semantic analysis twice, buggily)

Yes, this is of course a good idea. Why one presumes this requires a separate running process, I have no idea.

> No. If your language cannot call into a dynamic library using a well-defined C ABI for your platform, then it is already failing to speak a standard protocol.

This also involves a marshalling cost at the ABI boundary, which may be lower overhead than parsing JSON, but is significantly more brittle. And it's less ergonomic for many plugin/editor authors. And it can't be spec'd with a schema that isn't just "read the headers."

> This also involves a marshalling cost at the ABI boundary

Only for some languages, and that cost should be far far less than running a separate process, shipping json over pipes, and parsing the json

> And it's less ergonomic for many plugin/editor authors

I think the fact that many modern programmers find this more ergonomic than a C ABI is part of what he is complaining about. Let's get comfortable with what is good, rather than make what we are comfortable with?

Agreed, that pipes+json is much higher overhead, just noting that this proposed approach still isn't free.

> Let's get comfortable with what is good, rather than make what we are comfortable with?

I agree in the abstract! I just emphatically disagree with characterizing a C ABI as "what is good."

I don't understand what you are saying here.

Why is it "significantly more brittle"? It is a well-specified interface. It is less brittle than talking over a socket because the kinds of points of failure involved with sockets don't exist in this case.

> And it can't be spec'd with a schema that isn't just "read the headers."

What does that even mean? It's a protocol just like any protocol, except you get the added benefit that for many languages it can be typechecked. Why are you claiming it can't be specified or that someone has to "read the headers"? What headers?

From your endorsement of "using the compiler's symbol tables" (paraphrasing) I took you to mean that you're proposing binding directly to GCC (or another tool) as a library, relying on its internal data structures as this C API. Based on this comment, it sounds like you're now suggesting that this API should still be standardized and require translating from the compiler's internals into some standardized AST/symbol format anyways. I still think the latter is bonkers for several reasons (SIGABRT being one), but it's significantly less bonkers than what I had thought you were proposing initially.
Not only that, but it means if the library crashes, your editor process dies. That hardly seems better than sending some text over a socket. At least if the external process crashes, your editor can just restart it.
Several people have said this. Look ... a "crash" in a modern operating system is a recoverable exception.
A library API is bound to a specific language/runtime. But every language out there can speak JSON. Language servers are mostly written in the language they are for, because that language already has the compiler APIs. The editor is often written in a different language.
> extra failure modes

Well, different failure modes, maybe. If an external process crashes, then you just have your editor restart it. But if you've linked a library into your editor, and it crashes, then your editor crashes.

I much prefer either keeping that code in a separate process, or having that code written in a memory safe language, where it won't take down your editor when something goes wrong.
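Here is a sketch of the "just have your editor restart it" part. The worker is a toy child process that upper-cases lines and hard-aborts when it receives the word `crash`; the `Supervisor` class and its method names are made up for illustration. The point is only that the parent sees an end-of-file on the pipe instead of dying along with the worker.

```python
import subprocess
import sys

# Toy worker: echoes lines in upper case, aborts hard on the word "crash".
WORKER = r'''
import os, sys
for line in sys.stdin:
    if line.strip() == "crash":
        os.abort()  # simulate a segfault-style crash
    print(line.strip().upper(), flush=True)
'''


class Supervisor:
    """Owns one worker process and respawns it when it dies."""

    def __init__(self):
        self.restarts = 0
        self._spawn()

    def _spawn(self):
        self.proc = subprocess.Popen(
            [sys.executable, "-c", WORKER],
            stdin=subprocess.PIPE,
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            text=True,
        )

    def request(self, text: str):
        """Send one line; return the reply, or None if the worker died."""
        try:
            self.proc.stdin.write(text + "\n")
            self.proc.stdin.flush()
            reply = self.proc.stdout.readline()
        except OSError:  # includes BrokenPipeError
            reply = ""
        if reply == "":  # end of file: the worker is gone
            self.proc.wait()
            self.restarts += 1
            self._spawn()
            return None
        return reply.strip()

    def close(self):
        try:
            self.proc.stdin.close()
        except OSError:
            pass
        self.proc.wait()
```

Usage would look like `sup = Supervisor()`, then `sup.request("hello")` returns `"HELLO"`, `sup.request("crash")` returns `None` and bumps `sup.restarts`, and the next request is answered by a fresh worker. Whatever state the old worker had cached is lost, of course, so the editor has to resend open documents after a restart.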

Incremental recompilation isn't fast enough to wait for between keystrokes, so in-process servers would run in their own thread. Along with accounting for arbitrarily incompatible language runtimes and memory management schemes, wouldn't we be looking at badly re-implementing half of a process-and-ipc infrastructure here, just without memory protection?

Agreed on JSON, though.

What library supports C, C++, Lisp, Java, JS and C# callers?
I think any C API should be targetable by all of those.
The JetBrains IDE libraries?

Those are separate libraries with support for C, C++, Clojure, Java, Kotlin, C#, JS, PHP, Python, and more.

All of them exposing the entire AST and the entire environment, which is far more than LSP ever did.

The question was not which library could present an AST for all those languages, but which kind of library format can be consumed from all those languages, ideally without FFI and without writing complex wrappers. C libraries are not very convenient to use from C#, the JVM, or JS (node.js), and sometimes it isn't even possible (e.g. for JS in the browser).

The question is important, because otherwise one would constrain editors to be written only in a language which is compatible to the library format.

Using C libraries from C# or the JVM is very convenient, even today. There are even automated systems to generate the entire bindings for the JVM; I’ve written bindings myself for a few libraries. You can just generate the interface file for Java from the .h with JNAerator, import it, and you’re done.
That doesn't solve the problem of a segfault in the C library crashing the entire JVM.
It does. Because there are also libraries to automatically spawn a separate JVM and communicate with that via an IPC system. Or even spawn other things.

But if you want a system where I have to transfer gigabytes via a JSON IPC bus every minute, sure. That’s totally not going to destroy performance in projects that are several million lines of code long with major auto-generated assets.

The language server protocol is useless for larger interwoven projects. The same issue already appears with JetBrains Rider (a C# IDE where the C# parser is implemented in a separate process).