Hacker News new | ask | show | jobs
by eqvinox 92 days ago
This doesn't really have anything to do with async signal safety. You could perfectly fine capture the Ctrl+C in psql, stick a notifier byte in a pipe (or use signalfd to begin with) and handle it as a synchronous event in a main loop. You'd still need to establish a new connection purely to bypass buffered data. (or use TCP URG, but that seems generally a poor idea.)
3 comments

Well, it does, because -- as the article notes -- psql creates and sends on the new connection inside the signal handler, and that doing the pipe-write thing instead (required since their TLS library is presumably not async signal safe) would require a major refactor of the code.

Likely psql doesn't even have a "main loop"; I expect it just blocks on recv() until it gets a response from the server. And on Linux, I think it will automatically restart/resume syscalls that were in progress when a signal fires, so you can't even rely on EINTR to get you out of that recv() so you could check a global flag that you could set in the signal handler.

Although, reading the sigaction() manpage, if you don't specify SA_RESTART, it shouldn't do this? (If they are using signal() and not sigaction(), it might always restart?) But still, not sure why they don't take that route. I imagine it would require much less of a refactor to set a global flag, and then always check it after a recv() fails with EINTR.

Sure, the "right" thing to do is have a global pipe, and instead of blocking in recv(), poll() on it with both the connection socket and the read end of the pipe. And I bet that would require a bit of a refactor. But a global flag is somewhere in the middle...

But who knows; I've never read their source code, so I expect they know what they're talking about when they say it's not a trivial fix.

> This doesn't really have anything to do with async signal safety.

TLS not being async signal safe is explicitly called out on the article as the reason the token is sent in clear text.

> Handle it as a synchronous event in a main loop

Of course of you rearchitect the client there are better solutions. But again, the article mentions that's not planned for now.

By comparison, delegating cancellation to a background background thread can be done non-intrusively. In principe no code outside the cancel path need changing.

Edit: the article mentions that there is a refactor in the works to implement cancel over tls [1]. Turns out that they decided to use a thread (with a pipe for signaling).

[1] https://www.postgresql.org/message-id/flat/DEY0N7FS8NCU.1F7Q...

> By comparison, delegating cancellation to a background background thread can be done non-intrusively. In principe no code outside the cancel path need changing.

pthread_create() isn't async signal safe, though, so they can't simply move their socket code for the cancellation into another function and call pthread_create() on it. They still have to get the main thread to stop doing what its doing (usually via the pipe trick) in order to create the thread, which could easily be a big refactor.

> Edit: the article mentions that there is a refactor in the works to implement cancel over tls [1]. Turns out that they decided to use a thread (with a pipe for signaling).

Seems odd to me to bother. If you have to do the pipe thing, why not just do the new connection for cancellation in the main thread once it sees the data on the pipe? I guess that way they can return control of the CLI to the user while they cancel in the background, rather than blocking the user while the cancellation is going on. But as a user, I kinda would like to know that the query I just cancelled actually got cancelled, a property that the old code has, but the new code won't.

(Presumably the new code can print a warning if cancellation fails, but it could take a long time to fail, and in the meantime the user has moved on.)

Of course you don't spawn a thread from the signal handler. You start it first thing in main and park it waiting for a wakeup.
Ah duh, hadn't thought of that. Same pipe trick, but with another thread blocking on the read end.

That is a much simpler change than refactoring the main thread to poll on several FDs instead of just blocking in recv().

I believe the suggestion is to have a TLS endpoint in the server, which demultiplexes the incoming CancelRequest and signals to the corresponding worker process via shared memory
The problem isn't on the server; the server already knows how to cancel things, and already supports cancellation over TLS. It's just that psql doesn't use it, due to the need for a refactor to make that work. Other psql-like frontends do already use it, as the article points out.
ah, true, thanks! (unfortunately I can't delete/edit the comment.)