by erlkonig 1098 days ago
[I wrote this mostly imagining the idea was about converting the entire PostgreSQL service to a single monolithic process. I'm not a fan so far. If it is actually about coalescing like processes down to a single multithreaded process, that's more reasonable but still comes at a future cost - and whether it is worthwhile is still a question]

Converting code into multithreaded code tends to make it harder to test and debug FOREVER, and it is more limited by default for certain system resources than a multi-process solution. Viewing and managing threads from the outside is harder, and killing a rogue thread is much more likely to crash an MT solution than killing a process in a typical resilient MP solution. Above all else, I need a database to be utterly reliable (or as close to it as possible) - including being able to back off in a mature fashion in cases of memory exhaustion (I have overcommit disabled to restore classical memory handling, i.e. malloc() can fail) and file system exhaustion. MT throws a wrench through most of the workings of a complex program, and unless some specific gain can be identified that compensates for adding complexity and fragility to virtually any change going forward, then... um... why? I read a bit of "well, the other guys are doing it" handwaving:

    "Other large projects have gone through this transition.    
    It's not easy, but it's a lot easier now than it was
    10 years ago. The platform and compiler support is
    there now, all libraries have thread-safe 
    interfaces, etc."
But that isn't a functional gain. And:

    "I don't expect you or others to buy into any
    particular code change at this point, or to
    contribute time into it. Just to accept that it's a
    worthwhile goal. If the implementation turns out to
    be a disaster, then it won't be accepted, of course.
    But I'm optimistic."
But this is NOT a worthwhile goal. Fun, perhaps. Diverting or challenging, perhaps. A disaster, quite possibly. But without identifying a goal that can only be achieved by walking into the multithreading pit, the project is a waste of time for end users. Possibly a growth experience for the experimenters, regardless of whether it succeeds.

There are significant costs to this but maybe some real benefits too. This post certainly doesn't sell the potential benefits well enough.

The big functional gain would be better connection handling. The current process-per-connection model has overhead and it's pretty common to see large database instances with double-digit max connection limits. Because connections are expensive and in (artificially) limited supply, application developers work around the limitations with connection pooling and/or proxy services.
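For anyone who hasn't had to build one, the pooling workaround looks roughly like this. It is only a sketch: `FakeConnection` and `Pool` are hypothetical stand-ins I made up for illustration, not a real driver or pgbouncer. The point is that a small, fixed set of expensive connections gets reused by many requests.

```python
import queue
from contextlib import contextmanager


class FakeConnection:
    """Hypothetical stand-in for an expensive database connection."""
    opened = 0  # counts how many connections were ever created

    def __init__(self):
        FakeConnection.opened += 1
        self.id = FakeConnection.opened

    def execute(self, sql):
        return f"conn {self.id}: {sql}"


class Pool:
    """Hands out a fixed number of connections and takes them back."""

    def __init__(self, size):
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(FakeConnection())

    @contextmanager
    def connection(self, timeout=1.0):
        # Blocks (up to timeout) when every connection is in use.
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)  # return it for the next caller


pool = Pool(size=2)
results = []
for i in range(10):
    with pool.connection() as conn:
        results.append(conn.execute(f"SELECT {i}"))
```

Ten queries run, but only two connections are ever opened. That is the bookkeeping applications take on when connections are scarce.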

Theoretically, a multi-threaded postgres could easily deal with thousands of concurrent connections - not just a performance improvement but a game changer in terms of application developer UX. When connections are cheap, the application just connects when it needs to communicate, no pgbouncer or connection pools needed.

I have no idea if the multi-threading proposal here is viable, but if it can make connections easier to manage it might be worth it.

Connection scalability was greatly improved in v14 and above.

https://www.citusdata.com/blog/2020/10/25/improving-postgres...

This feels a bit like you are using an image of absolute safety to hold hostages, not allowing the potential for change & improvement.

The author starts by citing a decent variety of sources that have already expressed interest here and see this as progress.

Migrating a bunch of per-process global variables to have scope (per thread or per session) may be risky, but gee, it just sounds like vaguely reasonable architecture to have these days to me.
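To make the globals point concrete, here is a small Python analogy (not PostgreSQL's C code, and all names are made up for illustration). A plain module-level global is shared by every thread in the process, while a `threading.local()` object gives each thread its own copy, which is roughly the kind of scoping the migration would need.

```python
import threading

# Process-wide global: safe when every session is its own process,
# but shared by all threads once sessions become threads.
current_user = None

# Thread-scoped replacement: each thread sees only its own attributes.
session = threading.local()


def handle(name, barrier, seen_global, seen_local):
    global current_user
    current_user = name      # every thread overwrites the same variable
    session.user = name      # each thread writes to its own slot
    barrier.wait()           # wait until all threads have written
    seen_global[name] = current_user
    seen_local[name] = session.user


def run(names):
    barrier = threading.Barrier(len(names))
    seen_global, seen_local = {}, {}
    threads = [
        threading.Thread(target=handle, args=(n, barrier, seen_global, seen_local))
        for n in names
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen_global, seen_local


names = ["alice", "bob", "carol"]
seen_global, seen_local = run(names)
```

After the run, every thread reports the same value for the shared global (whichever thread wrote last), while the thread-local value is still each thread's own name.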

You can usually find developers interested in any fashionable approach to a problem. Change is fine. What improvement, specifically, though? Adding multithreading is not a functional improvement in and of itself, but more the opposite. MT should be used when a specific, important functional gain can be realized through no other approach.

I'm not trying to win an argument or anything here, I'm just highlighting from my and others' experiences that multithreading is a tradeoff not to be made casually. It makes some things faster, especially if not I/O bound, but it also increases dev and debug cost, and reduces the number of developers who can assist. That downside tends to be permanent.

That's fair. It does seem like no one else on the planet still uses a multi-process architecture, and that the performance has never been there.

This was the famous evolution of Apache httpd: v1 was a forking multi-process model, and v2 gained a new pluggable strategy system, including thread-pooled models, for great scalability wins. https://httpd.apache.org/docs/2.4/mod/worker.html

Context switching between processes is just such a taxing thing to do. So many caches to reset. Especially with all the mitigations most folks run, it's such a drain.

However, that Apache situation, multithreading *like* things together (request handling), is a more reasonable act than say, turning all of PostgreSQL into a monolithic process. PostgreSQL is a much more heterogeneous system than Apache, with potentially more interesting ways to lock up than Apache with its rather simple overall mission.

Sure it sounds interesting to try a branch of pg with, for example, just the sessions being multithreaded - but then how DOES one forcibly stomp on some session that has grabbed some critical lock without crashing other users' sessions? Killing off a session's thread inside of a MT'ed session handler without putting any other threads at risk would be the first problem (and an admin is likely to use "ps -Lef" to find the thread ID and then "kill"). Many MT programs I see lose their little minds if a thread is killed from outside.

Going too crazy with threads can also cause performance issues, since there is overhead - just less than for processes - around thread creation/switching/etc, and is why thread pools are common. There's a short article about this at:

    https://stackoverflow.com/questions/5961536/what-is-best-a-single-threaded-or-a-multi-threaded-server/5964238#5964238
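
As a rough illustration of the thread pool idea, here is a sketch using Python's standard `concurrent.futures`. The `handle_request` and `serve` functions are hypothetical and the sleep stands in for real work. A fixed set of worker threads is created once and reused, rather than spawning a thread per request.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def handle_request(i):
    """Pretend to do some work and report which worker thread ran it."""
    time.sleep(0.01)
    return i, threading.current_thread().name


def serve(num_requests, pool_size):
    # Worker threads are reused across requests instead of being
    # created and destroyed for each one.
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        return list(pool.map(handle_request, range(num_requests)))


results = serve(num_requests=50, pool_size=4)
workers = {name for _, name in results}
```

Fifty requests get handled, but `workers` never holds more than four distinct thread names.
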
There's some theory that multithreading to handle a bunch of fds and using poll / nonblocking I/O in a single-threaded solution are equivalent at some level in computing science, but skill sets tend to matter more in practice.
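
For comparison, the single-threaded poll / nonblocking style looks something like the sketch below. It uses Python's standard `selectors` module with `socket.socketpair()` standing in for real client connections. The function name and the messages are made up for illustration.

```python
import selectors
import socket


def read_all(num_clients):
    """Service several sockets from one thread using readiness events."""
    sel = selectors.DefaultSelector()
    clients = []
    for i in range(num_clients):
        client, server = socket.socketpair()
        server.setblocking(False)
        # Attach the client number so we know whose data arrived.
        sel.register(server, selectors.EVENT_READ, data=i)
        client.sendall(f"hello {i}".encode())
        clients.append(client)

    received = {}
    while len(received) < num_clients:
        events = sel.select(timeout=1.0)
        if not events:
            break  # nothing became readable in time; give up
        for key, _ in events:
            received[key.data] = key.fileobj.recv(1024).decode()
            sel.unregister(key.fileobj)
            key.fileobj.close()

    for c in clients:
        c.close()
    sel.close()
    return received


received = read_all(5)
```

One thread, no locks, and each fd is handled only when the kernel reports it readable. Whether that or a thread per fd is easier to get right mostly depends on who is writing it.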

This is a pretty good page on the options in general, though dated (anyone already know of a newer equivalent to it?):

    http://www.kegel.com/c10k.html
I feel sure work has been put into making kernel support for both MT *and* the poll and nonblocking I/O models more efficient since then. :-)
> Sure it sounds interesting to try a branch of pg with, for example, just the sessions being multithreaded - but then how DOES one forcibly stomp on some session that has grabbed some critical lock without crashing other users' sessions?

Presumably as one does now, through pg_terminate_backend()/pg_cancel_backend().

> Killing off a session's thread inside of a MT'ed session handler without putting any other threads at risk would be the first problem (and an admin is likely to use "ps -Lef" to find the thread ID and then "kill")

ps+kill already puts all of Postgres' processes at risk in Postgres' MP system, because processes that unexpectedly exit may have corrupted shared state, so in those situations PG restarts. MT would not significantly change that.

> Going too crazy with threads can also cause performance issues, since there is overhead - just less than for processes - around thread creation/switching/etc, and is why thread pools are common.

(emphasis mine)

Considering that PostgreSQL currently is a multi-process architecture, surely replacing the Process primitive with the Thread primitive will reduce the overhead of connection backends, all else being equal.