Hacker News new | ask | show | jobs
by fulafel 1317 days ago
This is written in C. I wonder how common it is to write PG extensions in safer langugaes and what would be the most suitable.

I'm somewhat wary of using nontrivial C extensions, having seen so many of them sometimes seg fault the backend (eg PostGIS). There seem to be PG backend crashes described in this projects issues as well.

4 comments

PG has a special memory manage rule, named MemoryContext. All memory allocated in a context will disappear when it leaves that context. this means that you can safely not free memory, or your memory will be freed in unexpected places. this is a big conflict with the way rust manages memory. write extension in rust won't improve it much.

And in PG, there is a special method to create a process, creating threads is not possible because the logging system makes heavy use of setjmp().

> creating threads is not possible because the logging system makes heavy use of setjmp().

Naive question from a non-c user, setjmp/longjmp just manipulate the stack and since each thread has its own execution stack, that should be completely safe ISTM - so why is it unsafe/impossible? I'm missing something.

The general issue is that it is not designed for multi threaded environment and there is bunch of global and frequently accessed state. One could probably comb through the entire codebase and make all that state thread-local, but the benefit probably does not outweight the amount of work.
There's https://github.com/tcdi/pgx for writing extensions in rust.
Postgres itself is written in C. I suppose its every internal interface is in C. I wonder how many unsafe sections would an extension written in Rust have to have to use these interfaces.

I wish something like Lua + LuaJIT could be used to write such extensions; at least it's memory-safe. OTOH mapping these C interfaces to Lua structures, and making them work with GC may happen to be non-trivial.

PG ships with Lua support: https://www.postgresql.org/docs/current/external-pl.html

(Also Python, Javascript, and Java)

I don't know specifics about the API coverage. It seems this extension mostly just implements new SQL visible functions and data types, which should be doable from those languages as well. Composite types might have to be defined as PG records (or json) instead of C level new PG object types.

You can write stored procedures in it. Which is fine.

You cannot though write a new storage engine, a new kind of index, or something else that takes an extension.

I think very few PG extension use cases need new storage engines. Looking at https://age.incubator.apache.org/overview/ it says it's using the normal PG storage.

Regarding indexes - (1) you can implement most custom indexe scenarios based on supplied expressions involving custom functions: https://www.postgresql.org/docs/current/indexes-expressional... (2) I suspect for the graph use case you could leverage json indexes: https://www.postgresql.org/docs/current/datatype-json.html#J...

This is not to claim with any high degree of certainty that this could be implemented in PG/Lua, just from my armchair there doesn't seem anything immediately obvious that would prevent it.

For example the querying in AGE is syntactically implemented in the function cypher() that is used in the examples to receiving custom syntax as strings with the "dollar quoting"[1] syntax:

  SELECT *
   FROM cypher('graph_name', $$
     MATCH (n)
     WHERE exists(n.surname)
     RETURN n.first_name, n.last_name $$) as (first_name agtype, last_name agtype);
[1] https://www.postgresql.org/docs/current/sql-syntax-lexical.h...
Timescale wrote their Promscale extension for Postgres in Rust. So it exists.