Hacker News new | ask | show | jobs
by KrugerDunnings 1174 days ago
There is just so much, I've thought of writing some blog post about it but there is already a lot of content out there because pg is a big community with a lot of people doing interesting things and I don't know if what I know is unique and not just parroting others peoples stuff.

But ok sure I can give you some pointers to cool stuff. Sometimes it are just architectural patterns that are easy to do like upsert statements in sql using `on conflict`, or writing a task queue on the cheap using `for update skip locked limit 1`. Replace 90% of your crud api controllers with `postgrest` and use row level security for everything, AI is not going to steal your job category theory is. From a operations and scaling perspective it pays dividend to learn about `explain analyse` but did you know they also have `stats` that can help the analyser optimise queries based on statistical relations of different columns. I've also been looking into how to "branch" a database instead if just doing a backup, it does require some ZFS tricks but that is also just pure power and not something you find with a cloud file system. There are also just a ton of extension for timeseries or vector embeddings, or write your own in Rust using `pgx`. Like I said way too much to write here, I don't use all of this in production but I just keep finding these gems while working on my own project. There is a strong OSS ecosystem with lots of teams giving there own spin on pg and that is welcomed if you have a opinion of your own and still want to learn from others. You need motivation to go this deep but in contrast to other esoteric knowledge I have mastered companies are also willing to pay for it.

4 comments

for what it's worth, what you describe is the architecture philosophy of supabase (disclosure: i'm the ceo)

supabase is essentially a Postgres database with PostgREST on top, and we recommend pushing down a lot of the logic and security into the database. We took this philosophy with our pg_graphql extension (which uses pgx) and it is faster than other graphql implementations, simply it's co-located with your data, solving the n+1 problem.

pl_rust just reached 1.0, and it is now a "trusted language" so you can expect to see it arriving on a few cloud providers soon. We are releasing something this week with the RDS team which will make it easier to write key parts of your application code in trusted languages. There are certainly trade-offs, and I don't know if _everything_ should be in the database. But in data-intensive cases it makes a lot of sense.

What's the security model in PostgREST? I'm imagining it is called from your backend as a convenience vs. having a database connection library, so not typically exposed to public users of a website?
It's usually exposed to public users. The security model is mostly based on two things:

- JWT is used to authenticate API requests. The JWT contains a `role` claim which is a PostgreSQL role that is then used for the duration of the request. This role is subject to regular PostgreSQL security, be it table, column or row-level security[1].

- You expose a subset of your database objects for your API schema. This schema consists of views and functions(or only functions) to hide internal details to API users[2].

[1]: https://postgrest.org/en/stable/auth.html#authentication-seq...

[2]: https://postgrest.org/en/stable/schema_structure.html#schema...

Yeah a number of pieces of software people use after a 5 minute tutorial are like this, a ton of depth that we frequently recreate crappy versions of within our apps.

The Apache webserver (and most other webservers) is another example like Postgres where it has a ton of useful features no one uses.

Some of these less popular features don't work at massive scale but in my experience they work great for smaller teams where adding a new tech to the stack is more painful.

> I've also been looking into how to "branch" a database instead if just doing a backup, it does require some ZFS tricks but that is also just pure power and not something you find with a cloud file system.

In the cloud you can just snapshot/clone the underlying EBS volume and you’ll get a “branch” of any database on any file system, right?

Assuming no-one is using the database yes. Also not trivial to set up and use plus vm startup is somewhat slow.

If your want a full featured product (in the postgres space) that does it then look at neon.tech (no affiliation).

Can you explain what you mean by "how to branch a database" and the ZFS trick?
When I read this is thought it was a cool idea. I suspect it involves using ZFS’s file deduplication/copy-on-write mechanism to maintain a lightweight copy of a database on disk. Memory will still be required for holding the indexes etc, but in theory minimal extra disk space should be needed.

ZFS can also perform compression, which could be a big win as long as you have the spare cycles for lz4 (which is normally the case but may be an issue if you’re maxing out your NVMe disks’ bandwidth, say 3GiB/s+)