Hacker News new | ask | show | jobs
by shaicoleman 1426 days ago
We're currently using the schema-per-tenant, and it's working very well for us:

* No extra operational overhead, it's just one database

* Allows to delete a single schema, useful for GDPR compliance

* Allows to easily backup/restore a single schema

* Easier to view and reason about the data from an admin point of view

* An issue in a single tenant doesn't affect other tenants

* Downtime for maintenance is shorter (e.g. database migration, non-concurrent REINDEX, VACUUM FULL, etc.)

* Less chance of deadlocks, locking for updates, etc.

* Allows easier testing and development by subsetting tenants data

* Smaller indexes, more efficient joins, faster table scans, more optimal query plans, etc. With row level security, every index needs to be a compound index

* Easy path to sharding per tenant if needed. Just move some schemas to a different DB

* Allows to have shared data and per-tenant data on the same database. That doesn't work with the tenant-per-database approach

There are a few cons, but they are pretty minor compared to the alternative approaches:

* A bit more code to deal in the tenancy, migrations, etc. We opted to write our own code rather than use an existing solution

* A bit more hassle when dealing with PostgreSQL extensions . It's best to install extensions into a separate extensions schema

* Possible caching bugs so you need to namespace the cache, and clear the query cache when switching tenant

* The security guarantees of per tenant solution aren't perfect, so you need to ensure you have no SQL injection vulnerabilities

2 comments

We ran a multi-tenant SaaS product for years w/ a schema-per-tenant approach. For the most part it all worked pretty great.

We ran into issues here and there but always found a way to work around them:

* Incremental backups were a pain because of needing to lock so many objects (# of schemas X # of tables per schema).

* The extra code to deal w/ migrations was kinda messy (as you mentioned).

* Globally unique IDs become the combination of the row ID + the tenant ID, etc...

For us though the real deal-breaker turned out to be that we wanted to have real foreign keys pointing to individual rows in tenant schemas from outside of the tenant schema and we couldn't. No way to fix that one since with multi-schema the "tenant" relies on DB metadata (the schema name).

We ended up migrating the whole app to RLS (which itself was a pretty interesting journey). We were afraid of performance issues since the multi-schema approach kinda gives you partitioning for free, but with the index usage on the RLS constraints we've had great performance (at least for our use case!).

After quite a bit of time working with both multi-schema & RLS I probably wouldn't go back to multi-schema unless I had a real compelling reason to do so due to the added complexity. I really liked the multi-schema approach, and I think most of the critiques of it I found were relatively easy to work around, but RLS has been a lot simpler for us.

Would you like to explain the deal-breaker a little bit more? I do not understand the limitation you hit. It seems like it is no problem to access foreign keys in a different schema as long permissions do allow that. Thanks!
There are some other cons:

Memory usage and I/O can be less efficient. Postgres handles table data in 8kb pages, so even if you're just reading a single row, that reads 8kb from disk and puts 8kb in the Postgres buffer cache, with that row and whatever happens to be next to it in the physical layout of the underlying table. Postgres does this because of locality of reference: it's cheaper to bulk-load data from disk, and, statistically speaking, you may need the adjacent data soon. If each user is touching separate tables, you're loading a page per row for each user, and you're missing out on some of the locality benefits.

Another problem is monitoring (disclosure: I work for pganalyze, which offers a Postgres monitoring service). The pg_stat_statements extension can track execution stats of all normalized queries in your database, and that's a very useful tool to find and address performance problems. But whereas queries like "SELECT * FROM posts WHERE user_id = 123" and "SELECT * FROM posts WHERE user_id = 345" normalize to the same thing, schema-qualified queries like "SELECT * FROM user_123.posts" and "SELECT * FROM user_345.posts" normalize to different things, so you cannot easily consider their performance in aggregate (not to mention bloating pg_stat_statements by tracking so many distinct query stats). This is the case even when you're using search_path so that your schema is not explicitly in your query text.

Also, performance of tools like pg_dump is not great with a ton of database objects (tables and schemas) and, e.g., you can run into max_locks_per_transaction [1] limits, and changing that requires a server restart.

I wouldn't say you should never do schema-based multi-tenancy (you point out some good advantages above), but I'd be extremely skeptical of using it in situations where you expect to have a lot of users.

[1]: https://www.postgresql.org/docs/current/runtime-config-locks...