Hacker News new | ask | show | jobs
by crooked-v 1124 days ago
At its core it's a pretty simple multi-tenancy arrangement. Think something like this:

    tenants (id, updated_at)
    tenants_users (id, updated_at, tenant_id, user_id)
    products (id, updated_at, name, tenant_id)
    product_variants (id, updated_at, product_id, name)
One of the tenants views a page that does a simple `SELECT * FROM products ORDER BY updated_at LIMIT 100`. The RLS checks have to reference `products` -> `tenants` -> `tenant_users`, but because of how Postgres does it, every row in products will be checked no matter what you do. (Putting a WHERE clause on the initial query to limit based on tenant or user is pointless, because it'll do the RLS checks before the WHERE clause is applied.) Joins in RLS policies are awful for performance, so your best bet is an IN clause with the cached subquery function, in which case it's still then got the overhead of getting the big blob of IDs and then checking it against every row in `products`.
1 comments

Do you have an index on `updated_at` ?
Yes. That's also irrelevant to the cause of the performance issues, which all happen before the ORDER BY and LIMIT even come into the picture in Postgres' query optimization.

Edit: To give a better idea of the impact of RLS here, writing up an equivalent query outside of the RLS context [1] has an under-1-second response time, where RLS turns that into 10x the time even in the most optimized case.

[1]: This kind of thing, roughly:

    SELECT *
    FROM products
    JOIN tenants ON products.tenant_id = tenants.id
    JOIN tenants_users ON tenants.id = tenants_users.tenant_id
    WHERE tenant_users.user_id = auth.uid()
    ORDER BY updated_at
    LIMIT 100