Hacker News new | ask | show | jobs
by gunnarmorling 1716 days ago
Nice write-up! Two questions:

- Can you share details on the routing? I.e. how does the app know which database + schema it needs to go to for given workspace?

- Did you consider using several databases on the same Postgres host (instead of schemas within a single database)? Not sure what's better really, curious whether you have any thoughts about it.

Thanks!

2 comments

> Can you share details on the routing?

All in the application layer! All of our server code runs from the same repo, and every Postgres query gets routed through the same module. This means that it was relatively easy to add a required "shard key" argument to all of our existing queries, and then within our Postgres module consult an in-app mapping between shard key range and DB+schema.

Plumbing that shard key argument through the application was more difficult, but luckily possible due to the hierarchical nature[0] of our data model.

> Did you consider using several databases on the same Postgres host

If I recall correctly, you cannot use a single client connection to connect to multiple databases on the same host, and so this could have ballooned our connection counts across the application. This is not something we explored too deeply though, would love to hear about potential benefits of splitting tables in this way.

[0] https://www.notion.so/blog/data-model-behind-notion

Ah yes, good point about connections being DB-specific. The schema approach seems more light-weight in that regard. Thanks!
Followup question: does the sharding happen within the app that talks to the DB, or do you shard traffic before it hits them? In the former case the total number of DB connections required presumably grows something like n^2.