by mbell
1712 days ago
Notion seems like an interesting data storage problem. The vast majority of the data is in the `block` entity; blocks are organized into a sort of nested set and are written/updated individually (a user edits one at a time) but read in ranged chunks (a doc). Offhand this seems like an almost worst case for PG. Since updates to blocks could contain large data (causing them to be moved often) and there is one big table, it seems likely that the blocks for a single Notion document will end up non-contiguous on disk and thus require a lot of IO/memory thrashing to read back out. PG doesn't have a way to tell it how to organize data on disk, so there is no good way around this (CLUSTER doesn't count: it's a one-time reorder that isn't maintained on later writes and takes an exclusive lock on the table while it runs, so it's unusable in most use cases). Armchair engineering of course, but my first thought would be to find another storage system for blocks that better fits the use case and leave the rest in PG. This does introduce other problems, but it just feels like storing data like this in PG is a bad fit. Maybe storing an entire doc's worth of block entities in a jsonb column would avoid a lot of this?
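To make the locality worry a bit more concrete, here is a toy Python model. It is only a back-of-the-envelope sketch, not how PG's heap actually behaves: the doc counts, the 50-blocks-per-page figure, and the function names are all made up for illustration. It compares how many pages you'd have to touch to read one doc when blocks land on disk in interleaved write order versus when each doc's blocks sit together.

```python
import random


def pages_touched(layout, doc_id, blocks_per_page):
    """Count the distinct pages that hold at least one block of doc_id."""
    return len({pos // blocks_per_page
                for pos, owner in enumerate(layout) if owner == doc_id})


def simulate(num_docs=200, blocks_per_doc=50, blocks_per_page=50, seed=0):
    """Return (scattered, clustered) page counts for reading doc 0.

    All parameters are hypothetical; this models a flat array of slots,
    not a real storage engine.
    """
    rng = random.Random(seed)
    # One slot per block, labelled with the doc that owns it.
    layout = [d for d in range(num_docs) for _ in range(blocks_per_doc)]
    # Scattered: writes from many docs interleave, so placement is
    # effectively random with respect to doc membership.
    scattered_layout = layout[:]
    rng.shuffle(scattered_layout)
    # Clustered: every doc's blocks are stored contiguously.
    clustered_layout = sorted(layout)
    return (pages_touched(scattered_layout, 0, blocks_per_page),
            pages_touched(clustered_layout, 0, blocks_per_page))


if __name__ == "__main__":
    scattered, clustered = simulate()
    print(f"scattered: {scattered} pages, clustered: {clustered} page(s)")
```

In this model the clustered layout reads doc 0 from a single page, while the scattered layout spreads the same 50 blocks over many pages. That gap is roughly the effect the jsonb-per-doc idea is trying to buy back.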