by RyanGoosling 1854 days ago
Yes, you're wrong. You're wrong because you need to JOIN a massive tree of blocks, to form the graph the author is referring to.

You can break out the "block" model into several tables and represent it in a relational database that way.

NoSQL = NO JOIN?

Hope that helps.

3 comments

I don’t actually see a graph represented anywhere in the article. The author mentions wanting a graph at the start, but the only thing I see described is trees of nested blocks. Even the properties list seems to be a grab-bag of KV pairs that gets permanently attached to a block once initialized, to support round-tripping.

Which is pretty much the ideal scenario for a document store. The article describes Notion as being very strictly hierarchical.

A block has many properties. A property has a name, and a value.

The underlying persisted data doesn't necessarily have to be a bag of KV pairs.

A block is related to its parent and descendant blocks.

These relations are suitably represented in a relational database, not a document store.

EDIT: In graph theory, a tree is an undirected, connected and acyclic graph.
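
For illustration, the parent/descendant relation described above can be stored as flat rows with a self-referencing parent id (an adjacency list). This is only a sketch under assumptions: `BlockRow`, `indexByParent`, `descendants`, and the sample ids are hypothetical names, not Notion's schema.

```typescript
// Hypothetical row shape: one row per block, as in a `blocks` table
// with a self-referencing parent id column.
interface BlockRow {
  id: string;
  parentId: string | null; // null marks the root block
  type: string;
}

// Group rows by parent id so children can be looked up directly.
function indexByParent(rows: BlockRow[]): Map<string | null, BlockRow[]> {
  const byParent = new Map<string | null, BlockRow[]>();
  for (const row of rows) {
    const siblings = byParent.get(row.parentId) ?? [];
    siblings.push(row);
    byParent.set(row.parentId, siblings);
  }
  return byParent;
}

// Collect the ids of every descendant of `rootId`, walking depth-first.
function descendants(rows: BlockRow[], rootId: string): string[] {
  const byParent = indexByParent(rows);
  const out: string[] = [];
  const stack: string[] = [rootId];
  while (stack.length > 0) {
    const current = stack.pop()!;
    for (const child of byParent.get(current) ?? []) {
      out.push(child.id);
      stack.push(child.id);
    }
  }
  return out;
}

// Sample data (made up for this example).
const rows: BlockRow[] = [
  { id: "page", parentId: null, type: "page" },
  { id: "todo", parentId: "page", type: "to_do" },
  { id: "text", parentId: "todo", type: "text" },
];

const below = descendants(rows, "page");
```

The same rows could equally be nested into a single document; the sketch only shows that the relation itself is a plain parent pointer.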

A document store is basically optimized for specifically hierarchical data, i.e. a tree. The data structure you’re describing, and what the article describes, is precisely that: a tree.

When comparing a document store with an RDBMS in terms of suitability, the distinction is primarily a tree versus an arbitrary graph. By that I mean an RDBMS is more powerful and more general, but not inherently as optimal in performance, “scalability”, or UX in the places where a document store makes sense.

More specifically, the way the article describes it, you’re not interested in “give me every block of type X” — you’re only interested in “given block Y, what type is it?”.

That is, the question is one-way, and fits cleanly in the hierarchical format of a document store.

The only question posed that operates in the reverse direction is permissions, though even that’s a little odd, since it seems to me it should only go “downwards” as well: a block’s permission scope is the sum of the scopes of all its ancestors, and you can store it on the block as you iterate down.
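
A minimal sketch of that downward permission pass, assuming each block carries its own list of direct grants. The `PermBlock` shape, the `grants` field, and the user ids are hypothetical and not taken from the article.

```typescript
// Hypothetical nested block with grants made directly on it.
interface PermBlock {
  id: string;
  grants: string[]; // user ids granted access on this block itself
  children: PermBlock[];
}

// Walk the tree top-down; each block's scope is its own grants
// plus everything inherited from its ancestors.
function effectivePermissions(root: PermBlock): Map<string, Set<string>> {
  const result = new Map<string, Set<string>>();
  const visit = (block: PermBlock, inherited: Set<string>): void => {
    const scope = new Set<string>(inherited);
    for (const user of block.grants) scope.add(user);
    result.set(block.id, scope);
    for (const child of block.children) visit(child, scope);
  };
  visit(root, new Set<string>());
  return result;
}

// Sample tree (made up for this example).
const workspace: PermBlock = {
  id: "page",
  grants: ["alice"],
  children: [
    { id: "section", grants: ["bob"], children: [
      { id: "note", grants: [], children: [] },
    ] },
  ],
};

const scopes = effectivePermissions(workspace);
```

Each block is visited once, and no lookup ever has to travel back up the tree.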

> The underlying persisted data doesn't necessarily have to be a bag of KV pairs.

It doesn’t have to be... but it can be, and appears to be.

> A block is related to its parent and descendant blocks.

Right; the singular parent, and the multiple children. A tree.

> In graph theory, a tree is an undirected, connected and acyclic graph.

When discussing trees and graphs, I think it’s obvious a distinction is being made between a graph forming a tree and a graph forming a non-tree (something more complex than a tree). When I say that a square is easier to encode than a rectangle, I do not mean that a square is not a rectangle, but that a rectangle is not a square, and that a square’s more specific properties give us the opportunity to simplify/optimize (I only need to store one length to represent it).

A database can encode a tree just fine, but that doesn’t mean it’s the best tool to do so.

There are other properties to a document store I don’t care for, and I don’t like them in general (like the implicit schema, and total lack of data consistency validation by the data store, and the fact that you often don’t truly have a tree), but representing a tree is what’s been described, and it’s exactly what they’re specialized for.

If you want to argue against it, you need to specify why you think this isn’t a tree, because I feel it’s quite obvious it is.

The behaviour demonstrated at

https://www.notion.so/Tree-breaking-3a90e2bcd2154f4fab06a3c7...

Breaks the tree model, as 'Complete Task' would need to have both 'Subtasks' and the page itself as its direct parent.

That said… it's mostly a tree, and there may be merit to optimising for that access pattern.
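
The tree-breaking case above can be stated as a check: in a tree every node has at most one direct parent. The sketch below flags nodes that violate this. The `Edge` type and the edge list are hypothetical, modelled only on the names mentioned in the comment, not on the linked page's actual data.

```typescript
// A parent -> child relation between two named blocks.
type Edge = { parent: string; child: string };

// Return every child that is attached to more than one distinct parent.
function nodesWithMultipleParents(edges: Edge[]): string[] {
  const parentsOf = new Map<string, Set<string>>();
  for (const { parent, child } of edges) {
    const parents = parentsOf.get(child) ?? new Set<string>();
    parents.add(parent);
    parentsOf.set(child, parents);
  }
  const offenders: string[] = [];
  parentsOf.forEach((parents, child) => {
    if (parents.size > 1) offenders.push(child);
  });
  return offenders;
}

// Assumed shape of the example: 'Complete Task' hangs off both
// 'Subtasks' and the page itself.
const edges: Edge[] = [
  { parent: "Page", child: "Subtasks" },
  { parent: "Subtasks", child: "Complete Task" },
  { parent: "Page", child: "Complete Task" },
];

const offenders = nodesWithMultipleParents(edges);
```

If the list comes back empty, the single-parent property holds for the given edges; a full tree check would additionally need to verify connectivity and the absence of cycles.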

@setr explained it really well. A side note: NoSQL also includes graph databases, which are dedicated to this type of node/relationship traversal.

We don't use JOIN for the content tree; I don't think I've seen one in any of our queries.

What do your queries look like? Are you using an ORM?

We don't use an ORM. Notion's codebase on the back-end is much more functional than object-oriented, in the sense that we have much more code that looks like `transformTheData(theData, theChangeToMake): ResultingData` than we have classes or methods.

We do lean very heavily on the TypeScript type system and try to make invalid states unrepresentable.
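
One common way to make invalid states unrepresentable in TypeScript is a discriminated union. The sketch below is an illustration only: the `TypedBlock` variants and their fields are hypothetical, not Notion's actual types.

```typescript
// Each variant lists exactly the fields that make sense for it, so a
// "text" block with a `checked` flag cannot be constructed at all.
type TypedBlock =
  | { type: "text"; id: string; text: string }
  | { type: "to_do"; id: string; text: string; checked: boolean };

// The switch narrows `block` per case; under strict compiler settings,
// adding a new variant without handling it here is a compile-time error.
function render(block: TypedBlock): string {
  switch (block.type) {
    case "text":
      return block.text;
    case "to_do":
      return (block.checked ? "[x] " : "[ ] ") + block.text;
  }
}

const task: TypedBlock = { type: "to_do", id: "1", text: "Ship it", checked: true };
const line = render(task);
```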

Have you tried "data last" FP, like `transformTheData(theChangeToMake, theData): ResultingData`, instead? I learned this from Ramda; it makes it way easier to leverage currying, e.g. `change = transformTheData(theChangeToMake); change(theData)`
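
A hand-rolled sketch of the data-last, curried shape, without depending on Ramda. The `Task` and `TaskChange` types are made up for the example; only the `transformTheData` name comes from the thread.

```typescript
// Hypothetical data and change shapes.
interface Task { title: string; done: boolean }
type TaskChange = Partial<Task>;

// Data-last and curried: supply the change first, get back a reusable
// function that applies it to any piece of data without mutating it.
const transformTheData =
  (theChangeToMake: TaskChange) =>
  (theData: Task): Task => ({ ...theData, ...theChangeToMake });

const markDone = transformTheData({ done: true });

const tasks: Task[] = [
  { title: "write docs", done: false },
  { title: "ship", done: false },
];

// The partially applied function slots straight into map.
const finished = tasks.map(markDone);
```

Because the data comes last, `markDone` can be passed around on its own; with the data-first signature the same call would need a wrapping arrow function.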