by kentonv 2085 days ago
> Would I need a hierarchy of durable objects to store a 100kb collaborative text document? What about 10MB?

The bottleneck is going to be more CPU than memory or storage size. So the question is, how many users are editing, what rate of events does each generate, and how complex is the event handler?

Let's say it's actually a plaintext editor, with the 10MB text file represented in memory using a reasonable data structure allowing O(1) insertions and deletions. Clients send a stream of keystrokes to the server. Server writes document out to disk periodically, not on every keystroke. Then I would expect each keystroke to take less than 1ms of CPU time to process, therefore at least 1000 per second could be processed by one thread. Let's say people type 10 keystrokes per second, then you could have 100 users actively typing at once? This is just my intuition, though.
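The back-of-envelope estimate above can be written down as a tiny calculation. This is only a sketch of the arithmetic, assuming a single thread and the guessed figures from the paragraph (about 1ms of CPU per keystroke, 10 keystrokes per second per user); the function name is made up for illustration and is not part of any API.

```python
def max_concurrent_typists(cpu_ms_per_keystroke: float,
                           keystrokes_per_user_per_sec: float) -> int:
    """Rough upper bound on active typists one thread can serve."""
    # One thread has 1000 ms of CPU time available per wall-clock second.
    keystrokes_per_sec = 1000.0 / cpu_ms_per_keystroke
    # Divide that budget among users typing at the assumed rate.
    return int(keystrokes_per_sec // keystrokes_per_user_per_sec)


# Figures from the estimate: 1 ms per keystroke, 10 keystrokes/s per user.
print(max_concurrent_typists(1.0, 10.0))   # 100
# A cheaper handler (assumed 0.25 ms) raises the ceiling proportionally.
print(max_concurrent_typists(0.25, 10.0))  # 400
```

The real number depends entirely on how expensive the event handler turns out to be, so treat this as intuition, not a measured limit.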

> and then it might be better to just use a single point of failure like a database..

Incidentally databases will also hit scaling bottlenecks if you have too many requests hitting the same row. Under the hood, the database has to do exactly what Durable Objects do -- the row will be owned by one "chunk" which has to serialize all changes (making it effectively single-threaded).

So "use a database" doesn't necessarily solve your scaling bottleneck. In fact, it's likely to be worse, since the database chunk is not running app-specific optimized code.
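To illustrate the point about a hot row, here is a toy sketch in which one owner serializes every change to a single value. `RowOwner` and `hammer` are hypothetical names for illustration only; this is not how any particular database or Durable Objects is implemented, just the general shape of "one owner applies writes one at a time".

```python
import threading


class RowOwner:
    """Hypothetical stand-in for the single 'chunk' that owns one row."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.value = 0

    def apply(self, delta: int) -> None:
        # Every writer must take the same lock, so changes to this row
        # are applied strictly one after another.
        with self._lock:
            self.value += delta


def hammer(owner: RowOwner, writers: int, writes_each: int) -> int:
    """Run several concurrent writers against the same row."""

    def work() -> None:
        for _ in range(writes_each):
            owner.apply(1)

    threads = [threading.Thread(target=work) for _ in range(writers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return owner.value


# Adding writers does not add write capacity for this one row:
# all 8 * 1000 updates still pass through the same serialized section.
print(hammer(RowOwner(), writers=8, writes_each=1000))  # 8000
```

However many clients you add, throughput for that one row is capped by how fast the owner can apply a single change.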

1 comments

> So "use a database" doesn't necessarily solve your scaling bottleneck.

Absolutely, it won't :D

I'm just saying that if I do a postgres database on say heroku, I have a clue what it can handle. I get the specs, it has this much RAM, this many CPUs and this much storage.

I also know that I'll be the only user of said server.

With other hosted services like S3, dynamodb, azure tables, etc, the documentation features "scalability targets".

For S3 I can see how many reads / s a bucket can handle, and how it scales up. On azure tables I know a table can handle 100k writes / s (or so).

If I send a lot of messages to a single durable object, will it scale to the point where it gets a dedicated node? (What is the approximate definition of a node?) Will it migrate automatically? (What temporary degradation might I see?)

Can I have a 1GB durable object with 1 transaction every hour? What about a 100GB durable object with 1 transaction every day?

Can I have a 1MB durable object with 10 messages per minute?

Or is it measured in compute seconds, clock cycles?

Like, can I have a 10kb durable object with 10 messages / s, each using 0.5ms CPU time?

Similarly, how does it look when I increase CPU time per message, number of messages or size of the object, where are the limits?
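For what it's worth, the CPU side of that example can be sketched as a fraction of one thread, borrowing the single-threaded assumption from the parent comment. The function name is made up, and this says nothing about the actual limits, which is exactly what documentation would need to state.

```python
def thread_utilization(messages_per_sec: float,
                       cpu_ms_per_message: float) -> float:
    """Fraction of one thread's CPU time consumed by the message load."""
    # CPU milliseconds used per wall-clock second, out of 1000 available.
    return messages_per_sec * cpu_ms_per_message / 1000.0


# The example above: 10 messages/s at 0.5 ms of CPU each.
u = thread_utilization(10, 0.5)
print(f"{u:.1%}")  # 0.5%

# Increasing either factor scales the load linearly; a value of 1.0 or
# more would mean a single thread cannot keep up.
print(thread_utilization(1000, 2.0) >= 1.0)  # True
```

This covers only CPU time; it says nothing about how object size affects storage or migration, which would need separate published targets.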

From reading the documentation one might think one could store a 100GB git repository as a durable object. Sure each commit takes time, but I have few commits / hour.

(I assume a 100GB durable object won't work, but I can't see any limits or scalability targets in the documentation.)

Just to clarify, I don't expect an answer to the questions above; I expect the documentation to feature some "scalability targets" and "limits". Something that gives me an intuition about what a durable object can handle, so I know whether to shard my use-case or not :D

On topic: I think durable objects are really cool, message passing seems like the right model for scalable cloud computing.