| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by jandrewrogers 386 days ago
	This doesn’t work quite as well as people assume. The first limit is simply size, you can only cram a few petabytes of NVMe in a server (before any redundancy) and many operational analytic workloads are quite a bit larger these days. One of the major advantages of disaggregated storage in theory is that it allows you to completely remove size limits. Many operational analytic workloads don’t need a lot of compute, just sparse on-demand access to vast amounts of data. With good selectivity (another open issue), you could get excellent performance in this configuration. Ignoring the storage size limits, the real issue as you scale up is that the I/O schedulers, caching, and low-level storage engine mechanics in a large SQL database are not designed to operate efficiently on storage volumes this large. They will work technically, but scale quite a bit more poorly than people expect. The internals of SQL databases are (sensibly) optimized for working sets no larger than 10x RAM size, regardless of the storage size. This turns out to be the more practical limit for analytics in a scale-up system even if you have a JBOD of fast NVMe at your disposal.

2 comments

layer8 386 days ago

> many operational analytic workloads are quite a bit larger these days.

What are the use cases where such workloads come up, aside from Google-level operations? Just trying to understand what we are talking about.

link

jandrewrogers 386 days ago

Sensor and telemetry analytics workloads in boring industrial sectors are all at this scale, even at companies that aren’t that large revenue-wise. TBs to PBs of new data per day.

link

layer8 386 days ago

What are these used for, to all have to be in a single unified database?

link

jandrewrogers 386 days ago

A large part of those workloads is stitching together a single derived model of operational reality and how different entities interact over time from the samples you get from each individual source. You need a running log of all entity behavior and interactions over time to look back on in order to contextualize what you see at the current point in time. Most of this is not pre-computable because the combinatorial state space is too large so every analytic query needs to be able to see across every relationship between sources that can be inferred.

It is essentially a spatial and/or graph analytic model evolving over time. Any non-trivial data models that capture dynamics in the physical world looks like this.

In fairness, all popular analytics platforms handle these workloads poorly regardless of if they are vertically or horizontally scaled. These workloads usually cannot be cached even in theory, so performance and scalability comes down to the sophistication of your scheduler design.

link

layer8 386 days ago

Thanks. It would be interesting to talk about the business specifics, but that would move into confidential territory I guess.

link

sgarland 386 days ago

It works to a certain point, yes, but I daresay that the overwhelming majority of OLTP needs are in the <= TB range, not PB. OLAP is its own beast, though I'll also say that most modern tech companies' schema is hot garbage, full of denormalized tables for no good reason, JSON everywhere, etc. and thus the entire thing could be much, much smaller if RDMBS was used as it was intended: relationally.

link