Hacker News new | ask | show | jobs
by hinkley 1192 days ago
I can guarantee it’ll never be able to support infinity records. I can also guarantee that most data structures can’t even support a googol of records and never will. Some won’t ever support a quintillion records due to complexity, and quantum won’t fix all of those. There may be structures that support this, but they aren’t the ones we use today.

And the reason is that complexity theory is on shaky ground these days and needs a revision. Forty years and six orders of magnitude ago we treated a bunch of operations as constant , which is a clever fiction that is blatantly obvious now,but some of you knuckleheads continue to not look at. Most C terms are in fact stair-stepped log(n). Every time log(n) doubles and crosses a threshold, the cost of the operation doubles, and therefore is in log(n), not C. When you’re talking billions of records that distinction is already fairly obvious, and people dealing with trillions have to architect for it.

Memory access, as it turns out, is sqrt(n), not C, and with magic hardware might some day reach the cube root of n, and don’t count on it getting any better until we can bend space. With that sort of memory access a lot of O(1) algorithms perform worst than log(n) operations.

As you go into 128 bit addressing the difference between nlogn and n^3/2 * log(n)² becomes impractical. There’s a wall there, and we won’t get past it without completely different algorithms or desktop FTL technology.

The next order of magnitude of n will require a complete rewrite of that section of the code, so in practical terms the n only goes up in a new program, not the old one, and therefore the old program cannot and never will handle a thousand times more data than it was designed for.

2 comments

You're getting deep into the realm of data sets that don't fit into a computer. I don't think you're talking about solving the same problems.

> Forty years and six orders of magnitude ago we treated a bunch of operations as constant , which is a clever fiction that is blatantly obvious now,but some of you knuckleheads continue to not look at. Most C terms are in fact stair-stepped log(n). Every time log(n) doubles and crosses a threshold, the cost of the operation doubles, and therefore is in log(n), not C.

Memory latency hasn't really budged in 25 years and is significantly faster than it was 40 years ago.

Pointers have gotten bigger, but they're not that big compared to data and they're only one notch away from the maximum you could ever use in a single computer.

Where are we seeing these slowdowns?

> I can also guarantee that most data structures can’t even support a googol of records and never will. Some won’t ever support a quintillion records due to complexity

A B-tree will do fine.

> Memory access, as it turns out, is sqrt(n)

Not in real single computers.

And if our baseline is 1983 latency, it would take more than unrealistic amounts of circuits and size for the speed-of-light delays to drag modern tech down to equal it, let alone be worse.

I agree that actual infinity is impractical, but “zero, one, and out of memory” gives the wrong idea, because it implies you might limit things to the amount of memory currently available or available soon. "Infinity" is better at getting across the idea that your code is not allowed to bake in any limits.

For giggles I did the math.

n^3/2 * log(n)²

The time required for 1000n is 3 million times more than n. If anyone makes a computer that is 6 orders of magnitude faster than current computers, none of us here will be alive to see it.