by greenyoda 2923 days ago
> I really wonder who, today, have data that cannot fit in RAM, apart from big actors.

Keeping all your data in RAM has significant problems, even if it all fits. For example, would you want to lose all your customers' orders and billing information if your code crashed?

In addition to the relational database model, SQL databases offer ACID transactions, which are useful if you want to have consistent and reliable data:

https://en.wikipedia.org/wiki/ACID
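
To make the atomicity part concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `accounts` table, the `transfer` helper, and the account names are made up for illustration; the only point is that a transaction that fails halfway leaves no partial update behind.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst` as one atomic unit (hypothetical helper)."""
    # `with conn` commits if the block succeeds and rolls back if it raises.
    with conn:
        # Credit first, so a failing debit has something to undo.
        conn.execute(
            "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
        )
        conn.execute(
            "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
        )

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
with conn:
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 0)]
    )

transfer(conn, "alice", "bob", 30)

try:
    # The debit would push alice below zero, so the CHECK constraint fails
    # and the credit already applied to bob is rolled back with it.
    transfer(conn, "alice", "bob", 500)
except sqlite3.IntegrityError:
    pass

balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(balances)
```

After the failed second transfer, only the first one is visible: bob never keeps the 500 that was briefly credited to him inside the aborted transaction.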


To be fair, using Redis or Elasticsearch as a main datastore is doable, although I'm not sure they're any easier to understand than a SQL database.

You could summon Antirez I guess

Doesn't Redis have recovery via the filesystem enabled by default?

It does, but depending on how often it syncs changes to disk, you can still lose data.

> For example, would you want to lose all your customers' orders and billing information if your code crashed?

There are things like write-ahead logs (WAL) and snapshots. Having your dataset in RAM and querying it directly doesn't exclude persisting it to disk. Read Stonebraker's "The End of an Architectural Era"[0]. Basically the OP is right that SQL DBs were designed on the assumption that RAM was scarce, and that assumption is no longer valid. They are inefficient for every common use case, by at least an order of magnitude.

[0]: http://cs-www.cs.yale.edu/homes/dna/papers/vldb07hstore.pdf
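
A rough sketch of what "in RAM, but still persisted" can look like, in Python. This is not how any particular database implements it; `TinyStore`, the file names, and the JSON record format are all assumptions chosen to keep the example short. Every write is appended to a log and fsynced before the in-memory dict changes, reads are served from memory only, and a snapshot lets the log be truncated.

```python
import json
import os
import tempfile

class TinyStore:
    """In-memory dict made durable by a write-ahead log plus snapshots (toy example)."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "wal.log")
        self.data = {}
        self._recover()
        self.log = open(self.log_path, "a", encoding="utf-8")

    def _recover(self):
        # Load the last snapshot, then replay every write logged after it.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path, encoding="utf-8") as f:
                self.data = json.load(f)
        if os.path.exists(self.log_path):
            with open(self.log_path, encoding="utf-8") as f:
                for line in f:
                    try:
                        key, value = json.loads(line)
                    except ValueError:
                        break  # torn final record from a crash mid-write
                    self.data[key] = value

    def put(self, key, value):
        # Log first and force it to disk; only then change the in-memory state.
        self.log.write(json.dumps([key, value]) + "\n")
        self.log.flush()
        os.fsync(self.log.fileno())
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)  # reads never touch the disk

    def snapshot(self):
        # Write the full state to a temp file, atomically swap it in,
        # then start a fresh log. Replaying an old log over a new snapshot
        # is harmless here because puts simply overwrite.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w", encoding="utf-8") as f:
            json.dump(self.data, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        self.log.close()
        self.log = open(self.log_path, "w", encoding="utf-8")

    def close(self):
        self.log.close()

with tempfile.TemporaryDirectory() as d:
    store = TinyStore(d)
    store.put("order:1", {"total": 42})
    store.snapshot()
    store.put("order:2", {"total": 7})  # lives only in the log, not the snapshot
    store.close()

    # A new process starting from the same directory rebuilds the state.
    recovered = TinyStore(d)
    result = (recovered.get("order:1"), recovered.get("order:2"))
    recovered.close()

print(result)
```

Calling `os.fsync` on every write is the safe but slow setting. Batching the syncs buys throughput at the cost of a window in which acknowledged writes can be lost, which is the same trade-off raised about Redis above.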

> Read Stonebraker's "The End of an Architectural Era"

I tried, but it lost me in section 2.3:

>It seems plausible that the next decade will bring domination by shared-nothing computer systems, often called grid computing or blade computing.

No, it doesn't seem plausible at all. This has been, by some accounts, the future of computing, since at least the 80s.

https://en.wikipedia.org/wiki/Transputer

But shared-nothing is just too darned hard to program for.

Also, main memory is still scarce. We're only just reaching the 1TB size of some of the databases the paper mentions, and those are small by today's standards. Ironically, the paper seemed to emphasize traditional business database needs over what might happen in the tech industry itself, which has turned out to be the main driving force behind database usage (and data creation).

> exclude

As a non-native speaker, I think that preclude is the word you're looking for. Not disagreeing with what you say.