by 4ndrewl 243 days ago
> Databases were made to solve one problem:
>
> "How do we store data persistently and then efficiently look it up later?"

Isn't that two problems?


It's amusing to me that this is really quite a pedantic observation yet it's driving very earnest engagement from Hacker News. Myself included. Absolutely nothing in this article is riding on whether it's 1 or 2 problems - it's an aside at best. Yet I'm still trying to think through whether it's 1 or 2. I mean, the "and" is right there - that clearly suggests two. It's almost comical even, to say "Here is one problem: X and Y." Yet in another way it seems like 2 sides of the same coin.

I guess there is a rather fine line between philosophy and pedantry.

Maybe we can think about it from another angle. If they are 2 problems databases were designed to solve, then that means this is a problem databases were designed to solve: storing data persistently.

Is that really a problem databases were designed to solve? Not really. We had that long before databases. It was already solved. It's a pretty fundamental computer operation. Isn't it fair to say this is one thing? "Storing data so it can be retrieved efficiently."

How do we reconstruct past memory states? That's the fundamental problem.

Efficiency of storage or retrieval, reliability against loss or corruption, security against unwanted disclosure or modification are all common concerns, and the relative values assigned to these features and others motivate database design.

> How do we reconstruct past memory states? That's the fundamental problem.

Reconstructing past memory states is rarely, if ever, a requirement that needs to be accommodated in the database layer.

Can you elaborate? That certainly seems to be what happens in a typical crud app. You have some model for your data which you persist so that it can be loaded later. Perhaps partially at times.

In another context perhaps you're ingesting data to be used in analytics, which seems to fit the "reconstruct past memory states" framing less.

Presumably the analysis will retrieve stored memory states from the ingestion phase to then perform useful calculation, or else why is there a database?
I always wanted to ship a write-only database. Lightning fast.
Back in the 80s a professor at our college got a presentation on the concept of «write-only memory» accepted for some symposium.

Good times.

Very secure!
Pretty much how eventstoredb works. Deleting data fully only happens at scavenge which rewrites the data files.
I think it was a joke. It sounds like you read it as append-only, like most LSM tree databases (not rewriting files in the course of write operations), but I think GP meant it as write-only to the exclusion of reads, roughly equivalent to `echo $data > /dev/null`
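To make the append-only vs. write-only distinction concrete, here is a rough Python sketch. The class names `AppendOnlyLog` and `WriteOnlyStore` are made up for illustration; this is not how EventStoreDB or any LSM engine is actually implemented, just the two behaviours side by side.

```python
import os


class AppendOnlyLog:
    """Append-only: existing bytes are never rewritten, but reads still work."""

    def __init__(self, path):
        self.path = path  # hypothetical log file location

    def write(self, record):
        # Mode "a" only ever adds to the end of the file.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(record + "\n")

    def read_all(self):
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return f.read().splitlines()


class WriteOnlyStore:
    """Write-only: accepts writes and discards them, like `> /dev/null`."""

    def write(self, record):
        # os.devnull is the platform's null device ("/dev/null" on Unix).
        with open(os.devnull, "w") as sink:
            sink.write(record)

    def read_all(self):
        raise NotImplementedError("write-only: nothing can be read back")
```

The first one is a real (if slow) database building block; the second is the joke.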
I've forgotten how to count that low. [0]

0 - https://www.youtube.com/watch?v=3t6L-FlfeaI

That would be useful for logging.
If it's write-only, and no reads ever happen, one can write to /dev/null without loss of utility.
It would be good for before going to sleep then.
Also useful for backups, so long as you don't need to restore.
It is a single problem that contains two smaller problems, but the actual hard part (a third problem, if you wish) is putting them together. If you limit yourself to solving those two problems independently you won't have a (useful) database.
You can decompose it into 2 problems, because, well, that is better, but it is in fact one. It can be argued that there is only this single problem:

How do we, in an ACID way, store data that will be efficiently looked up later by an unknown number of clients with unknown access patterns, concurrently, without blocking all the participants, and quickly?

And then add SQL (ouch!)

Off by 1 error is indeed a hard problem.
"Store data persistently so it can be looked up efficiently" sounds like a single problem.
Definitely two.
"Store data persistently" implies "it can be looked up" since if you cannot look it up it is impossible to know if it is stored persistently.

The "efficiently" part can be considered a separate problem though.

Well, if you just want to store data, you can use files. Lookup is a bit tedious and inefficient.

So, if we consider that persistent storage is a solved problem, then we can say that the reason for databases was how to look up data efficiently. In fact, that is why they were invented, even if persistent storage is a prerequisite.
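A minimal sketch of that point, assuming a tab-separated file with unique keys (the function names and file layout are hypothetical, not from the article): storing is just an append, a lookup without an index means reading the whole file, and a lookup with an index of byte offsets means one seek.

```python
import os


def append_record(path, key, value):
    """Append 'key<TAB>value' and return the byte offset where the record starts."""
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)  # current end of file
        f.write(f"{key}\t{value}\n".encode("utf-8"))
    return offset


def scan_lookup(path, key):
    """Plain-file lookup: read line by line until the key turns up, O(n)."""
    wanted = key.encode("utf-8")
    with open(path, "rb") as f:
        for line in f:
            k, _, v = line.rstrip(b"\n").partition(b"\t")
            if k == wanted:
                return v.decode("utf-8")
    return None


def indexed_lookup(path, index, key):
    """Index lookup: jump straight to the record via a key -> offset dict."""
    offset = index.get(key)
    if offset is None:
        return None
    with open(path, "rb") as f:
        f.seek(offset)
        line = f.readline()
    _, _, v = line.rstrip(b"\n").partition(b"\t")
    return v.decode("utf-8")
```

Usage would look something like `index[key] = append_record(path, key, value)` on every write. The index here lives in memory and would be lost on restart, which is roughly where the "putting them together" part of a real database begins.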

How about "store data in a certain way." That sounds more like 1 problem and encompasses an even larger problem space.
It’s not persistent if it can’t be recovered later
Puts message in a bottle and tosses into the most convenient black hole.
Doesn't the black hole compress the bottle beyond recovery?
>> "How do we store data persistently and then efficiently look it up later?"

> Isn't that two problems?

Only if you're creating a write-only database, in which case just write it to /dev/null.

This is analogous to an elevator that’s unidirectional
One that lets people enter. We will figure out exiting later, with exiting on a different floor as a stretch goal.
Or just a paternoster
> Isn't that two problems?

No, that would be regexes.

You're thinking of regex