by mjevans 449 days ago
K:V - maps / dictionaries can be the correct tool for some jobs.

I think I'd prefer to stop calling _large_ resources that are only K:V a 'database' though.

A 'database' shouldn't require SQL, but a distributed filesystem, however similar, isn't quite a database.


Rebuttal: a filesystem is a database.
I was immediately offended by your rebuttal and gave it some thought. It is an interesting definitional boundary.

Perhaps the distinction is more pragmatic than fundamentally technical. We typically use the term "database" to describe systems designed primarily for structured data management with query capabilities, while filesystems optimize for hierarchical storage of opaque binary objects.

Therefore a KV store is not a database either.

Filesystems are more like a graph database than a pure key value store.

I think 'database' is a term with multiple (related) meanings depending on context.

Another example is the term 'colour'. Depending on context, it sometimes makes sense to call black and white and grey 'colours', and sometimes it's better to treat them as something else.

Filesystems are structured and have queries.
On "IBM i", all filesystem units are also objects. They have relationships, as well as hierarchy.
Does that make Postgres not a database too? It can store binary blobs.
There is more to the above comment than just "opaque binary objects".
Storage Engine (using a store type) -> Data Store (ordered by data model) -> Database (providing semantics & management)

k->v is a data store (using disk|inmem|networked storage engines).

A database is a complete system for management of data. They come (or used to come) in various data model flavors: hierarchical, graph, relational, etc.
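To make the layering above concrete, here is a rough sketch of the three levels. All class names (`MemoryEngine`, `KVStore`, `TinyDatabase`) are made up for illustration, and the "database" layer only adds one secondary index, which is far less than a real system provides.

```python
import json


class MemoryEngine:
    """Storage engine (hypothetical): raw bytes in, raw bytes out."""

    def __init__(self):
        self._blobs = {}

    def write(self, key: bytes, value: bytes) -> None:
        self._blobs[key] = value

    def read(self, key: bytes):
        return self._blobs.get(key)


class KVStore:
    """Data store: imposes a k->v data model (string keys, JSON values)."""

    def __init__(self, engine):
        self._engine = engine

    def put(self, key: str, value: dict) -> None:
        self._engine.write(key.encode(), json.dumps(value).encode())

    def get(self, key: str):
        raw = self._engine.read(key.encode())
        return None if raw is None else json.loads(raw)


class TinyDatabase:
    """Database: adds management on top, here one secondary index and a query."""

    def __init__(self, store, indexed_field: str):
        self._store = store
        self._field = indexed_field
        self._index = {}  # field value -> set of primary keys

    def insert(self, key: str, record: dict) -> None:
        old = self._store.get(key)
        if old is not None:
            # Keep the index consistent when a record is overwritten.
            self._index.get(old.get(self._field), set()).discard(key)
        self._store.put(key, record)
        self._index.setdefault(record.get(self._field), set()).add(key)

    def find_by(self, value):
        # Answered from the index, not by scanning every record.
        return [self._store.get(k) for k in sorted(self._index.get(value, ()))]
```

The engine could be swapped for a disk or networked one without touching the layers above it, which is roughly the point of the comment.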

Huh? Offended?!

Filesystems present a durable way to store hierarchical binary/textual data. They normally have a very well-defined API that serves as a primitive query language. Sounds a lot like a database, no?
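As a small illustration of the "primitive query language" point, a recursive glob over a directory tree behaves much like a query against a hierarchical store. The helper name `query` and the sample file layout are invented for this sketch; only the standard `pathlib` and `tempfile` calls are real.

```python
import tempfile
from pathlib import Path


def query(root: Path, pattern: str) -> list:
    """Return paths (relative to root) of all files matching a glob pattern."""
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob(pattern)
        if p.is_file()
    )


def build_sample_tree(root: Path) -> None:
    """Create a tiny hierarchy to run queries against (hypothetical layout)."""
    (root / "docs" / "old").mkdir(parents=True)
    (root / "src").mkdir()
    (root / "docs" / "readme.txt").write_text("hello")
    (root / "docs" / "old" / "notes.txt").write_text("archived")
    (root / "src" / "main.py").write_text("print('hi')")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        build_sample_tree(root)
        print(query(root, "*.txt"))
```

The path plays the role of the key, the directory hierarchy is the data model, and the glob is the (very limited) query.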

Even internally they are very similar: journalling, paging, tree indexes are normally present in typical popular implementations.

In some classic OSes there is no separation at all between the concepts of a database and a filesystem.

In a way, a generic durable database can be thought of as a special kind of filesystem. And vice versa.

Related philosophical question - is a spreadsheet a database?

It's certainly a base for data and can be used to implement all core concepts of relational calculus but it isn't designed for such and doesn't do so with performance in mind. Conversely, filesystems are often implemented using B-trees as many RDBMSes are but aren't designed for many of the operations one might typically ascribe to a database.

Nomenclature is tricky... how does that saying about the two hardest problems in CS go again?

> Conversely, filesystems are often implemented using B-trees as many RDBMSes are but aren't designed for many of the operations one might typically ascribe to a database.

Filesystems are not a relational database, sure, but the word "database" in the context of computer systems, computer science, IT and technology in general doesn't mean "relational database".

Filesystems are definitely a hierarchical database, which is different to a relational database.

I would venture to say that anything that can be queried in a way more efficient than a full scan is a database, or can be used as a database. A spreadsheet, a filesystem, a hash table (aka "KV store"), even a sorted list, like a library catalog.
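The sorted-list case is easy to sketch: keeping the catalog sorted lets a lookup use binary search instead of checking every entry. The function names and the sample titles below are made up for the example.

```python
from bisect import bisect_left


def full_scan(catalog, title):
    """Check every entry in order: O(n) comparisons."""
    for i, entry in enumerate(catalog):
        if entry == title:
            return i
    return -1


def sorted_lookup(sorted_catalog, title):
    """Binary search over a sorted list: O(log n) comparisons."""
    i = bisect_left(sorted_catalog, title)
    if i < len(sorted_catalog) and sorted_catalog[i] == title:
        return i
    return -1


# Hypothetical catalog; sorting it once is what makes the fast lookup possible.
catalog = sorted(["Dune", "Emma", "Ulysses", "Beloved", "Walden"])
```

Both functions return the same position (or -1 when the title is absent); the only difference is how much of the list they have to touch.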
Spreadsheets: no index, not a database.