Hacker News new | ask | show | jobs
by bassp 465 days ago
I agree with the author 100% (the TanTan anecdote is great, super clever work!), but.... sometimes you do need Redis, because Redis is the only production-ready "data structure server" I'm aware of

If you want to access a bloom filter, cuckoo filter, list, set, bitmap, etc... from multiple instances of the same service, Redis (slash valkey, memorydb, etc...) is really your only option

3 comments

Postgres has Bloom filters: https://www.postgresql.org/docs/current/bloom.html

It also has arrays, sets, and bitstrings, though for the latter you can just as easily (and with less space consumed) map it in your app, and store an integer.

Yes, while the default idea of Redis might be to consider it a key/value cache, the view of the project itself is definitely about being a "data structure server" - it's right at the top of the https://github.com/redis/redis/blob/unstable/README.md and antirez has focused on that (I can't find one quote I am looking for specifically but it's evident for example in discussion on streams https://antirez.com/news/114). Although I've definitely seen it be used just as a key/value store in the deployments I'm familiar with ¯\_(ツ)_/¯
All of those can be serialized and stored in an RDMS. You don't need Redis for that.
They can (and that's probably the right choice for a lot of use cases, especially for small data structures and infrequently updated ones), but serializing and storing them in a database requires you to (in your application code) implement synchronization logic and pay the performance cost for said logic; for instance, if you want to `append` to a shared list, you need to deserialize the list, append to the end of it in your application code, and write it back to the DB. You'd need use some form of locking to prevent appends from overwriting each other, incurring a pretty hefty perf penalty for hot lists. Also, reading an entire list/tree/set/whatever back just to add/delete one element is very wasteful (bandwidth/[de]serialization cost-wise)
> for instance, if you want to `append` to a shared list, you need to deserialize the list, append to the end of it in your application code, and write it back to the DB.

this seems like a classic case of impedance mismatch, trying to implement a Redis-ism using an RDBMS.

for a shared list in a relational database, you could implement it like you've said, using an array type or a jsonb column or whatever, and simulate how it works in Redis.

but to implement a "shared list" in a way that meshes well with the relational model...you could just have a table, and insert a row into the table. there's no need for a read-modify-write cycle like you've described.

or, if you really need it to be a column in an existing table for whatever reason, it's still possible to push the modification to the database without the heavy overhead. for example [0]:

> The concatenation operator allows a single element to be pushed onto the beginning or end of a one-dimensional array. It also accepts two N-dimensional arrays, or an N-dimensional and an N+1-dimensional array.

0: https://www.postgresql.org/docs/current/arrays.html#ARRAYS-M...

Sure, but that’s not what the person responding to my original comment was suggesting :). They suggested that you serialize entire data structures (bloom filters, lists, sets, etc…) into a relational DB to get redis-like functionality out of it; I chose a list as an example to illustrate why that’s not a great option in many cases.

You’re right that managing lists in RDMSes is easy-ish, if you don’t have too many of them, and they’re not too large. But, like I mentioned in my original comment, redis really shines as a complex data structure server. I wouldn’t want to implement my own cuckoo filter in Postgres!

Doesn't a Postgres table fulfill this? You don't shove the whole list into a single column, you make a separate row per entry. This also works as a map or set.