by derefr 2247 days ago

> monstrous, complicated, stateful streams feature

It's two data structures (which were already in Redis for other reasons!), and an automatic sequential identifier. Everything else that's "stateful" about it is client-side state; the server is still just a data-structure server. A Redis stream is basically just a Redis sorted set that stays coherent when some clients page through it while other clients insert into the middle of it.
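To make the "client-side state" point concrete, here is a toy in-memory model in Python. It is a sketch, not how Redis implements streams: `ToyStream`, `add`, and `read_after` are made-up names, and the `(ms, seq)` tuple only loosely mirrors the `<ms>-<seq>` IDs Redis generates. The only thing it is meant to show is that the server holds an ordered log with auto-assigned IDs, and the reader's position is just the last ID that the reader itself remembers.

```python
import bisect
import time


class ToyStream:
    """Toy append-only log with auto-generated, strictly increasing IDs."""

    def __init__(self):
        self._ids = []      # sorted list of (ms, seq) tuples
        self._entries = []  # field dicts, parallel to self._ids

    def add(self, fields, now_ms=None):
        """Append an entry and return its ID (assumed scheme: ms plus counter)."""
        ms = int(time.time() * 1000) if now_ms is None else now_ms
        seq = 0
        if self._ids:
            last_ms, last_seq = self._ids[-1]
            if ms <= last_ms:
                # Same millisecond (or clock went backwards): bump the counter
                # so IDs never repeat and never decrease.
                ms, seq = last_ms, last_seq + 1
        entry_id = (ms, seq)
        self._ids.append(entry_id)
        self._entries.append(dict(fields))
        return entry_id

    def read_after(self, last_id, count):
        """Return up to `count` entries with an ID strictly greater than `last_id`."""
        start = bisect.bisect_right(self._ids, last_id)
        end = start + count
        return list(zip(self._ids[start:end], self._entries[start:end]))


# The cursor lives in the client; the "server" object keeps no per-reader state.
stream = ToyStream()
for n in range(5):
    stream.add({"n": n}, now_ms=1000)

cursor = (0, 0)
page1 = stream.read_after(cursor, 2)
cursor = page1[-1][0]               # client remembers the last ID it saw

stream.add({"n": 5}, now_ms=1000)   # another writer appends between pages
page2 = stream.read_after(cursor, 10)
```

Because each page is requested as "everything after ID X" rather than "offset N", a write that lands between two reads cannot cause the reader to skip or repeat entries.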

Also, the code is in one file (https://github.com/antirez/redis/blob/unstable/src/t_stream.... ); that file is ~3KLOC. It's just another Redis Module, isolated into its own set of functions with no impact on the codebase as a whole. It just happens to be so widely applicable, to so many use-cases that people were already using Redis for (through Sidekiq/Resque/etc.), that it makes sense to ship this particular module with Redis itself.

Would you get upset about bloat if Postgres upstreamed a highly-popular extension? It already has nine or ten installed by default, and a few more sitting in contrib/. But, of course, even upstreamed, none of those extensions is enabled by default, so none of them adds runtime overhead to your DB; you have to ask for them, just as if you were installing a third-party extension. Same here: if you don't use the Streams module, there's no overhead to its existence in the Redis codebase.

> do people really expose Redis on the internet??

Cloud DBaaS providers expose Redis instances "over the Internet", in the sense that they're in the same AZ but not within your VPC. To the extent that you can wireshark a data-center's virtual SDN, they need to encrypt this traffic.

Even PaaS providers do things this way, since they usually lean on third-party DBaaS providers. E.g. all of the Redis services you can attach to a Heroku app are consumed "over the Internet."

If you're using Redis through an IaaS provider's offering (e.g. AWS ElastiCache, Google Cloud Memorystore) then you get the benefit of them being able to spawn an instance "outside" your project/VPC (i.e. having it be managed by them), but have it nevertheless routed to an IP address inside your VPC. That might be enough security for you, if you don't have any legal requirements saying otherwise. For some people, it's not, and they need TLS on top anyway.

> cluster stuff

Have you looked at how it's done? It's just ease-of-use tooling around the obvious thing to do to scale Redis: partitioning the keyspace onto distinct Redis instances, and then routing requests to partitions based on the key. It's not like Redis has suddenly become a multi-master consensus system like Zookeeper; the router logic isn't even in the server codebase!
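A rough Python sketch of that routing step, for anyone who hasn't looked. The slot function follows the Redis Cluster specification as I understand it (CRC16/XMODEM of the key, modulo 16384, hashing only the `{...}` hash tag when a non-empty one is present). The node table, its addresses, and `node_for` are hypothetical; a real client learns the slot map from the cluster instead of hard-coding it.

```python
HASH_SLOTS = 16384

# Hypothetical three-node layout: (first_slot, last_slot, address).
NODES = [
    (0, 5460, "10.0.0.1:6379"),
    (5461, 10922, "10.0.0.2:6379"),
    (10923, 16383, "10.0.0.3:6379"),
]


def crc16_xmodem(data: bytes) -> int:
    """CRC16 with polynomial 0x1021 and initial value 0 (the XMODEM variant)."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ 0x1021) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def key_slot(key: bytes) -> int:
    """Map a key to a hash slot, honoring a non-empty {hash tag} if present."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]  # hash only the tag contents
    return crc16_xmodem(key) % HASH_SLOTS


def node_for(key: bytes) -> str:
    """Pick the node whose slot range contains the key's slot."""
    slot = key_slot(key)
    for first, last, address in NODES:
        if first <= slot <= last:
            return address
    raise LookupError(f"no node owns slot {slot}")
```

Keys that share a hash tag, such as `{user1000}.following` and `{user1000}.followers`, land in the same slot and therefore on the same instance, which is what lets multi-key operations keep working on a partitioned keyspace.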