| HN Mirror


by btilly 1407 days ago

I think a few more concrete use cases would help.

First, a key limitation that every architect should pay attention to. Redis reaches the limits of what you can do in well-written single-threaded C. One of those limits is that you really, really, *really* don't want to go outside of RAM. Think about what is stored, and be sure not to waste space. (It is surprisingly easy to leak memory.)

Second, another use case. Replication in Redis is cheap. If your data is small and latency is a concern (eg happened to me with an adserver), then you can locate read-only Redis replicas everywhere. The speed of querying off of your local machine is not to be underestimated.

And third, it is worth spending time mastering Redis data structures. For example suppose you have a dynamic leaderboard for an active game. A Redis sorted set will happily let you instantly display any page of that leaderboard, live, with 10 million players and tens of thousands of updates per second. There are a lot of features like that which will be just perfect for the right scenario.
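As a rough illustration of the leaderboard idea, here is a minimal sketch. It assumes a redis-py style client object is passed in; the key name and the helper names (`page_bounds`, `record_score`, `leaderboard_page`) are hypothetical and not from the comment above. `ZINCRBY` and `ZREVRANGE` are real Redis commands, and `ZREVRANGE` costs O(log N + M) for a page of M entries.

```python
def page_bounds(page, page_size):
    """Translate a zero-based page number into inclusive rank bounds."""
    start = page * page_size
    end = start + page_size - 1  # ZREVRANGE treats the end index as inclusive
    return start, end


def record_score(client, key, player, points):
    """Add points to a player's score; ZINCRBY creates the member if missing."""
    return client.zincrby(key, points, player)


def leaderboard_page(client, key, page, page_size=25):
    """Fetch one page of the leaderboard, highest score first."""
    start, end = page_bounds(page, page_size)
    return client.zrevrange(key, start, end, withscores=True)
```

With redis-py this would be called roughly as `leaderboard_page(redis.Redis(), "game:leaderboard", page=3)`. Newer Redis versions deprecate `ZREVRANGE` in favor of `ZRANGE` with the `REV` option, so treat the exact command choice as an assumption.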

2 comments

koolba 1407 days ago

> One of those limits is that you really, really, really don't want to go outside of RAM. Think about what is stored, and be sure not to waste space. (It is surprisingly easy to leak memory.)

You can have massive amounts of RAM these days. You’re sooner to hit big-O limits from bad architectural decisions than run out of memory. If you do get to that point you likely have enough value in your usage to justify scaling out further and sharding.

> And third, it is worth spending time mastering Redis data structures.

Bingo. The true secret to properly using Redis: understanding the big-O complexity of each operation (…and ensuring that none of your interactions are more than logarithmic).
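One way to act on this, sketched under assumptions: keep a small lookup of the commands your application issues and flag the linear ones. The table below covers only a handful of commands, with complexities as given in the Redis command reference; the `linear_commands` helper is hypothetical.

```python
# N is the size of the collection (or of the keyspace for KEYS);
# M is the number of elements returned.
COMPLEXITY = {
    "GET": "O(1)",
    "SET": "O(1)",
    "HGET": "O(1)",
    "LPUSH": "O(1)",
    "ZADD": "O(log N)",
    "ZRANK": "O(log N)",
    "ZRANGE": "O(log N + M)",
    "HGETALL": "O(N)",
    "SMEMBERS": "O(N)",
    "KEYS": "O(N)",
}


def linear_commands(commands):
    """Return the commands whose cost grows linearly with collection size.

    Commands missing from the table are not flagged, so extend the table
    before relying on the result.
    """
    return [c for c in commands if COMPLEXITY.get(c.upper()) == "O(N)"]


flagged = linear_commands(["GET", "keys", "ZADD", "SMEMBERS"])
```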

btilly 1407 days ago

> You can have massive amounts of RAM these days. You’re sooner to hit big-O limits from bad architectural decisions than run out of memory. If you do get to that point you likely have enough value in your usage to justify scaling out further and sharding.

Absolute disagreement.

It is very easy to accidentally leak a few hundred MB per week in a busy Redis system. The code will look and work fine...at first. It is correspondingly hard to track down and clean up the leak a few months later. (Particularly if there are multiple such leaks to track down.) Yes, you can go for years just buying larger and larger EC2 instances. But that will also come with a shocking price tag.

I know of a number of organizations that this happened to. And pretty much every bad Redis story I hear about had this as a root cause. That is why I brought it up as an important consideration.

jasonwatkinspdx 1407 days ago

Yes, this matches my experience.

Redis excels as a memcached alternative with some useful operations. Where people get into trouble with redis is treating it as a persistent data store, when despite its ability to replicate and persist, redis has some constraints you need to work within. At best think of redis as something that can hold a materialized view, but where it can become corrupted at any random time, so you'll need the ability to rematerialize it from something else. And second, you absolutely have to be conscious of how close you are to RAM limits.

renonce 1407 days ago

Redis is production-ready and it has a lot of features to help you track down problems with either memory or CPU usage. For example: `redis-cli --bigkeys` will help you find the very large keys. For smaller keys that occur too often, sampling a few hundred keys should be sufficient to help you find what type of keys are taking more space than necessary.
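The sampling approach could look something like the sketch below. It assumes a redis-py style client and keys named with a `prefix:rest` convention; both helper names are hypothetical. `RANDOMKEY` and `MEMORY USAGE` are real commands, exposed in redis-py as `randomkey()` and `memory_usage()`.

```python
from collections import Counter


def sample_sizes(client, samples=300):
    """Yield (key, size_in_bytes) for randomly sampled keys."""
    for _ in range(samples):
        key = client.randomkey()
        if key is None:  # the database is empty
            return
        size = client.memory_usage(key)
        if size is not None:  # None means the key vanished in the meantime
            yield key, size


def bytes_by_prefix(sized_keys, sep=":"):
    """Total the sampled sizes per key prefix, largest first."""
    totals = Counter()
    for key, size in sized_keys:
        if isinstance(key, bytes):  # redis-py returns bytes by default
            key = key.decode("utf-8", "replace")
        totals[key.split(sep, 1)[0]] += size
    return totals.most_common()
```

Usage would be roughly `bytes_by_prefix(sample_sizes(client))`. Sampling is with replacement, so the totals are only a relative indication of which key families dominate, not an exact accounting.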

Once you get the Redis database designed well, there are a lot of things you can do before hitting the limit where you can't install any more RAM in a new machine. For example, there are no more than a billion .com domains out there. Say a single record takes 100 bytes on average, consisting of the domain name and a glue record pointing to the IP of its authoritative DNS server. Then it takes just 100GB of memory to store enough information to handle all queries to .com domains in the world. It's not so hard to obtain a machine with 768GB memory these days, and 2TB machines are not uncommon.
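The arithmetic behind that estimate, spelled out. The inputs are the figures from the comment; the calculation ignores Redis's per-key overhead, so the real footprint would likely be higher.

```python
domains = 1_000_000_000      # upper bound on .com domains, per the comment
bytes_per_record = 100       # assumed average record size, per the comment

total_bytes = domains * bytes_per_record
total_gb = total_bytes / 10**9    # decimal gigabytes
total_gib = total_bytes / 2**30   # binary gibibytes, as RAM is usually sized

# Share of a 768 GB machine the raw records would occupy
share_of_768gb = total_gb / 768

print(f"{total_gb:.0f} GB ({total_gib:.1f} GiB), {share_of_768gb:.0%} of 768 GB")
```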

And if you worry about the price tag - don't use EC2. You can rent a 1TB RAM dedicated server at https://www.hetzner.com/dedicated-rootserver/ax161/configura... for $600 per month. At Scaleway you can rent it for $1000 per month: https://www.scaleway.com/en/pricing/?tags=baremetal,availabl.... AWS is notoriously hard to make cost-effective.

jbboehr 1407 days ago

You can also "leak" rows in a traditional RDBMS or even a filesystem. Why is this particularly notable for Redis?

sidlls 1407 days ago

Redis starts to have issues at high scale, even on sophisticated hardware, that can be quite difficult to debug without a lot of additional effort and storage. It’s not just memory, but odd behavior (e.g. randomly dropped connections) with a lot of connected clients, or hot keys/nodes in a cluster configuration, etc.

These issues can exist in any system, but in my experience it’s especially tough (relatively) to identify and diagnose them with Redis. Once you add lua script usage it can get even worse.

btilly 1407 days ago

A traditional RDBMS or filesystem is designed for high throughput and concurrency, even if some tasks are blocked on data. Additionally both have options to partition steadily growing things. If needed, old partitions can be moved to tape backup while the server continues running.

Redis is a single threaded program acting against RAM whose philosophy is that it does things fast then moves to the next job. If it needs to access memory that got paged to disk, the whole server stops and waits to get it. Nobody can do anything.

Because Redis doesn't have to deal with locking and concurrency, it can run much faster on the same resources. But when concurrency is required, it is stuck because it doesn't have it.

itake 1407 days ago

> You can have massive amounts of RAM these days.

True, but I am finding that balancing CPU and RAM can be tricky. Slapping 128GB on a 1-core machine means you quickly have CPU limitations.

tomnipotent 1407 days ago

Redis is single-threaded and will have no problem saturating a 10G NIC with a single socket.

itake 1407 days ago

My concern is how long it takes a CPU to scan through all of that memory.

tomnipotent 1407 days ago

What "scanning"? That's not how memory access works in a K/V store, and Redis does very little work that demands much of the CPU.

mnutt 1407 days ago

There are workloads that will saturate a redis instance's CPU: using it as an LRU cache, eventually you will hit the configured memory limits and adding new keys will require finding old keys to delete. Eventually it may also require redis to do memory defragmentation which can be fairly intensive.

googletron 1407 days ago

> understanding the big-O complexity of each operation (…and ensuring that none of your interactions are more than logarithmic).

This is a good idea, maybe a prompt for another post.

LewisVerstappen 1407 days ago

> Replication in Redis is cheap. If your data is small and latency is a concern (eg happened to me with an adserver), then you can locate read-only Redis replicas everywhere. The speed of querying off of your local machine is not to be underestimated.

Do you face any consistency issues with doing this?

wokwokwok 1407 days ago

It’s not a daft question, because redis clusters do have consistency issues (https://redis.io/docs/manual/scaling/#redis-cluster-consiste...), but for simple replication it is not really an issue.

btilly 1407 days ago

No. Replication time was measured in hundredths of a second, and Redis operations are atomic. So all queries got a consistent view of the data, and the lag to update was very reasonable.

YetAnotherNick 1407 days ago

It depends on the definition of consistency, but it is not strongly consistent in theoretical terms[0]. But the ordering of updates is guaranteed to be the same, so if the master is guaranteed to be internally consistent, so is the replica. And that property is enough for almost all use cases, except maybe transactions.

[0]: https://redis.io/docs/manual/scaling/#:~:text=Redis%20Cluste....
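A toy model of the ordering argument, not Redis's actual replication protocol: if a replica applies the same commands in the same order as the master, then a lagging replica is stale but always sits in a state the master itself passed through. The two-command vocabulary and all names here are hypothetical.

```python
def apply(state, command):
    """Apply one write command to a dict standing in for the keyspace."""
    op, key, value = command
    if op == "SET":
        state[key] = value
    elif op == "INCRBY":
        state[key] = state.get(key, 0) + value
    return state


def replay(log, upto):
    """Rebuild the state after the first `upto` commands of the log."""
    state = {}
    for command in log[:upto]:
        apply(state, command)
    return state


log = [
    ("SET", "stock", 10),
    ("INCRBY", "stock", -3),
    ("SET", "price", 5),
    ("INCRBY", "stock", -2),
]

master = replay(log, len(log))
lagging_replica = replay(log, 2)  # has only received the first two writes
master_history = [replay(log, n) for n in range(len(log) + 1)]
```

Here `lagging_replica` differs from `master`, yet it appears in `master_history`, which is the sense in which the replica stays internally consistent while not being strongly consistent.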

adra 1407 days ago

During partitions you may as well throw the playbook away. You could have minority writes on both sides of the cluster and a big nada to reconcile the two when they're mended. Redis is a great system for what it's built for and for the trade-offs that it makes to keep itself fast and lean. Redis is not CP and it will probably never care to support it. If data resiliency and correctness are important to you, Redis alone isn't sufficient. Several years ago, we tried sentinels mostly to avoid large costly rebuilds when an instance went down, and though it usually worked just fine, we certainly had single network disruptions large enough to throw off the cluster badly enough to require a manual rebuild.

btilly 1407 days ago

I don't trust Redis clustering to actually work. I only trust the single threaded, single server. Potentially with lots of replicas.

See the following for why I don't trust their various attempts to scale to a truly distributed system.

https://aphyr.com/posts/283-jepsen-redis

https://aphyr.com/posts/307-jepsen-redis-redux

https://jepsen.io/analyses/redis-raft-1b3fbf6

anonymousDan 1407 days ago

So in other words, potentially yes since there is some lag :)?

btilly 1407 days ago

For that application, there really wasn't. The results of the read were not used for writes, and the latency from when information was published to when it was available was on par with the time a request to the master would have taken. The time from data published to available was faster than the time to switch tabs in a browser and manually check.

But your requirements will depend on the application. Financial transactions need explicit locking logic and atomic operations, such as those provided by SELECT ... FOR UPDATE in SQL. So another application could have more demanding requirements. Which is why, in addition to answering whether I encountered problems, I gave the actual performance characteristics. So that anyone planning their application can know whether this is a good enough solution for them.