by menssen 3786 days ago
This is not only obviously true, I think it is also a completely reasonable calculus. They just proved that if the entire Redis cluster goes down they can get it back in 2.5 hours. It's almost certainly a caching layer, so there is no permanent data loss. If they fix the application bootstrap dependency on a Redis connection, and they add monitoring to more easily see in the future when the Redis cluster is the problem, next time that time period will probably be way shorter.

So, a very small risk of an hour or so of downtime sometime in the future which will not cause data loss, or tens of thousands of dollars a month for a failover cluster? I wouldn't replicate it either.
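
To make the "fix the application bootstrap dependency" part concrete, here is a minimal sketch of what that could look like. Everything in it is an assumption for illustration (the `CacheUnavailable` exception, the `NullCache` stand-in, and the `connect_cache` callable are hypothetical names, not anything from GitHub's post): the app logs the failure and starts with a cache that always misses instead of refusing to boot.

```python
class CacheUnavailable(Exception):
    """Raised by the hypothetical connect function when the cache is down."""


class NullCache:
    """Stand-in used when the real cache cannot be reached: every read misses."""

    def get(self, key):
        return None

    def set(self, key, value):
        # Writes are silently dropped; the data must live somewhere durable anyway.
        pass


def bootstrap(connect_cache, log=print):
    """Return a usable cache object, degrading to NullCache on connection failure."""
    try:
        return connect_cache()
    except CacheUnavailable as exc:
        # Logging here is the hook for the "add monitoring" half of the fix.
        log(f"cache unavailable at startup, running without it: {exc}")
        return NullCache()
```

This only works if nothing in the cache is required to serve requests, which is the point the replies below argue about.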

2 comments

>It's almost certainly a caching layer, so there is no permanent data loss.

People who use Redis rarely end up using it solely as a caching layer. It often also takes on the role of an RPC facilitator and pseudo-database. GitHub's post also mentions that their engineering team had to replicate Redis' dataset before they could get the alternative hardware running, which implies that they do need some data in there before the site is operational.

Personally, one of my pet peeves is people throwing mission-critical data in Redis and acting like it's hunky-dory. It happens all the time, and it seems really difficult to get people to stop. There's a reason we have a real ACID-compliant database storing non-disposable data; it's ridiculous to ignore that just because it's easier to stuff it in Redis.

I think it's reasonable to have a dependency on a Redis server, but I don't think it's reasonable to depend on any data in particular being stored in that server. It should be used as a caching/acceleration layer for data that can be easily and automatically regenerated.
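
A rough sketch of what I mean by "easily and automatically regenerated", assuming a plain in-memory `DictCache` as a stand-in for a Redis client and a hypothetical `regenerate` callable that rebuilds the value from the real source of truth:

```python
class DictCache:
    """In-memory stand-in for a Redis client (hypothetical, for illustration)."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value

    def flush(self):
        # Simulates losing the whole cache, e.g. after a cluster failure.
        self._data.clear()


def get_or_regenerate(cache, key, regenerate):
    """Return the cached value, rebuilding and re-caching it on a miss."""
    value = cache.get(key)
    if value is None:
        value = regenerate(key)  # e.g. a query against the real database
        cache.set(key, value)
    return value
```

If the cache is flushed, the next read is slower but still correct, because nothing lived only in the cache.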

Just a thought on something I've learned over a few years: sometimes the most correct way isn't necessarily the best. An example here might be that the Redis DB is being used to store data which is constantly being read. While keeping it in a MySQL instance might be the most correct method, the end result might actually be slower. This is just my naive guess, but the point is that sometimes, given a particular context, the value of taking a hacky/less correct solution becomes great enough to use it.

It's solely about the effort; it's a lot easier to just say redis.set('some_random_name', value) than it is to figure out where something should go in the schema of an RDBMS. If the data needs to persist, it needs to be written to a database that provides good guarantees about data integrity. If someone wants to load the results of a query into Redis, more power to them, but I've come across a lot of people who just stuff things in memory-backed K-V stores with the apparent expectation that nothing could ever happen to that data. Developers have told me "Well, Redis writes to disk on shutdown, right?" and acted like that was good enough for permanent storage of mission-critical data.
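
For comparison, a minimal sketch of the "database first, cache second" approach, assuming SQLite as the durable store and a plain dict standing in for Redis. The `settings` table and the function names are made up for illustration:

```python
import sqlite3


def open_store(path):
    """Open (or create) the durable store; `path` is a file path in real use."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings ("
        "name TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.commit()
    return conn


def save_setting(conn, cache, name, value):
    """Write to the database first; only then update the cache."""
    with conn:  # commits on success, rolls back if the statement raises
        conn.execute(
            "INSERT OR REPLACE INTO settings (name, value) VALUES (?, ?)",
            (name, value),
        )
    cache[name] = value


def load_setting(conn, cache, name):
    """Serve from the cache when possible, falling back to the database."""
    if name in cache:
        return cache[name]
    row = conn.execute(
        "SELECT value FROM settings WHERE name = ?", (name,)
    ).fetchone()
    if row is None:
        return None
    cache[name] = row[0]
    return row[0]
```

It is a few more lines than a bare set call, but wiping the cache loses nothing.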

I have no fundamental opposition to K-V stores or NoSQL databases, but I do think most developers favor them because it's easier to stuff them with data up front. There are big tradeoffs down the road, though, which companies don't seem to understand well, and which they aren't really equipped to handle.

Unfortunately, I don't know much about how people use/abuse Redis-like storage mechanisms. But that bit about NoSQL being picked for upfront convenience is spot on. The biggest reason people have given me when I ask them why they want Mongo is "easier to add columns".

Maybe they do feel it's a reasonable business decision. In that case they shouldn't be surprised if a lot of their users make the equally reasonable business decision to reduce their exposure to GitHub.

A lot of people have started depending on GitHub for more than just stashing source code someplace centrally accessible while they're working on it. If GitHub takes a lax attitude toward uptime, then I suspect people will start looking for alternatives.