Hacker News
by FanaHOVA 2229 days ago
> Is it "slow" enough to matter? Probably not until you get to a medium scale.

Phew, I'm glad GitHub and Shopify's scale is still small.

Pedantry aside, we've reached a point in our industry where we can do a lot with horizontal scalability.

I mean, every programmer who funnels through university understands map reduce, and that helps on multi-core threading up to system job running.

But there is a limit, usually in the persistence and caching layers. What you'll find is that those "large scale deployments" are going to have a -lot- of internal cache systems and I can pretty much guarantee that the services running those caches and persistence will not be written in ruby.

You can make anything* scale, but how many CPU cycles you need to burn to get the functionality you want is a matter for the finance department.

If you're running a loss-making business, you can bet that those CPU cycles will begin to cost more than developer velocity is worth, because servers are an eternal and ongoing cost.

On the flip side, if you make more money than the infra+devs cost, then nobody is going to hound you for wasting 2x or 3x the cost, because "it's the cost of doing business" is easier to justify when you're cash positive.

> services running those caches and persistence will not be written in ruby

So what? What's wrong with using software like Redis for caching, for a very small (but important) part of your business? I bet Java apps use Redis as well, and Redis isn't written in Java. So?

Is this an honest question? I honestly can't tell, and I am not saying it to show disrespect -- just wondering if you are being sarcastic.

Erlang/Elixir have built-in caches that respond in a matter of 30-150 nanoseconds.

Why would you need an external service for that? It's adding complexity -- and likely hosting costs.

Isn't it self-evident to you that adding Redis as a caching layer to your stack is a band-aid over a deeper problem?

Local caches and local node caches are both very useful (that's why Redis 6 introduces this: https://redis.io/topics/client-side-caching). But from what I've seen in the past, the major speedup from using Redis in such a context comes when you want a shared, very fast view that is global in nature. A simple but good example is the leaderboard problem in multiplayer games that have millions of users (Facebook games and such). Even if you have a local cache, and even if you have an additional store where you record the high score of each user, you need a global, very fast to update view of all the sorted scores to tell users their rank, the users nearby, and the rank of their friends. There are a number of problems like that which require different data structures and a global view. The problem is that using Redis with the Memcached mindset will always severely limit the potential benefits.

Yes. Redis has very valid use cases.

You're quite right: people using it as a mere cache don't get most of its benefits.
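
For anyone who wants to see the shape of the leaderboard problem mentioned above, here is a minimal Python sketch. It is an in-process stand-in, not Redis: the `Leaderboard` class and its method names are made up for illustration, and a plain sorted list costs O(n) per update where a real shared store would do better. Redis covers the same operations with sorted-set commands such as ZADD and ZREVRANK.

```python
import bisect


class Leaderboard:
    """Hypothetical in-process stand-in for a shared, sorted view of scores."""

    def __init__(self):
        self._scores = {}   # user -> current score
        self._sorted = []   # ascending list of (score, user) pairs

    def set_score(self, user, score):
        # Remove the old entry, if any, so each user appears exactly once.
        old = self._scores.get(user)
        if old is not None:
            idx = bisect.bisect_left(self._sorted, (old, user))
            del self._sorted[idx]
        self._scores[user] = score
        bisect.insort(self._sorted, (score, user))

    def rank(self, user):
        # Rank 1 is the highest score; ties are ordered by user name.
        score = self._scores[user]
        idx = bisect.bisect_left(self._sorted, (score, user))
        return len(self._sorted) - idx

    def nearby(self, user, radius=1):
        # Users within `radius` positions of `user`, best first.
        score = self._scores[user]
        idx = bisect.bisect_left(self._sorted, (score, user))
        lo = max(0, idx - radius)
        hi = min(len(self._sorted), idx + radius + 1)
        return [u for _, u in reversed(self._sorted[lo:hi])]


board = Leaderboard()
for name, points in [("ann", 120), ("bob", 300), ("cy", 210), ("dee", 90)]:
    board.set_score(name, points)
board.set_score("ann", 250)  # ann overtakes cy

print(board.rank("ann"), board.nearby("ann"))
```

The point of the sketch is that rank and neighbours only make sense against one global ordering. If every app node kept its own copy of this structure, each node would report a different rank unless all of them saw every score update.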

> Why would you need an external service for that? It's adding complexity -- and likely hosting costs.

We're also dependent on MySQL, are you gonna implement that in Elixir as well? Redis is a great piece of software, and it's a real SHARED cache, so it works for sessions, for example, or other small pieces of state you sometimes want to remember. What you described won't work for that.

The two are not equal at all. Redis you can definitely do without. A database you can't skip in most apps.

Sessions work quite fine in Elixir's local cache as well. :)

> Redis you can definitely do without.

This seems to suggest otherwise: https://hex.pm/packages/redix. Why is the Redis client so popular in the Elixir world? For such a small community, 3+ million downloads is huge.

Interesting but not surprising; I'm regularly pleasantly surprised by the Erlang/Elixir ecosystem :). Can you clarify what you mean when you say "Erlang/Elixir have built-in caches"?

What's the name of the concept and where in the typical stack does it fit? Is this https://blog.appsignal.com/2019/11/12/caching-with-elixir-an... or something else? Care to share a few links to docs/articles? Thanks.

Yes, ETS is the usual go-to, but there's also `:persistent_term`[0] for caches that change very rarely (or never).

There are libraries that combine Erlang/Elixir's caching mechanisms in an attempt to achieve the best performance for most scenarios[1] as well.

Technically, ETS is not perfect because it copies data from its mutable cache to the process that requests it. But it's still orders of magnitude faster than outsourcing that to Redis.

[0] http://erlang.org/doc/man/persistent_term.html

[1] https://github.com/gyson/ane

Thanks!

I read in http://erlang.org/doc/man/ets.html that "Each table is created by a process. When the process terminates, the table is automatically destroyed. Every table has access rights set at creation."

-> So, caching is local to each worker node/process? Do nodes/processes communicate with each other to synchronize their respective caches, or is it local by design?

If local by design, that wouldn't exactly cover the use case of "Redis in front of an army of $other_language workers", correct? And I guess it's accepted as costing slightly more cache misses, but with the benefit of more decentralization / node independence / resiliency, right?
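
To put rough numbers on the trade-off raised in the question above, here is a minimal Python sketch that counts backend loads for per-worker caches versus one shared cache. Everything in it is an assumption for illustration: the `count_misses` helper, round-robin routing, three keys, four workers. It ignores eviction, invalidation, and the network hop a shared cache adds, and it models neither ETS nor Redis.

```python
from itertools import cycle


def count_misses(requests, caches):
    """Serve each request from the next cache in round-robin order.

    Returns how many requests had to go to the (simulated) backend.
    """
    misses = 0
    for key, cache in zip(requests, cycle(caches)):
        if key not in cache:
            cache[key] = f"value-for-{key}"  # stand-in for a slow backend fetch
            misses += 1
    return misses


keys = ["a", "b", "c"]
requests = keys * 8   # 24 requests in total
workers = 4

# Each worker owns a private cache, so each one has to warm up on its own.
local_misses = count_misses(requests, [{} for _ in range(workers)])

# All workers point at the same cache object.
shared = {}
shared_misses = count_misses(requests, [shared] * workers)

print(local_misses, shared_misses)
```

With these inputs the private caches miss once per key per worker (12 loads) while the shared cache misses once per key (3 loads). The extra misses are bounded by the number of workers, which is the "slightly more cache misses" cost; what the sketch leaves out is that every shared-cache hit pays a network round trip that a local hit does not.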

My point is that for all the talk of how performance doesn’t matter and that we can scale Ruby, the real heavy lifting is not handled by Ruby.

It’s not a “problem”, but if you’re going to talk about large companies scaling something you need to understand that they’re likely scaling it in spite of limitations.

Some layers are inherently hard to scale (latency on a network-accessible cache, throughput in persistence layers such as databases), so a lot of application layers lean on those components heavily, and those components are exclusively written in relatively “faster” languages.

> I mean, every programmer who funnels through university understands map reduce, and that helps on multi-core threading up to system job running

I wish this were true. There is a high degree of variability between skillsets from different American universities, even in my state of Washington.

Other than that, I agree wholeheartedly.

> Phew, I'm glad GitHub and Shopify's scale is still small.

Phew, good thing they can afford to burn a lot of cash for hosting.

I guess that's so, though it's interesting that in the early days GitHub didn't:

>...a web startup like ours doesn’t need any outside money to succeed. I know this because we haven’t taken a single dime from investors. We bootstrapped the company on a few thousand dollars and became profitable the day we opened to the public and started charging for subscriptions.

Tom Preston-Werner in 2008. I guess the thing is not how much the hosting costs in absolute terms but how much it costs relative to what customers are prepared to pay for the service.

Once a business grows to a certain size and beyond, the shareholders don't want to hear anything about tech rewrites. They are happy the product is working, they are not alarmed by slow response times (which GitHub has plenty of on my gigabit connection that streams 4k@30fps without any lag) unless there's big customer churn, and hosting costs, even if big-ish, are to them just the cost of doing business.

My point is, yes, you are right -- but there's a lot of conservatism involved once the business gets to a certain size. Nobody cares about improving anything from then on (which usually leads the now-giant to start steadily losing relevance; GitHub is quite far from that but we all remember Microsoft, right?).

I recall Shopify hired a dev who works on a project, FaastRuby, written in the Crystal language, for their web app.

I guarantee the tradeoffs matter to them.

I’ve seen videos on Twitter of the Shopify CEO saying that Ruby’s performance literally doesn’t matter to them. That it’s easy enough to scale it horizontally and the benefits outweigh the cost.

Probably helps that Tobias also codes, not to mention contributed to Rails quite a bit.

Shopify is actually replacing slow Ruby code with Go. And yes, they acknowledge that Ruby is too slow. Also, Shopify is a big monolith, so that does not help.

A monolith doesn't necessarily mean slower, even with Ruby. There are lots of opportunities to run less code on each request, do some work with the db, and split off measured bottlenecks into services in a faster language. It's often good to build things quickly, find fit, and then carefully measure before you introduce calls over the network.

A monolith is a problem, which they are also breaking down: when you want to make something faster, it's easier to target the service that does it than the giant app.

Edit: getting downvoted by people that don't work at Shopify.

I work at Shopify. I can confirm that what you are saying is totally false.

Lol, none of these statements are true.