That's not on a per-host basis. Shopify's design is, quite fortunately, one that partitions really well, as each store is completely independent of each other.
Each store can be assigned to one pod, each pod can have as many hosts as it takes to optimize the use of a database instance, and then you can add more pods as the need arises.
Edit: to be clear, that's not to say Rails can't scale. It can. It's just that it doesn't need to: you can scale anything with enough partitioning.
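To make the store-to-pod idea concrete, here is a minimal sketch. The `PodRouter` class, the pod names and the "fewest stores wins" rule are all hypothetical; nothing here reflects how Shopify actually assigns stores.

```ruby
# Hypothetical sketch: every store is pinned to exactly one pod.
class PodRouter
  def initialize(pod_names)
    @stores_by_pod = pod_names.to_h { |name| [name, []] }
    @pod_for_store = {}
  end

  # Add capacity without touching existing assignments.
  def add_pod(name)
    @stores_by_pod[name] ||= []
  end

  # Pin a new store to the pod that currently holds the fewest stores.
  # A store that is already assigned keeps its pod.
  def assign(store_id)
    @pod_for_store[store_id] ||= begin
      pod, stores = @stores_by_pod.min_by { |_name, s| s.size }
      stores << store_id
      pod
    end
  end

  def pod_for(store_id)
    @pod_for_store.fetch(store_id)
  end
end

router = PodRouter.new(%w[pod-a pod-b])
(1..4).each { |id| router.assign(id) }
router.add_pod("pod-c")   # the need arises: add a pod
router.assign(5)          # new stores land on the emptiest pod
puts router.pod_for(1)
puts router.pod_for(5)
```

Because stores never talk to each other, adding a pod only affects where new stores go; existing ones stay put.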
Well, Shopify is investing in optimizing Ruby, so they believe Ruby was the component with the most opportunity for improvement. And the results show there were indeed lots of places where improvements could be made.
That's like saying the leading F1 team is investing in optimizing side mirrors, therefore they believe side mirrors are the component with the most opportunity for improvement.
The fact is that performance oriented organizations optimize everything unless they have math telling them it isn't worth optimizing.
> The fact is that performance oriented organizations optimize everything unless they have math telling them it isn't worth optimizing.
Most companies don’t have unlimited budgets. Performance-oriented organizations profile and then spend money where profiling tells them to. Shopify isn’t hiring people to contribute to MySQL or Redis internals. They hired a full team to work on Ruby internals, not just creating YJIT, but also on CRuby’s memory layout, hiring the lead on TruffleRuby, and funding academic programming language research on Ruby.
No company has an infinite budget to “optimize everything”. It is clear where internal performance testing pointed Shopify (at Ruby) and with double digit gains being extracted year after year, their profiling didn’t lie. And other Ruby on Rails shops are seeing similar double digit performance wins, not on fake benchmarks, but on actual page load times and traffic that can be handled by a server.
Sort of. Twice as many rails hosts means more DB connections which generally means more load/memory on the DB or more load/memory on the external connection pooler.
It's only a bit of incremental load, but it's easy to overlook how many other systems need to run to make Rails scale.
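A back-of-the-envelope version of that point, assuming each app thread holds one database connection (a common pool sizing, but still an assumption) and using made-up host, process and thread counts:

```ruby
# Hypothetical numbers; the point is only that connections scale with hosts.
def db_connections(hosts:, processes_per_host:, threads_per_process:)
  hosts * processes_per_host * threads_per_process
end

before = db_connections(hosts: 5,  processes_per_host: 8, threads_per_process: 5)
after  = db_connections(hosts: 10, processes_per_host: 8, threads_per_process: 5)

puts "5 hosts:  #{before} connections"
puts "10 hosts: #{after} connections"
```

Doubling the hosts doubles the connections the database (or the pooler in front of it) has to hold open, even if the query volume stays the same.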
No, in his example he's talking about 50 stores over 10 machines vs 50 stores over 5 machines. Both would require the same DB count, but the second would save on server costs for stores.
I'm sorry, but this is one of the silliest nitpicks I've ever seen on this site. Of course 1 million rps and 3TB/s isn't coming from a single host. 3 TB/s is far beyond the throughput of any network I've ever seen, considered or thought of, short of maybe a data center (and I don't work in that domain). 1.3 million requests per second is far beyond the capacity of pretty much any hardware available right now.
Async Rust has not been around long enough for lots of examples. Go has plenty of them. Rust is powering big parts of CloudFlare along with Go, which is way bigger scale. AWS also has big parts done in Rust.
> And how many instances would be required if Go/Rust would have been used?
Zero. Because Shopify would have waited until Rust came out in 2015, instead of launching in 2006, and they would never have gotten off the ground and been another failed techbro startup that instead of getting shit done, bikeshedded over languages.
PHP and Ruby apps have generated far more revenues than all the Rust and Golang code combined.
Rails was a great choice for Shopify, GitHub and Stripe, no doubt. No one is questioning that.
GP’s sarcastic “rails doesn’t scale” implies that it would also be a great choice for people starting afresh in 2023. The reply asks for a comparison with other languages popular in 2023, especially ones that are known for being more performant (lower memory and CPU consumption, lower latency).
And that’s when you’re dragging the conversation back to 2006. It’s not 2006 anymore.
Rails was initially too slow for Github, so they forked it, didn't use "Rails" for a while [1], lost literal engineering years to upgrading it (same for Shopify), and now Github has an engineering department dedicated to working off the Rails master branch directly, which is huge engineering overhead and a problem and solution that shouldn't exist. Github co-founder Tom lamented using Rails at Github and has stopped using it [2].
If you're Github or Shopify and can throw (waste?) engineering years at solving framework-specific ecosystem nightmares, and have the clout and runway to hire core Ruby and Rails maintainers, then you're probably in a highly unique situation and could use any framework you want.
The rest of us don't see Rails as a great choice for Github. Doubt and questioning.
Yes, not ideal. But we can’t know what would have happened if they had chosen a different language in 2007. Did they have the option of an efficient, well supported language that continues to be used in 2023? Maybe Java, although Java languished for years before development picked up again. C# is also a candidate.
But one thing we can’t measure - how many candidates chose to join Shopify and GitHub because they were keen to work on Ruby? Java had a reputation for being boring, while Ruby was fun and exciting. Their success was possibly tied to this, but we’ll never know for sure.
In 2023 the calculus of what language to choose is different. But these companies are just glad they succeeded while others didn’t.
You're implying that those languages made actually building products easier. I think we know by now they didn't. Go is a language which preaches building your own boilerplate rather than reusing it. It produces very little of the economies of scale required to build compelling products. It has other advantages, and the single binary thing captured a niche in infrastructure software, but that's it outside of Google. Rust is still young, jury is out, but it's already considered a big and hard to learn language, and that's not going to change soon. It'll eventually find a niche.
Ruby is still a great way to start up. Consider that in 2006-2008, its deployment story was horrible. Since then, the Ruby ecosystem bootstrapped lockfiles, the 12 factor app manifesto, and a lot of the conventions we all take for granted nowadays. And while there are certainly enough arguments to bikeshed on, it's still a rock solid ecosystem.
Google has a lot of revenue. Pinterest, Hashicorp, Uber, Twitch, Dropbox, etc. all have a good amount of golang and collectively have a lot of revenue. It might need a few more years to tip the scale, but it's closer than suggested here.
> Or they could have used something available in 2006, like C++, Java, .NET/C#, OCaml, Haskell, D.
Going for .NET/C# would have likely limited anyone to using mostly Windows Server for their infrastructure. Not that it's a bad OS, but .NET Core was released only in 2016 and although Mono came out in 2004, sadly it never got the love it deserved and was rather unreliable (otherwise we would have seen way more cross platform development before .NET Core). Oh, also, turns out that LINQ (which is pretty cool) was only released in 2007, though that still puts them a bit ahead of Java I guess, although I can't comment on when it landed in Mono.
Going with Java would have meant using something like Java 6, whereas the first truly decent version (in my eyes) was Java 8, which came out in 2014. Of course, the older language version and runtime wouldn't be a huge issue, however projects like Spring Boot only came out in 2014 and before then most people would either use Spring, Java EE (now Jakarta EE) or a similar framework from back then. I've worked with both and it wasn't pleasant - essentially the XML configuration hell with layers of indirection that people lament.
I mean, either would have probably been doable, but it's not like other stacks are without fault (even the ones I cannot really comment on).
C# in 2006 was a joke, probably worse than Rails in performance. This was the webforms era and old EF - meant for enterprise customers with a couple of hundred active users max... ASP.NET being a competitive/performant framework is a very recent development (since core basically which became usable past 2.0)
Haskell, OCaml and D are niche languages, probably aren't mature enough now to use for a production system that needs to scale (in terms of org growth and building complex systems).
Java web frameworks were also terrible in 2006 (this is the Java era that gave Java its reputation) and the only thing worse for productivity I can think of is C++ hahaha ...
All of them were faster and used less resources than a very slow interpreted language, by having JIT and AOT compilers, state of the art GC and great IDE offerings, even the niche ones had better tooling (Leksah and Merlin, versus nothing).
That was then and this is now. If you are building under endless VC money, go ahead and burn it. Most of us however do not have endless stacks of money to burn running our code.
If you think time-to-market, and overall cost are going to be improved by building your vanilla website in Rust or Go, vs Rails, then I think you may be surprised
Yeah it’s nearly the main benefit of Ruby/Rails that you can spin up an mvp of your company in like a week, and have a decent feature set within a couple months.
The trap is when you start growing and it is hard to change. Because the features that took 1-2 months in RoR might take 3-4 months (or more!) to port to another language, and do you really want to stop your working business when it isn’t a problem?
Because Rails performs totally fine at small-mid startup scale. It’s only when you start getting a couple years old with lots of users that it starts to bite you. But at that point you already have gotten further than 90% of startups ever even make it. And at that point, honestly there are solutions for that too, like gradually pulling the poor-performing bits out into faster languages.
Writing this as someone who works for a startup that uses RoR, and I’ve seen it blow up over several years. I curse RoR daily because it pisses me off, but I don’t think this company would’ve gotten this far if it didn’t have the RoR speed at the beginning.
So are you better off starting your company on Go/Rust/Java? Maybe. But if getting to market fast will help you win, it’s hard to beat RoR.
> Most of us however do not have endless stacks of money to burn running our code.
This is such an absurd take. Do you really think startups lose runway because of the runtime performance of their code, and not failing to achieve PMF, overhiring, or spending too much on stupid techbro bullshit?
These two points are entirely unrelated. Scalability in that meme is not considering horizontal scalability, which approaches infinity for literally any language/framework. It only makes sense in the context of vertical scalability, and gross req/sec offers no insight into whether or not that's true.
I can't see how vertical scalability even matters for Rails. You can just start more instances. It's not like two active web requests need to interact with each other.
You don't even have to write performant code. You can just start as many instances of the rails app as you want. The bottleneck is usually the database.
Partly, you need mobile now, so any Rails stuff is likely to be back-end and hidden. Plus big investors like to go for the exciting stuff.
But if you're looking at companies that aren't household names (taking smaller amounts of investment), there are lots out there.
Syft (recruitment) were founded in 2016 and have revenues of over $100m per year - although that's partly due to acquisition by a larger competitor, so when I just looked, separating their valuation from the group wasn't immediately obvious.
I've freelanced and contracted across a few niche industries (construction, print, airport signage management!) where I was building something against competitor software that I discovered was at least partially built with Rails. Those big players in each niche would have revenue in the tens of millions and from what I could see, very small technical teams.
But anyone outside those industries would never have heard of these companies.
Not exactly a fair question, because Shopify is much older and is valued at 70B. For a company to have done it in half the time would have been impressive regardless of tech, whereas on average it takes 7 years to become a unicorn.
I do know that Aircall is relatively young, on a good trajectory and runs Rails.
> Since Ruby 3.3.0-preview2 YJIT generates more code than Ruby 3.2.2 YJIT, this can result in YJIT having a higher memory overhead. We put a lot of effort into making metadata more space-efficient, but it still uses more memory than Ruby 3.2.2 YJIT.
I'm hoping/assuming the increased memory usage is trivial compared to the cpu-efficiency gains, but it would be nice to see some memory-overhead numbers as part of this analysis.
This is a particularly valid concern given ruby+rails seems quite memory inefficient to begin with. I've sometimes had smallish apps on 500mb heroku dynos crashing due to memory slowly climbing and eventually slowing things down as the dyno uses swap, and eventually 500mb of swap. IME ruby+rails doesn't seem to free up memory after it uses it, and that causes problems as the hours go by until the pod/dyno crashes or is restarted.
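For anyone who wants rough numbers from their own app rather than waiting for a write-up, a hedged sketch: `RubyVM::YJIT.enabled?` and `RubyVM::YJIT.runtime_stats` exist in recent CRuby releases, but the stat key names differ between versions, so this filters by suffix instead of assuming specific keys. The helper name `yjit_memory_stats` is made up.

```ruby
# Returns size-related YJIT counters, or an empty hash when YJIT is
# unavailable or not enabled. Key names vary between Ruby versions, so
# this filters by suffix rather than naming specific keys.
def yjit_memory_stats
  return {} unless defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?

  stats = RubyVM::YJIT.runtime_stats || {}
  stats.select do |key, value|
    value.is_a?(Integer) && key.to_s.end_with?("_size", "_bytes")
  end
end

yjit_memory_stats.each { |key, bytes| puts "#{key}: #{bytes}" }
```

Run it with and without `--yjit` (and compare process RSS alongside it) to see what the JIT itself is costing.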
Ruby processes don't return the memory to the system, they reuse memory already allocated. This is for efficiency - allocating and freeing system memory isn't free. Even if it did, your peak memory usage would be the same. It doesn't allocate memory it doesn't need.
If your memory usage doesn't plateau you have a memory leak which would be caused by a bug in your code or a dependency.
But 500 to 1gb of memory required for a production rails app isn't unusual. Heroku knows this, which explains their bonkers pricing for 2gb of memory. They know where to stick the knife.
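One crude way to tell "plateau" from "leak" is to sample the process RSS periodically and check whether growth has levelled off. A minimal sketch; the `plateaued?` helper, the window size, the tolerance and the sample data are all invented for illustration:

```ruby
# Rough heuristic: compare the average of the last few samples with the
# average of the few before them. Window and tolerance are arbitrary.
def plateaued?(rss_samples_mb, window: 3, tolerance: 0.05)
  return false if rss_samples_mb.size < window * 2

  recent   = rss_samples_mb.last(window).sum / window.to_f
  previous = rss_samples_mb[-(window * 2), window].sum / window.to_f
  (recent - previous) / previous <= tolerance
end

steady  = [210, 340, 420, 455, 460, 462, 461, 463, 462]
growing = [210, 260, 310, 360, 410, 460, 510, 560, 610]

puts plateaued?(steady)   # growth has levelled off
puts plateaued?(growing)  # still climbing: worth investigating
```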
> Ruby processes don't return the memory to the system
That is not correct. Ruby does unmap pages when it has too many free pages, and it obviously calls `free` on memory it allocated once it no longer uses it.
What sometimes happens, though, is that because of fragmentation you have many free slots but no wholly free pages. That is one of the reasons why GC compaction was implemented, but it's not enabled by default.
But in most cases I've seen, the memory bloat of Ruby applications was caused by glibc malloc, and the solution was either to set MALLOC_ARENA_MAX or to switch to jemalloc.
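The "free slots but no free pages" situation can be eyeballed from `GC.stat`. A small sketch, assuming CRuby; the `heap_summary` helper is made up, and `GC.compact` is guarded because it isn't available on every platform. MALLOC_ARENA_MAX is an environment variable glibc reads at process start, so it has to be set in the environment rather than from Ruby code.

```ruby
# Free slots scattered across pages that are still in use cannot be
# handed back to the OS; only wholly empty pages can.
def heap_summary
  stat = GC.stat
  {
    allocated_pages: stat[:heap_allocated_pages],
    live_slots:      stat[:heap_live_slots],
    free_slots:      stat[:heap_free_slots]
  }
end

before = heap_summary
# Compaction moves live objects together so more pages end up empty.
GC.compact if GC.respond_to?(:compact)
after = heap_summary

puts "before: #{before}"
puts "after:  #{after}"
```

A large `free_slots` count alongside a page count that never drops is the fragmentation pattern described above.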
I'm correct in practice. There are scenarios where Ruby might free memory, but Ruby is mostly used for Rails, and you won't ever see that under a standard Rails workload. It will plateau and stay there until a restart. When people see this they think it's a "bug" or a "leak" but it isn't.
On the last fairly large rails app I tried to use jemalloc on there was no change in memory usage. I believe that advice is a bit outdated. Also note using jemalloc doesn't cause memory to be freed to the system. It reduces fragmentation, at the cost of cpu cycles. There's no free lunch.
Yes, because extra empty pages are released at the end of major GC, which is occasional, and most web applications will cyclically use enough memory that they will stabilize / plateau at some point.
> I believe that advice is a bit outdated.
It absolutely isn't, your anecdote doesn't mean much compared to the countless reports you can find out there.
> Also note using jemalloc doesn't cause memory to be freed to the system.
That kind of thinking is a bit flawed unfortunately. You might hit your peak for 20 minutes a day but you’ve provisioned your system for that temporary worst case for the entire day and other services are paying that penalty. If it’s the only thing you’re running, maybe. But in practice there are other things you want to run on the machine to improve utilization rate (since services generally don't all hit their peak simultaneously).
That’s why good modern allocators like mimalloc and tcmalloc return memory when they notice it’s going unused, so that other services running on the machine can access resources. And this is in c++ land where things are even more perf sensitive.
Theoretically virtual memory and swap solve this problem really well. The OS is free to write the unused pages to disc to let other programs use the real memory.
Swap is horribly expensive and most hyperscalers run their servers without swap and set per-process memory limits, automatically killing workloads that go above their threshold.
What if the other thing you're trying to run runs at the same time that your rails app is using peak memory? You have no choice but to have enough memory for peak load.
But if you really do need to cheap out you can generally configure your app server to kill idle worker processes, or bounce them on a schedule to return memory to the system, and hope.
So that’s generally not very likely. You’re going to have some time of day effects that are shared but true “peak” tends to be service dependent rather than something all your services experience simultaneously from what I’ve seen (YMMV).
Killing “idle” processes is also extremely expensive because you have to restart the process, reload all state, and doing graceful handoff is tricky.
It’s good to have graceful handoff for zero downtime upgrades, but I still say having your allocator return RAM is the cheapest and easiest option and something good modern allocators do for you automatically.
> If your memory usage doesn't plateau you have a memory leak which would be caused by a bug in your code or a dependency.
Extremely bold claim for a framework the size of ruby on rails. I would trot out my own evidence but the receipts are lost with time.
Also—why isn't the allocation behavior tweakable at runtime? Seems pretty trivial with no downsides. It's not difficult to think of a scenario where a non-monotonically-increasing-heap-size is desirable.
Many types of memory leaks are simply because you're holding on to data you don't need to hold onto anymore. Languages cannot prevent this, at least not that I've seen.
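A toy illustration of that kind of leak: nothing is "lost" in the C sense, the program just keeps references it no longer needs. Both classes are invented for the example.

```ruby
# Leaky: a process-wide hash that only ever grows.
class LeakyCache
  STORE = {}

  def self.fetch(key)
    STORE[key] ||= yield
  end
end

# Bounded: evicts the oldest entry once a size limit is reached.
class BoundedCache
  def initialize(max_size)
    @max_size = max_size
    @store = {}
  end

  def fetch(key)
    return @store[key] if @store.key?(key)

    @store.shift if @store.size >= @max_size  # drop the oldest insertion
    @store[key] = yield
  end

  def size
    @store.size
  end
end

bounded = BoundedCache.new(100)
1_000.times do |i|
  LeakyCache.fetch(i) { "payload #{i}" }
  bounded.fetch(i) { "payload #{i}" }
end

puts LeakyCache::STORE.size
puts bounded.size
```

The garbage collector can't help with the first one: every entry is still reachable, so from the language's point of view it is all live data.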
Was it difficult to switch? What were the downsides / tradeoffs? (I read about jemalloc recently but don't know enough about it to confidently pursue it, but may try it on a small app if it's straight forward).
OOC, why isn't this part a ruby default? Isn't it always better to be more memory efficient. (I'm trying to understand what the trade offs are, if any)
Have you compared it against newer allocators like mimalloc or the rewritten tcmalloc (not the one in gperftools)? Jemalloc is a bit long in the tooth now.
Am I really going to have to get out the premature optimization quote?
Most businesses fail. Those that don't fail, usually don't have interesting scaling issues. (You can go a really long way on a boring monolith stack.)
So in most cases, whatever gets things out into the world and able to see if the business can be validated makes sense, and then you optimize later. A nonscalable stack that you can iterate on 50% faster is more likely to produce a viable company than a more scalable stack that's slower to work with.
If you're a hired employee, it's easy to forget that the place you're working for is already a big exception just by virtue of having grown large enough to hire you.
This hints at a false dichotomy. One that especially Ruby and Rails keep afloat.
Productivity and scalability (in the performance sense) aren't opposites.
Take Bash. It performs badly and is a guarantee for terrible productivity in a large category of software. But it's perfect for a niche. Take Java. It performs better than many, and allows for good productivity (if you avoid the enterprise architectures, but that goes for any language).
Or take Rust. Productivity much higher than most C/C++ and in my case higher than with Ruby/Rails, and also much more performant.
It's a false dichotomy in theory. It's mostly not in practice. And that was far truer in 2006 when Shopify got started. Then there really weren't any modern web frameworks in performant languages.
Primarily it's not the language that makes people more or less productive, though it does have some influence. It's mostly the frameworks in those languages. And traditionally the most modern / full-featured web frameworks haven't been in systems languages. The major counterexample at the moment (while still obviously not a systems language) is that modern JS VMs are actually really fast, so while I don't love JS, it does hit that sweet spot at the moment of performance and mature frameworks.
Also, I've never worked in Rust, but am mostly a systems programmer, and while I understand that Rust is supposed to be easier than C or C++, I'm skeptical that it's as easy to work with as higher level languages, or that you could throw most web developers into Rust without some serious additional learning.
That's another problem I have in this narrative. Productivity isn't measured by throwing an inexperienced developer at something and then looking how fast they get stuff done. That's learnability.
I'm an experienced Rails developer (some 15 years in) and my productivity has plateaued for years now. I've been doing Java and Rust work for years too now. Web and application dev. It took years, but my productivity in both Java and Rust, on anything that lives longer than 6 months, has vastly surpassed that of my Rails.
Productivity of a senior, or experienced dev, of a (large) team, of a team with high turnover, of a project over decades, all that is productivity too. And in all those categories, Rails isn't great.
We're talking past each other because we're arguing different things. If I understand you, you're saying that you can avoid technical debt by using tools that are intrinsically more performant, and that skilled developers are more productive with more advanced tooling.
That's all correct.
But the point I'm making is that if an MVP isn't accruing technical debt, it's over-engineered. Most of them will be thrown away, or rescoped, and so taking on technical debt is an advantageous strategy: you only have to pay the technical debt on the few survivors.
Shopify at its outset was a CRUD app (fun fact: it started as a snowboarding shop), and in 2006, Rails was a great choice for that.
Your notions are fine for an established company building a piece of infrastructure they're certain they'll need. But that's not what Shopify was, and it's not the spot most startups picking a framework are at.
Your thing about developer quality is kind of meh. Building the first versions of a shopping platform isn't rocket surgery. You don't need Anthony Bourdain to make a sandwich. Particularly if you're not sure anybody wants a sandwich.
Not in my case. Rust, for me, is much better for productivity than my other major languages Ruby and JavaScript.
The main reason is type enforcement, which is why, for me, TypeScript is much more productive than JavaScript. A large category of bugs simply won't exist (they are caught at compile time). With Ruby, I'd have to write hundreds of edge-case unit-tests just to cover stuff that, with Rust is enforced compile-time for me.
The other reason is runtime speed. A typical Ruby test-suite takes me minutes to run. A typical Rails test suite tens of minutes. A typical Rust test-suite takes < a minute to compile and seconds to run. I run my tests hundreds of times per day. With a typical Rails project, I'm waiting for tests upwards of an hour per day (yes, I know guard, fancy runners with pattern matching etc).
The last reason, for me, is editor/IDE integration: Rust (and TS) type system make discovery, autocomplete and even co-pilot so much more useful that my productivity tanks the moment I'm "forced" to use my IDE with only solargraph to help.
And debugging: sure! I've had reasonable success with gdb and ruby debuggers in the past. Rust's gdb isn't much better. But stepping through a stack in a rails project is a nightmare: the stack is often so ridiculously deep (but it does show how elegantly and neatly it's all composed!) that it's all noise and no signal. Leaving a binding.pry or even `throw "why don't we get here?!"` also works, but to call that "productive" debugging is a stretch, IMO.
I like strong typing as well, and worked with a strongly typed language for years before Ruby.
Then I did Ruby+Rails fulltime for 9 years. Just recently moved on.
> With Ruby, I'd have to write hundreds of edge-case unit-tests just to cover stuff that, with Rust is enforced compile-time for me.
Never a problem for me.
It was one of my major concerns about Ruby, prior to starting out. But like... it just wasn't a problem.
It turns out that we just don't pass the wrong kind of thing to the other thing very often, or at least I and my teams did not. It certainly helps if you follow some sane programming practices. Using well-named keyword arguments and identifiers, for example.
    # bad. wtf is input?
    def grant_admin_privileges(input)
    end

    # you would have to be a psychopath to pass this
    # anything but a User object
    def grant_admin_privileges(user:)
    end
Of course, this can be a major problem if you're dealing with unfamiliar or poorly written code. In which case, yeah, that sucks. I know that many will scoff at the old-timey practice of "use good names" in lieu of actual language-level typing enforcement, and that "just use a little discipline!" has long been the excuse of people defending bad languages and tools. But a little discipline in Ruby goes such a long way, moreso than in any language I have ever used.
> With Ruby, I'd have to write hundreds of edge-case unit-tests just to cover stuff that, with Rust is enforced compile-time for me.
Well, you do need test coverage with Ruby. But you do anyway in any language for "real" work, soooooo.
I strongly dispute that you need extra tests for "edge cases" because of dynamic typing. Something is deeply wrong if we are coding defensive methods that handle lots of different types of inputs and do lots of duck typing checks or etc. to defend themselves against type-related edge cases.
> (yes, I know guard, fancy runners with pattern matching etc).
Yeaaaaaah. Rails tests hit the database by default, which is good and bad, but it is inarguably slowwww. I don't find pure Ruby code to be slow to test.
> The last reason, for me, is editor/IDE integration
Yes. I still miss feeling like some kind of god-level being with C#, Visual Studio, and Resharper. I liked the Ruby REPL which offset that largely in terms of coding productivity but was certainly not a direct replacement.
> But stepping through a stack in a rails project is a nightmare
Yeah. I always wanted a version of the pry 'next' method that was basically like, "step to the next line of code but skip all framework and Ruby core code"
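Not the debugger feature being wished for, but a related trick for cutting the noise: filter a backtrace down to application frames. A minimal sketch; the `app_frames` helper, the path pattern and the sample frames are all made up, and treating anything under "/gems/" or "/lib/ruby/" as framework or core code is an assumption about a typical install layout. (Rails itself ships `ActiveSupport::BacktraceCleaner` for the same purpose.)

```ruby
# Keep only frames from application code. The pattern below is an
# assumption about where gems and the standard library usually live.
FRAMEWORK_PATH = %r{/gems/|/lib/ruby/}

def app_frames(backtrace)
  backtrace.reject { |frame| frame.match?(FRAMEWORK_PATH) }
end

# Invented frames, only to show the filtering.
sample = [
  "/app/models/user.rb:12:in `promote!'",
  "/usr/local/bundle/gems/activerecord-7.0.4/lib/active_record/base.rb:1:in `save'",
  "/usr/local/lib/ruby/3.2.0/forwardable.rb:240:in `each'",
  "/app/controllers/admin_controller.rb:8:in `create'"
]

puts app_frames(sample)
```

In a pry session the same helper can be pointed at `caller` to see only your own code's part of the stack.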
Rails proper, yes. Small Rails apps are generally drop-in compatible, but sizeable applications are likely to run into a few compatibility issues as most gems aren't tested against TruffleRuby.
> I wonder how much TruffleRuby would improve the performance and memory footprint.
Generally speaking, TruffleRuby is much faster at "peak" performance, but takes very long to get there, which makes it challenging to deploy.
It also uses way more memory, but it's partially offset by the fact that it doesn't have a GVL, so you get parallel execution with threads.
Ruby atm is working towards implementing true parallel execution with Ractors, for example, and now with YJIT, the performance might increase some more.
I'm probably misinterpreting the numbers, but it sounds like the 3.3 interpreter also got some significant performance improvements - if 3.3 YJIT got a 13% speedup compared to 3.2 YJIT and a 15% speedup compared to 3.3 interpreter, that sounds like the 3.2 YJIT has only slightly better performance than the 3.3 interpreter. Is that interpretation correct? If so, what were the improvements in the 3.3 interpreter, or was 3.2 YJIT just not much of a speedup?
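Spelling out the arithmetic behind that reading, taking both percentages at face value as speed ratios and assuming they were measured on the same workload (which may not be the case):

```ruby
# Speed ratios as read from the two quoted figures.
yjit33_vs_yjit32   = 1.13  # 3.3 YJIT is 13% faster than 3.2 YJIT
yjit33_vs_interp33 = 1.15  # 3.3 YJIT is 15% faster than the 3.3 interpreter

# Divide one by the other to compare 3.2 YJIT with the 3.3 interpreter.
yjit32_vs_interp33 = yjit33_vs_interp33 / yjit33_vs_yjit32

puts format("3.2 YJIT vs 3.3 interpreter: %.1f%% faster",
            (yjit32_vs_interp33 - 1) * 100)
```

Under those assumptions the gap between 3.2 YJIT and the 3.3 interpreter comes out at under 2%, which is why the numbers look odd if they really describe the same benchmark.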
For 3.2 there also was an improvement of the interpreter:
> We now speed up railsbench by about 38% over the interpreter, but this is on top of the Ruby 3.2 interpreter, which is already faster than the interpreter from Ruby 3.1. According to the numbers gathered by Takashi, the cumulative improvement makes YJIT 57% faster than the Ruby 3.1.3 interpreter.
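The interpreter's own gain can be backed out of those two figures, again reading them as simple speed ratios on the same benchmark (an assumption):

```ruby
# Speed ratios as read from the quoted post.
yjit_vs_interp32  = 1.38  # 3.2 YJIT over the 3.2 interpreter
yjit_vs_interp313 = 1.57  # 3.2 YJIT over the 3.1.3 interpreter

# The remainder is what the interpreter itself gained between releases.
interp32_vs_interp313 = yjit_vs_interp313 / yjit_vs_interp32

puts format("Implied 3.2 interpreter gain over 3.1.3: %.1f%%",
            (interp32_vs_interp313 - 1) * 100)
```

So roughly 14% of the cumulative improvement would be down to the interpreter, independent of YJIT.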
The 15% is for the total request time including waiting for blocked IO.
> All that work allowed us to speedup our storefront total web request time by 10% on average, which is including all the time the web server is blocked on IO, for example, waiting for data from the DB, which YJIT obviously can't make any faster.
Not to be pessimistic, but does this matter? Rails apps take 2-3x more resources to run than most other language stacks, including other dynamic languages, (including Perl!).
That’s not easy to answer, because it’s not quite an apples to apples comparison if you start factoring in libraries, frameworks and the specific workload.
My rules of thumb:
Python has similar performance characteristics as Ruby.
With Java/C#/Go you’d expect about an order of magnitude of improvement.
With naive Rust/C++ you would likely be at the same average speed as Java for web applications but with less memory usage. Well until you make an effort to produce faster code.