by jashmatthews 2945 days ago
Sinatra + Sequel is already very competitive in web performance with Go + Gin[1]. It's the Rails convenience stuff which slows things down massively. MJIT could probably bring Rails in line with Sinatra though.

Between Ruby 1.8 and 2.5, performance has improved around 13x in tight loops[2]. The Rails performance issue has been massively overblown since 1.9 was released.

Ruby 1.8 was a tree-walking interpreter, so the move to a bytecode VM in 1.9 was a huge leap in performance. Twitter bailed to the JVM before moving to 1.9. A lot of those 10-100x performance differences to the JVM are gone thanks to the bytecode VM and generational GC.

Bytecode VMs all have the same fundamental problem of instruction dispatch overhead: they're basically executing different C functions depending on input.

Doing _anything_ to reduce this improves performance dramatically, even just spitting out the instruction source code into a giant C function, compiling it, and calling that in place of the original method. Another 10x improvement on tight loops should not be a problem.
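To make the "paste the instruction bodies into one big function" idea concrete, here is a minimal sketch in plain Ruby. The instruction set, `PROGRAM`, `interpret` and `compile` are all invented for illustration; this is not YARV bytecode, and MJIT generates C rather than Ruby, so treat it as the shape of the technique only.

```ruby
# Toy stack-machine program computing x * x + 3. The instruction set is
# invented for this sketch; it is not CRuby's real (YARV) bytecode.
PROGRAM = [
  [:load_arg],  # push the argument x
  [:load_arg],  # push x again
  [:mul],       # pop two values, push their product
  [:push, 3],   # push the constant 3
  [:add]        # pop two values, push their sum
].freeze

# Interpreter: every instruction pays for a dispatch (the case/when).
def interpret(program, arg)
  stack = []
  program.each do |op, operand|
    case op
    when :load_arg then stack.push(arg)
    when :push     then stack.push(operand)
    when :add      then b = stack.pop; a = stack.pop; stack.push(a + b)
    when :mul      then b = stack.pop; a = stack.pop; stack.push(a * b)
    else raise ArgumentError, "unknown instruction #{op}"
    end
  end
  stack.pop
end

# "Compiler": paste the body of each instruction into one lambda, so the
# dispatch decision is made once at compile time instead of on every run.
def compile(program)
  body = program.map do |op, operand|
    case op
    when :load_arg then "stack.push(arg)"
    when :push     then "stack.push(#{Integer(operand)})"
    when :add      then "b = stack.pop; a = stack.pop; stack.push(a + b)"
    when :mul      then "b = stack.pop; a = stack.pop; stack.push(a * b)"
    else raise ArgumentError, "unknown instruction #{op}"
    end
  end
  eval("lambda { |arg| stack = []; #{body.join('; ')}; stack.pop }")
end

compiled = compile(PROGRAM)
puts interpret(PROGRAM, 5)  # prints 28
puts compiled.call(5)       # prints 28
```

The generated lambda still runs on the Ruby VM, so it won't show the real speedup; the point is only that the per-instruction `case` disappears from the hot path.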

[1] https://www.techempower.com/benchmarks/#section=data-r15&hw=...

[2] https://github.com/mame/optcarrot/blob/master/doc/benchmark....

11 comments

I would never have thought a Ruby stack would come anywhere close to Go performance. The path to optimization used to mean abstracting the really crazy parts into a Go microservice for things that just needed absurd responsiveness, but it's clear now that a slim Ruby stack could also be very effective, and without needing to learn a new language. Worthwhile to at least explore before going Go.

Nor did I know that Twitter jumped out of Rails before Ruby got performant. Which means the argument that Twitter outgrew Rails isn't so correct anymore.

Still, thanks for this insightful comment.

> Nor did I know that Twitter jumped out of Rails before Ruby got performant. Which means the argument that Twitter outgrew Rails isn't so correct anymore.

Twitter, even back in those days, would still have outgrown today's Rails. It is Ruby that has gotten a lot faster, not necessarily Rails.

From what I understand, the biggest issue was their product required fast fan out messaging. Tumblr, for example, is still huge but can get away with 1000 lines of PHP for their feed: https://news.ycombinator.com/item?id=17154403

> It's the Rails convenience stuff which slows things down massively.

Yeah, no kidding: https://samsaffron.com/archive/2018/06/01/an-analysis-of-mem...

Or specifically Active Record. Are we going to get Turbo Record soon? :)

ActiveSupport also monkey-patches to_json with a recursive Ruby function, completely nerfing JSON performance by 50x in a lot of cases. Unfortunately, it's what makes 'render json: @model' just... work.

Any time I want to generate JSON in Ruby, I want recursive tree traversal. It's just too painful to expect primitive types on everything. You'd have to pay that 50x doing it explicitly with a recursive function before you turn it into JSON anyway.

Ideally, you'd serialize directly from the database, bypassing the application entirely. Easily doable in ActiveRecord, but it's an explicit action, not the default. Not even sure if it's available in other databases besides PostgreSQL.

It's not required at all! Rails adds a completely avoidable massive overhead because of the way it overrides to_json. See: https://twitter.com/jashmatthews/status/967423661908070401

Have you ever used Ruby's JSON.dump? The second it runs into anything that it can't convert, you get "#<Object:0x00007fab05133078>" everywhere in your output.
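For anyone who hasn't hit this, a small sketch of both sides of the argument using only the stdlib json library (no ActiveSupport). `Point` and `to_primitive` are hypothetical names made up for the example, and falling back to instance variables is just one possible convention, not what Rails does.

```ruby
require "json"

# Hypothetical plain Ruby object with no to_json or to_h of its own.
class Point
  def initialize(x, y)
    @x = x
    @y = y
  end
end

# Without ActiveSupport, the json library falls back to Object#to_s for
# objects it does not know how to serialize.
naive = JSON.dump({ "points" => [Point.new(1, 2)] })

# Explicit recursive conversion to JSON-friendly primitives.
def to_primitive(value)
  case value
  when Hash  then value.each_with_object({}) { |(k, v), h| h[k.to_s] = to_primitive(v) }
  when Array then value.map { |v| to_primitive(v) }
  when String, Numeric, true, false, nil then value
  else
    # Assumption for this sketch: expose the object's instance variables.
    value.instance_variables.each_with_object({}) do |ivar, h|
      h[ivar.to_s.sub(/\A@/, "")] = to_primitive(value.instance_variable_get(ivar))
    end
  end
end

explicit = JSON.dump(to_primitive({ "points" => [Point.new(1, 2)] }))

puts naive     # the point shows up as a "#<Point:0x...>" string
puts explicit  # prints {"points":[{"x":1,"y":2}]}
```

The recursive walk is the cost being argued about: either the framework does it for you on every to_json call, or you do it yourself only where you need it.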

Thanks for the informative response (and sudhirj too)! Haven't looked at that benchmark for a while - very interesting. Sinatra is absolutely killing it. I knew it was faster than Rails but not that much faster.

With 2.6 and Sorbet [1] coming down the line, it's exciting to be a Rubyist again!

[1] https://sorbet.run/

>Doing _anything_ to reduce this improves performance dramatically

It does if you ignore the overhead of JIT compilation itself. However, my understanding is that writing a JIT implementation that performs better than a good interpreter is surprisingly difficult. You have to have a lot of complicated logic for tracking hotspots and using JIT judiciously in short-running scripts.
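The simplest form of hotspot tracking is a call counter with a threshold: stay on the slow path until a method has proven itself hot, then pay the compile cost once. A rough sketch, where `HotCounter` and its parameters are hypothetical and the "compile" step is a stand-in lambda rather than real code generation:

```ruby
# Runs the slow path until it has been called `threshold` times, then pays
# a one-time "compile" cost and uses the fast path for later calls.
class HotCounter
  def initialize(threshold:, slow:, compile:)
    @threshold = threshold
    @slow = slow        # callable used while the code is still "cold"
    @compile = compile  # callable that returns the optimized callable
    @calls = 0
    @fast = nil
  end

  def compiled?
    !@fast.nil?
  end

  def call(*args)
    return @fast.call(*args) if @fast

    @calls += 1
    # The call that crosses the threshold triggers compilation but is
    # itself still served by the slow path.
    @fast = @compile.call if @calls >= @threshold
    @slow.call(*args)
  end
end

square = HotCounter.new(
  threshold: 3,
  slow: ->(x) { x * x },
  compile: -> { ->(x) { x * x } } # stand-in for real code generation
)

results = 5.times.map { |i| square.call(i) }
puts results.inspect   # prints [0, 1, 4, 9, 16]
puts square.compiled?  # prints true
```

A short-running script that never reaches the threshold never pays for compilation, which is the "judicious" part. Real JITs layer a lot more on top (deoptimization, compile queues, and so on).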

Hmmm, not really. I think that's somewhat true for JS and webpages, but they're not the same as server-side apps.

It's the instruction dispatch overhead that's the real unavoidable problem. LuaJIT's interpreter, for example, uses a bunch of tricks to minimize it in the bytecode VM, and it's significantly faster than the standard Lua VM but still far, far slower than basic JIT compilation.

Right, but historically there are lots of instances of projects that abandoned JITs because they didn't get a performance improvement. JIT compilation reduces instruction dispatch overhead, but it also, unless accompanied by sophisticated profiling techniques, adds the overhead of JIT compilation time, which can easily swamp the improvements.

LuaJIT is one of the most sophisticated dynamic language JITs out there, so it's hardly evidence that a simple implementation of a JIT will perform better than a good bytecode interpreter.

The problem is less acute for server side apps because the programs run for a long time, so that the initial compilation overhead is insignificant. However, there's a reason that you need a JIT to make Ruby fast rather than an ahead of time compiler. Ruby has so few compile-time guarantees that you need to do a lot of dynamic specialization to get really significant performance improvements. So compilation might still be triggered even after a script has been running for a long time.

I'd add that PyPy, which is also very sophisticated, is often not much faster than CPython, and in fact is slower for some types of code. Writing good JIT-based implementations for dynamic languages is really a tough problem. See e.g. the following post for some explanation of why:

http://faster-cpython.readthedocs.io/notes_2017.html

> Right, but historically there are lots of instances of projects that abandoned JITs because they didn't get a performance improvement. JIT compilation reduces instruction dispatch overhead, but it also, unless accompanied by sophisticated profiling techniques, adds the overhead of JIT compilation time, which can easily swamp the improvements.

Yes.

> LuaJIT is one of the most sophisticated dynamic language JITs out there, so it's hardly evidence that a simple implementation of a JIT will perform better than a good bytecode interpreter.

I meant that even a basic JIT can offer the same speedup as LuaJIT's interpreter, and a lot more work went into the latter.

> The problem is less acute for server side apps because the programs run for a long time, so that the initial compilation overhead is insignificant. However, there's a reason that you need a JIT to make Ruby fast rather than an ahead of time compiler. Ruby has so few compile-time guarantees that you need to do a lot of dynamic specialization to get really significant performance improvements. So compilation might still be triggered even after a script has been running for a long time.

The initial results of MJIT for simply removing the instruction dispatch overhead and doing some basic optimizations are a 30-230% performance increase on a small but real-world benchmark. No type specialization or speculative optimization required.

> I'd add that PyPy, which is also very sophisticated, is often not much faster than CPython, and in fact is slower for some types of code. Writing good JIT-based implementations for dynamic languages is really a tough problem. See e.g. the following post for some explanation of why:

Most of the discussion about PyPy is completely irrelevant for the discussion about MJIT. PyPy isn't a method JIT. PyPy traces the interpreter itself and tries to produce a specialized interpreter. It works even worse at optimizing Ruby code via Topaz.

> It works even worse at optimizing Ruby code via Topaz.

Topaz was easily the fastest Ruby JIT before TruffleRuby, beating the JRuby and Rubinius JITs. It was very impressive.

True! I just checked again and Topaz is indeed almost twice as fast as CRuby on optcarrot. I think I got it mixed up with the non-JIT Rubinius numbers.

It's a shame Topaz was never really "finished".

> The initial results of MJIT for simply removing the instruction dispatch overhead and doing some basic optimizations are a 30-230% performance increase on a small but real-world benchmark. No type specialization or speculative optimization required.

So, this amounts to a small improvement for some types of code. Indeed, it is "easy" to get that by "just" using some basic JIT techniques. The trick is to get consistently better performance across the board. Relevant tweet at https://medium.com/@k0kubun/the-method-jit-compiler-for-ruby...:

>I've just committed the initial JIT compiler for Ruby. It's not still so fast yet (especially it's performing badly with Rails for now), but we have much time to improve it until Ruby 2.6 (or 3.0) release.

> The trick is to get consistently better performance across the board.

This will come with the rest of the optimizations Takashi has planned for Ruby 2.6. Ruby-Ruby method inlining, which is almost finished, is a huge one for improving Rails performance. IMHO there's no real point talking about Rails until it's working in some form.

> >I've just committed the initial JIT compiler for Ruby. It's not still so fast yet (especially it's performing badly with Rails for now), but we have much time to improve it until Ruby 2.6 (or 3.0) release.

It turned out this wasn't even testing MJIT with Rails; see https://twitter.com/samsaffron/status/963219086833434624

Yes, I agree with this. We use Sinatra + Sequel, but run our code using JRuby, mostly because we're sharing some Scala libs with other teams. In any case, performance has not been a problem for us (~20k req/min with one node). I'm really looking forward to Ruby 3x3 :)

JRuby is really hitting its stride now, becoming ~3x faster than CRuby in my testing. The JVM changes to support more dynamic languages have improved performance so much.

Charles Nutter's early tests using JRuby on the GraalVM sound like there's another big step in performance coming without a huge amount of work.

Ruby's GIL will get in your way when working with real-world apps. Also, in this recent run Gin doubles Sinatra's performance:

https://www.techempower.com/benchmarks/#section=test&runid=a...

Interesting! Thanks.

CRuby's GIL doesn't really matter for serving web requests since it's run with one process per core, like Node.js. It's less memory efficient but doesn't really affect throughput much. Also, JRuby has no GIL.
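A bare-bones sketch of the process-per-core model, for anyone who hasn't looked inside a preforking server. `handle` and the job list are hypothetical, and results come back over pipes purely to keep the example self-contained; a real preforking server would have its workers accept connections from a shared listening socket instead.

```ruby
require "etc"

# Hypothetical CPU-bound request handler.
def handle(n)
  (1..n).reduce(:+)
end

worker_count = [Etc.nprocessors, 4].min
jobs = [10, 100, 1_000, 10_000]

# Fork one worker per core (capped at 4 here). Each worker is a separate
# process with its own interpreter and its own GIL.
readers = worker_count.times.map do |index|
  reader, writer = IO.pipe
  fork do
    reader.close
    # Each worker takes every worker_count-th job.
    mine = jobs.each_slice(worker_count).map { |slice| slice[index] }.compact
    writer.write(Marshal.dump(mine.map { |n| [n, handle(n)] }))
    writer.close
  end
  writer.close
  reader
end

results = readers.flat_map { |r| Marshal.load(r.read) }.to_h
Process.waitall

puts results[100]  # prints 5050
```

The memory cost mentioned above is visible here too: every worker carries its own copy of the loaded application, minus whatever copy-on-write manages to share.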

I read from benchmarks that Iris was the fastest web framework ever. How does Gin compare to it?

Another big win is the bootsnap gem, which is a cache of previous VM runs that loads faster than parsing all invariant pieces of code again.

https://github.com/Shopify/bootsnap
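The underlying idea can be tried with CRuby's own RubyVM::InstructionSequence API: compile once, serialize the bytecode, and load it later without parsing. This is only a sketch of the caching concept with a made-up snippet of source, kept in memory rather than on disk; it doesn't reproduce what bootsnap actually does, and it's CRuby-specific.

```ruby
# Source we pretend lives in a file that rarely changes.
source = "->(limit) { (1..limit).sum }"

# First run: parse and compile the source, then serialize the bytecode.
iseq = RubyVM::InstructionSequence.compile(source)
binary = iseq.to_binary # a String that could be written to a cache file

# Later runs: skip parsing and compiling, load the bytecode directly.
sum_to = RubyVM::InstructionSequence.load_from_binary(binary).eval

puts sum_to.call(10) # prints 55
```

The binary format is tied to the Ruby version that produced it, so a real cache has to be invalidated on upgrades and whenever the source file changes.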

Is Iris fully featured? It's better to compare two frameworks that have real production use in more complex apps to get an idea. Both Gin and Sinatra fit this description.

I haven't had a chance to use Bootsnap yet but it sounds really promising.

Ruby has made huge strides to be sure. However this is a bit hyperbolic IMHO.

Golang plus Gin, sure. However, there are other Go frameworks on the charts that blast the Ruby competition out of the water. Ruby isn't really on the podium at all, with C, C++, Rust, Golang, C#, and Java about an order of magnitude out in the lead on fortunes.

Martini isn't much of a framework itself either, so let's forget the full-featured nonsense. Almost none of the ecosystem is in play with these benchmarks. You could build a system up around fasthttp just as well as net/http, and ASP.NET certainly can't be accused of being a for-purpose contender.

The most impressive thing IMHO is how well Ruby is doing on maximum latency. I can't quite reconcile that, considering fasthttp is pretty much zero-allocation and Go's stop-the-world pause is in the microseconds. Pretty impressive.

> The most impressive thing IMHO is how well Ruby is doing on maximum latency. I can't quite reconcile that, considering fasthttp is pretty much zero-allocation and Go's stop-the-world pause is in the microseconds. Pretty impressive.

Fast GC is critical to Ruby performance, so a ton of work went into it. Ruby 2.2+ has a very short STW phase thanks to generational GC + incremental marking.
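You can watch the generational collector at work with GC.stat on CRuby 2.2 or later. A quick sketch; the allocation count is arbitrary and the numbers will differ between machines and Ruby versions, so no expected output is shown:

```ruby
before = GC.stat

# Allocate a lot of short-lived objects, the typical per-request pattern.
200_000.times { Object.new }

after = GC.stat

minor = after[:minor_gc_count] - before[:minor_gc_count]
major = after[:major_gc_count] - before[:major_gc_count]
allocated = after[:total_allocated_objects] - before[:total_allocated_objects]

puts "allocated: #{allocated}, minor GCs: #{minor}, major GCs: #{major}"
```

Short-lived garbage like this is what minor (young-generation) collections are meant to clean up cheaply, without a full major GC each time.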

What I remember reading from the Graal folks a while back was that Rails performance issues revolve around the amount of object creation and destruction.

Thanks for this detail! Was just starting to look at Go... Had no idea Sinatra was that fast.