by throwaway_62022 789 days ago
>The Ruby implementation has a subtle mistake which causes significantly more work than it needs to.

To be fair, I do not think that is a "mistake" as such. I have written Ruby professionally for 6 years or so, have committed to several Ruby open source projects, and have never seen an innocuous `nil` sitting at the end of a loop to prevent an array allocation.

The argument would be fair, if it wasn't idiomatic Ruby.

More like - knowing the internals of a language will allow one to gain more performance out of it. That has been true for almost every programming language, but generally speaking the goal of a VM-based language is to not require that _specialized_ knowledge.


> if it wasn't idiomatic Ruby

It's idiomatic Ruby in a very particular case that likely was explicitly chosen to demonstrate such a dramatic effect.

You're _usually_ not returning implicit arrays from loops in production code. Parallel assignments, when they're used, are almost always in the first line of an initialize method, not the returned line of an enumerable block.

I don’t think there’s any language, interpreted or otherwise, with the goal that knowing its internals won’t help you gain more performance. I mean, that would be nearly impossible.

Ruby code, perhaps more than most code, is written for readability and “beauty”. It’s a part of Ruby culture that I greatly appreciate. But if you care about performance, you will act differently, regardless of language. And the whole point of this code is to show that if you care about performance above all else, there’s plenty of room to maneuver in interpreted Ruby.

That's an interesting question: how does YJIT perform on the original code? Does it find the optimization on its own and deliver the same gain, so that you don't actually need to know about the optimization yourself?
On my machine, the YJIT version of the original code is only ~30% faster than the non-YJIT version

    ~/scripts > ruby fib.rb
    2.3346780000720173
    ~/scripts > ruby --yjit fib.rb
    1.5913339999970049
So looks like YJIT doesn't "know" about this optimization
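For anyone who wants to repeat this, here is a rough sketch of what such a fib.rb could look like. It is not the script used above: `ROUNDS` and `N` are made-up workload sizes, and `yjit_enabled?` is a hypothetical helper that just confirms `--yjit` actually took effect before you compare timings.

```ruby
require "benchmark"

# Original shape of the loop: the parallel assignment is the block's last expression.
def fib(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# True only when this interpreter ships YJIT and it is currently enabled.
def yjit_enabled?
  !!(defined?(RubyVM::YJIT) && RubyVM::YJIT.enabled?)
end

# Assumed workload sizes, chosen only so the script finishes quickly.
ROUNDS = 20_000
N = 80

elapsed = Benchmark.realtime { ROUNDS.times { fib(N) } }

puts "YJIT enabled: #{yjit_enabled?}"
puts elapsed
```

Run it once as `ruby fib.rb` and once as `ruby --yjit fib.rb`, and compare the second line of output.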
Ah thanks so much for trying it out. Interesting that it couldn't figure out that code path
I think it's just a matter of time. YJIT is still fairly young and doesn't do extensive inlining at the moment. If it did inline the block it could see the array is unused and avoid the allocation.

Running the original fib benchmark (i.e., without the author's technique to eliminate the array allocation) on an M1 Pro, I see:

  CRuby 3.3.1:
  2.058589000022039

  CRuby 3.3.1 w/ YJIT:
  1.4314430000958964

  TruffleRuby 24.0.1 (Native):
  0.20155820800573565

  TruffleRuby 24.0.1 (JVM):
  0.1336908749944996
I took the best time out of three for each implementation, but there wasn't that much variance overall. Standard caveats about benchmarking on an actively used laptop apply.

Running the new prime_counter benchmark that the crystalruby author mentions in another thread¹, I see:

  Crystal 1.12.1 (LLVM 18.1.4) w/ crystalruby 0.2.0 in CRuby 3.3.1:
  0.34096299996599555

  CRuby 3.3.1:
  2.9615250000497326

  CRuby 3.3.1 w/ YJIT:
  1.640430000028573

  TruffleRuby 24.0.1 (Native):
  0.2504862080095336

  TruffleRuby 24.0.1 (JVM):
  0.25282600001082756
YJIT and TruffleRuby make different trade-offs, so I'm not trying to say the latter is necessarily better. But I think the TruffleRuby numbers show what is possible in terms of Ruby optimization. Unfortunately, there's currently an incompatibility between TruffleRuby and one of the crystalruby gem's dependencies², so I had to extract the Ruby benchmark out to a separate file.

¹ -- https://news.ycombinator.com/item?id=40153218

² -- The method_source gem used by crystalruby catches exceptions and matches against the message³ for some conditional handling. TruffleRuby 24.0 now uses Prism as its parser, and Prism has an exception message with slightly different wording from CRuby's. Consequently, method_source's handling doesn't work with Prism. It's hard to say where the compatibility issue lies, since exception messages aren't stable APIs. We'll get it sorted out.

³ -- https://github.com/banister/method_source/blob/06f21c66380c6...

> TruffleRuby 24.0.1 (JVM):
> 0.1336908749944996

Those are impressive numbers for running the unoptimized code. I might give TruffleRuby a shot!

>I don’t think there’s any language, interpreted or otherwise, with the goal that knowing its internals won’t help you gain more performance.

It is, indeed, a fundamental goal of ruby that there are multiple ways to write the same thing, and that the programmer should not need to understand nuances of the compiler.

"I need to guess how the compiler works. If I'm right, and I'm smart enough, it's no problem. But if I'm not smart enough, and I'm really not, it causes confusion. The result will be unexpected for an ordinary person. This is an example of how orthogonality is bad." -matz 2003.

That's about semantics, not performance. I don't think Matz would say that a goal of Ruby is to prevent you from improving performance using whatever knowledge you do have about the compiler.

In this particular example, the fact that assignments return the value of the right-hand side is well-known and used frequently in Ruby code. The fact that arrays have to be allocated is obvious. The fact that allocations have a runtime cost is obvious. The only thing that isn't obvious is that the array allocation for a parallel assignment whose value isn't used is optimized away. If you know that, you'll think of appending the nil to activate that optimization. Characterizing the lack of that step as a "mistake" only makes sense if the goal for your code is to maximize performance -- which in this case, most unusually for Ruby, it was.
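To make that concrete, here is a minimal sketch of the two loop shapes. The names `fib_plain`, `fib_with_nil` and `allocations` are made up for illustration, and `GC.stat(:total_allocated_objects)` is a CRuby counter, so the exact numbers will depend on your Ruby version and implementation.

```ruby
# Original shape: the parallel assignment is the block's last expression,
# so its value (an array built from the right-hand side) is the block's result.
def fib_plain(n)
  a, b = 0, 1
  n.times { a, b = b, a + b }
  a
end

# Same loop with a trailing nil: the assignment's value is no longer used.
def fib_with_nil(n)
  a, b = 0, 1
  n.times do
    a, b = b, a + b
    nil
  end
  a
end

# Hypothetical helper: objects allocated while the block runs (CRuby counter).
def allocations
  before = GC.stat(:total_allocated_objects)
  yield
  GC.stat(:total_allocated_objects) - before
end

# A parallel assignment evaluates to its right-hand side as an array.
x, y = 0, 1
step = -> { x, y = y, x + y }
step_value = step.call

# Warm up both methods so one-time setup is not counted in the measurement.
fib_plain(50)
fib_with_nil(50)

plain_allocs = allocations { fib_plain(50) }
nil_allocs = allocations { fib_with_nil(50) }

puts "block value: #{step_value.inspect}"
puts "allocations without nil: #{plain_allocs}"
puts "allocations with nil:    #{nil_allocs}"
```

Both methods return the same Fibonacci number; only the per-iteration allocation behaviour is expected to differ.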

+1. I love golang, because for the most part, there is only 1 way to do something. With ruby, there are a billion ways to do the same thing, with some being slower than others.
I've just started learning Go as a very long time Rubyist. I really enjoy both languages for very different reasons. In Ruby, I can write code that really makes me happy to read. Enumerable is just wonderful. You can go a long way in Ruby without writing a single if statement. It's great. If I'm working on a solo project, it's the language I'd choose every time. But working with inexperienced people, or people who "know" Ruby but never adopted "the Ruby way", is a nightmare. Ruby code, written poorly, can be extremely brutal to follow. When the great deal of freedom Ruby offers isn't handled responsibly, a hot mess can ensue.

Go is the opposite. It's great, as you say, because it's dirt simple. It's a brutalist get-the-job-done kind of language, and I think if I were to start a company working with other engineers, I'd absolutely choose Go for that reason. It's easy to read. It's easy to reason about. And there's very little implicitness in it.

    with some being slower than others.
Do we all agree that in practice, these sorts of micro-optimizations almost never matter?

It's certainly easy to think of situations where they do matter, but unless your project is FaaS (Fibonacci As A Service) probably not.

I've seen more than a few cases where people used the wrong data structure, like an array (with O(N) lookups) instead of a hash.

All the inefficiencies add up, at scale. A 3% inefficiency means you're spending ~3% more on compute. CI takes longer. Dev velocity decreases.
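A minimal sketch of the kind of thing I mean, with made-up data (`ids`, `wanted` and `id_index` are hypothetical names): `Array#include?` scans the array on every lookup, while `Hash#key?` is a hash lookup.

```ruby
require "benchmark"

# Hypothetical data: 50,000 integer IDs and 1,000 lookups, some present, some not.
ids = (1..50_000).to_a
wanted = Array.new(1_000) { |i| i * 97 }

# Build the hash index once; only the keys matter, the values are placeholders.
id_index = ids.to_h { |id| [id, true] }

array_hits = nil
hash_hits = nil

# Linear scan of the array for every lookup.
array_time = Benchmark.realtime do
  array_hits = wanted.count { |id| ids.include?(id) }
end

# Hash lookup for every lookup.
hash_time = Benchmark.realtime do
  hash_hits = wanted.count { |id| id_index.key?(id) }
end

puts "array: #{array_hits} hits in #{array_time}s"
puts "hash:  #{hash_hits} hits in #{hash_time}s"
```

Both versions find the same IDs; the timings will depend on your machine and on how large the collection is.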

> +1. I love golang, because for the most part, there is only 1 way to do something.

Are you aware that you're referencing the Python mantra with that? Feel free to Google it, it's from 2004.

There should be one-- and preferably only one --obvious way to do it.

    the Python mantra with that?
Offtopic but, I switched from Ruby to Python for a new job about six months ago.

While I love Ruby, I was looking forward to some of that "one way to do things" simplicity I'd been promised by Python.

Boy... were my hopes crushed. There are a lot of possible ways to do any given thing, from iteration to package/import structures, etc.

For the most part it seems like the proliferation of options in Python is pretty sane; I can generally see why each choice was made to address existing pain points. So kudos to Python for that. But man, did they leave "one way to do it" behind a loooong time ago.

That they did, indeed.

And they honestly never really adhered to it either. It was just a knee-jerk reaction to a (I believe Haskell) presentation that said something like "we got n ways to do x", where n was in the double digits and x was something extremely uninteresting like looping over an array.

Can't really recall the details though; it was old knowledge by the time I heard about it around 2010, and a quick Google didn't help me dig it up.

How are there companies with Ruby source code making enough money to hire full time Ruby devs?
Ask GitHub, Shopify and Stripe.
Presumably by streamlining development so they can quickly deliver functionality to paying customers.