by theseoafs 4671 days ago
Well, the compile-to-C thing is one thing. It's possible to compile any language down to C. You could compile Brainfuck or Python down to C if you wanted. The question is how complicated the resultant code will end up being compared to pure C. I would only call Ruby a "compile-to-C" language if it could generate code that is at least somewhat comparable to sane C code. Here's some C code that sums all the integers in an array:

    long sum(int *a, int len)
    {
        long ret = 0;
        for (int i = 0; i < len; i++)
            ret += a[i];
        return ret;
    }
The entire loop (the conditional test, the increment, and the `ret` update) could probably be implemented in fewer than 10 native instructions, depending on your machine. Faaaaast. If Ruby were a compile-to-C language, I would expect it to produce C code that looked somewhat like this. So let's look at the same snippet in Ruby:

    # sum the first len elements of a
    def sum(a, len)
      ret = i = 0
      while i < len
        ret += a[i]
        i += 1
      end
      ret
    end
(This is far from being idiomatic Ruby code, but this solution is the simplest and it also seems like it would be the easiest to directly translate to C.) Semantically, here's what that would translate to (in a C-like pseudocode):

    RubyObject *sum(RubyObject *a, RubyObject *len)
    {
        RubyObject *ret = newRubyInteger(0);
        RubyObject *i = newRubyInteger(0);
        /* the result of `<` is itself a Ruby object, so test its truthiness */
        while (isTruthy(call(getMethod(i, "<"), len))) {
            ret = call(getMethod(ret, "+"), call(getMethod(a, "[]"), i));
            i = call(getMethod(i, "+"), newRubyInteger(1));
        }
        return ret;
    }
Why is this so complicated? Because I've captured Ruby's dynamic typing and dynamic dispatch within the function itself. `ret` isn't a long, it's a Ruby variable that can hold any type of object, so we need to capture that in the source. Same with `i`, `a`, and `len`. When we say `a[i]`, we're not jumping to the `i`th element of the integer array `a`, which would be super fast. Instead, we have to dynamically dispatch the `[]` method, which will perform bounds checking and a bunch of type-checking. We also have to dynamically dispatch the `<` and `+` methods everywhere, which perform type-checking themselves. Obviously, this all takes much, much more than 10 native instructions.

You can't generally optimize out the method dispatches, since Ruby generally lets you redefine methods of built-in classes wherever you want. You might be able to perform some static analysis to get rid of some dynamic types, but you have to be careful with machine integers, since they overflow without warning: after every arithmetic operation you'd have to check that it didn't overflow, and switch to a big integer if it did. Any of these methods could raise an exception, and that's a nontrivial problem to deal with. The garbage collector is also running in the background.
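
To make the dispatch and overflow points concrete, here is a small sketch that feeds the same `sum` method a few different inputs. The `Squares` class and the `scaled` array are hypothetical, made up for illustration; the point is only that a compiled version of `sum` would have to cope with every one of these cases.

```ruby
# sum from the comment above, unchanged
def sum(a, len)
  ret = i = 0
  while i < len
    ret += a[i]
    i += 1
  end
  ret
end

# Plain integers: the only case the C version handles.
p sum([1, 2, 3], 3)        # 6

# Floats: the first `+` is Integer#+, the second is Float#+.
p sum([1.5, 2.5], 2)       # 4.0

# Past the signed 64-bit range: Ruby promotes instead of wrapping around.
p sum([2**62, 2**62], 2)   # 9223372036854775808

# A hypothetical user-defined "array": `a` only has to respond to `[]`.
class Squares
  def [](i)
    i * i
  end
end
p sum(Squares.new, 4)      # 0 + 1 + 4 + 9 = 14

# `[]` overridden at runtime on one particular built-in Array.
scaled = [1, 2, 3]
def scaled.[](i)
  super(i) * 10
end
p sum(scaled, 3)           # 60
```

The last case overrides `[]` on a single object; reopening `class Array` would do the same for every array in the program, which is why the dispatch can't simply be compiled away.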

And this is just a simple example, too. Things get a hell of a lot more complicated when you introduce blocks and the closures they create (which I purposefully stayed away from). So that should paint a somewhat clear picture of why it's not just an issue of waiting 10 years until Ruby gets as fast as C. I don't know how close it's even possible to get without messing with the semantics of the language.
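
As a rough illustration of why blocks make things harder, here is a sketch of a block-based version of the same function. The names `sum_with_block` and `make_accumulator` are invented for this example. In the first method the loop itself is a method call (`each`) and the block writes to a local of the enclosing method; in the second, the captured local outlives the call that created it, so it can't simply live in a C stack slot or a register.

```ruby
# Block-based version: iteration is a dispatched call to `each`,
# and the block updates `ret`, a local of the enclosing method.
def sum_with_block(a, len)
  ret = 0
  a.first(len).each { |x| ret += x }
  ret
end

p sum_with_block([1, 2, 3, 4], 3)  # 6

# The captured local `total` survives after make_accumulator returns.
def make_accumulator
  total = 0
  ->(x) { total += x }
end

acc = make_accumulator
acc.call(5)
p acc.call(7)                      # 12
```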

1 comment

Ah, I see what you mean now. Ta.