For anyone curious, we've been working to reduce the memory overhead and have added some stats to keep track of memory usage over time. On this graph, you can see a comparison with the CRuby interpreter:
Yes. Before that point we allocated a large chunk of executable memory upfront. We switched to mapping that memory on demand, and that alone was a huge improvement.
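For readers who haven't seen the pattern, here is a rough sketch of "reserve first, map on demand" on a POSIX system. It is not the actual YJIT code: the `code_region` struct and the `region_reserve` / `region_commit` helpers are hypothetical names made up for illustration. The pages are made readable and writable rather than executable so the sketch runs anywhere; a real JIT would also need to make the pages executable before running the generated code.

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping for one code region (not YJIT's real struct). */
typedef struct {
    uint8_t *base;     /* start of the reserved address range */
    size_t reserved;   /* bytes of address space reserved */
    size_t committed;  /* bytes made accessible so far */
} code_region;

/* Reserve address space only. PROT_NONE pages cannot be touched yet. */
static int region_reserve(code_region *r, size_t bytes) {
    void *p = mmap(NULL, bytes, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return -1;
    r->base = p;
    r->reserved = bytes;
    r->committed = 0;
    return 0;
}

/* Make the region usable up to `needed` bytes, rounded up to a page. */
static int region_commit(code_region *r, size_t needed) {
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t target = (needed + page - 1) / page * page;
    if (target > r->reserved) return -1;    /* past the reservation */
    if (target <= r->committed) return 0;   /* already accessible */
    assert(r->committed % page == 0);
    /* Only the newly needed pages change protection. */
    if (mprotect(r->base + r->committed, target - r->committed,
                 PROT_READ | PROT_WRITE) != 0) return -1;
    r->committed = target;
    return 0;
}
```

The idea is that `region_reserve` is called once at startup and `region_commit` is called whenever the code generator is about to write past the currently accessible range, so the accessible part of the region grows with the amount of code actually generated.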
Edit: I guess this PR (https://github.com/ruby/ruby/pull/5944) was merged.