Hacker News new | ask | show | jobs
by wycats 3687 days ago
The Rust version does a type coercion from Ruby VALUE to Rust String. The type coercions are defined generically using Rust traits (see the Helix README) so once somebody defines RubyString -> String once everyone benefits.

In this case, the coercion needs to ask Ruby for the encoding tag and ask Ruby to validate the encoding (which is does often enough that it's often cached) but after that we can safely coerce directly into a UTF8 string.

If we wanted to support other encodings, we could fall back to using Ruby's transcoding support (string.encode("UTF8")) and again, once someone does the work once it'll work for all helix users.

1 comments

I was just pointing out that the C and Rust versions provided weren't quite equivalent.
You are definitely correct, this is definitely a bug.

Helix is setup to do the right thing – it already goes through a coercion protocol, we can easily add the encoding check there. We just missed that detail when porting the code, will fix it soon.

I suppose that echoes my point about how system programming in is hard to get right, there are just too many details you have to remember!

This is why having a shared solution like Helix is beneficial. By moving all the unsafe code into a common library, it's more likely that someone will notice the problem and fix it for everyone.

This actually touches on an interesting point I would like to elaborate on. When we say {Helix/Rust/Ruby} is safe, there is an important caveat – {Helix/Rust/Ruby} themselves could of course have bugs. I have definitely experienced segfaults on Ruby myself.

While true, this caveat is not particularly interesting. It is not a slight of hand. Moving code around doesn't magically remove human errors, that's not the point. It's about establishing clear boundaries for responsibility. (This is why unsafe blocks in Rust is great.)

When you get a segfault on Ruby, you know for certain that your code is not the problem. Sure, you might be something weird, but it is part of the contract that the VM is not supposed to crash no matter what you do. As a result, memory safety is just not a thing you have to constantly worry about when programming in Ruby.

It is the same thing as saying JavaScript code on a website "cannot" crash the browser, segfaults in user-space code "cannot" cause a kernel panic or malicious code "cannot" fry your chip. All of these could of course (and do) happen – but from the programmer's perspective, you can work with the assumption that they are not going to happen (and when they do, it's someone else's fault). It's not "cannot" in the "mathematically proven" sense, but it's just a useful abstraction boundary.