Hacker News
by marcianx 683 days ago
I immediately see the logic in this API. When slicing, I look at indexes as being between elements or at the start (0) or end (length). This gives an in-bounds starting index between 0 and length, inclusive. So if the starting index is in bounds, you get a substring. If it's not, you get no result.
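A minimal sketch of that mental model, using the "abc" string from elsewhere in the thread (the variable names are just for illustration). Start positions 0 through 3 sit on a boundary of the string, while 4 does not:

```ruby
s = "abc"

# Boundaries sit before each character and after the last one:
# 0, 1, 2 and 3 (== s.length) are in bounds; 4 is not.
results = (0..4).map { |start| s.slice(start, 10) }

p results  # => ["abc", "bc", "c", "", nil]
```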

And your answer for Python is not quite correct: "" is falsy in Python, and both of the last two when translated to Python give "null".


What I find weird here is the asymmetry: Ruby apparently allows the end index to be out of range, but not the start index. Contrast with, e.g., Rust's slice syntax, where both endpoints have to be in range, or else it will cause a panic.

> Ruby apparently allows the end index to be out of range, but not the start index.

What gives you that impression? "abc".slice(4, 10) is perfectly valid and accepted, assuming the code above is accurate.

Fine, it's 'accepted' in the basic sense that it doesn't throw an immediate error. But it also doesn't return any useful string, as Python would. So you'd need an extra step to feed its output into anything expecting a non-null string.

The inconsistency here is that when you call "abc".slice(2, 10) and get "c", Ruby has implicitly truncated the range to return whatever characters are available, even though it can't go all the way to 10 because the string isn't long enough. But then when you call "abc".slice(4, 10), it doesn't just give you all available characters from index 4 (which would be an empty string), it gives you null instead.
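For reference, the two calls side by side, plus one possible version of the "extra step" mentioned above. Using `to_s` here is only one option (`|| ""` would also work), and the variable names are illustrative:

```ruby
s = "abc"

in_range   = s.slice(2, 10)       # length is clamped to what is available
past_end   = s.slice(4, 10)       # start lies beyond the end of the string
normalized = s.slice(4, 10).to_s  # nil.to_s is "", so this is always a String

p in_range    # => "c"
p past_end    # => nil
p normalized  # => ""
```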

> The inconsistency here

I don't see the inconsistency. slice on Array works the same way. Where is the inconsistency?
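A quick check of the Array claim, with a three-element array chosen to mirror "abc":

```ruby
a = [1, 2, 3]

# Same pattern as String#slice: a start equal to the length gives an
# empty result, a start past the length gives nil.
array_results = [a.slice(2, 10), a.slice(3, 10), a.slice(4, 10)]

p array_results  # => [[3], [], nil]
```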

> (which would be an empty string)

What other aspect of Ruby would suggest that it is an empty string?

If what you are struggling to say is that different languages are different, then okay. "Japanese is unlike the English I know and therefore is inconsistent" would be a rather bizarre take, though.

> Where is the inconsistency?

Between what happens when the start index is greater than the length of the input, and what happens when the end index is greater than the length of the input. If the end index is greater than the length of the input, it returns a string (as long as the start index is not greater than the length of the input). But if the start index is greater than the length of the input, it does not return a string: it returns null, which is not a string.

My suggestion is that the behavior would have made more sense if it either returned a string in both cases (i.e., if it returned a string even if the start index is greater than the length of the input), or returned null in both cases (i.e., if it returned null whenever the end index is greater than the length of the input).
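A rough sketch of the two alternatives, written as hypothetical helpers. Neither `slice_or_empty` nor `strict_slice` exists in Ruby; they are made up here for illustration, and `strict_slice` assumes non-negative arguments even though the real `slice` accepts a negative start:

```ruby
# Hypothetical: always return a String, even when start is past the end.
def slice_or_empty(str, start, length)
  str.slice(start, length) || ""
end

# Hypothetical: return nil whenever any part of the requested range
# falls outside the string (negative arguments are treated as out of range).
def strict_slice(str, start, length)
  return nil if start < 0 || length < 0 || start + length > str.length
  str.slice(start, length)
end

p slice_or_empty("abc", 2, 10)  # => "c"
p slice_or_empty("abc", 4, 10)  # => ""
p strict_slice("abc", 1, 2)     # => "bc"
p strict_slice("abc", 2, 10)    # => nil
p strict_slice("abc", 4, 10)    # => nil
```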

> Between what happens when the start index is greater than the length of the input, and what happens when the end index is greater than the length of the input.

Again, what makes that an inconsistency and not just a different language?

> My suggestion is that the behavior would have made more sense

On the basis of the start and end indices being equivalent. But are they? What attributes of the language should see us consider them to be?

Underneath the hood that's a C string, and the four points to the null terminator, so it's indexable. And that's why you get an empty string if you point exactly one past the end.

That's why if you put five or more in for the first index it fails to produce a result entirely. I think I might have preferred an exception or a failure code being returned, but I can't say the current design is truly awful.

> That's why if you put five or more in for the first index it fails to produce a result entirely.

Where does this come from? Are these discrepancies stemming from different Ruby implementations/versions behaving differently? "abc".slice(5, 10) returns the same value as "abc".slice(4, 10) [which, curiously, does not return the same value as the original comment] under MRI 2.6.1 that I had handy.

I believe I got it from the book Ruby Under a Microscope, which looked at what is now a really old version of Ruby. If they changed it to make the API more consistent, that's probably good.

I ask for a range whose start is in bounds but end is out of bounds.

I ask for a range whose start is in bounds but whose end is out of bounds.

Why should those return two entirely different types?

Apologies, you're correct. I didn't consider the ramifications of printing to the screen with an or. I was trying to get around the lack of distinction between `puts nil` and `puts ""` in Ruby.
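To illustrate the printing problem, assuming a modern Ruby: `puts` prints a blank line for both values, while `inspect` (which is what `p` uses) keeps them distinct. The `values` and `inspected` names are only for this example:

```ruby
values = [nil, ""]

# puts writes an empty line for each of these, so they look identical.
values.each { |v| puts v }

# inspect keeps the two apart without needing an `or` fallback.
inspected = values.map(&:inspect)
p inspected  # => ["nil", "\"\""]
```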

And yes, I also understand the logic of the API; but I'm used to using slice to protect against random NPEs and out-of-bounds exceptions, and to being able to trust it as a general pattern.