Hacker News
by wdewind 3898 days ago
Cool article about the craziness of Ruby. Ruby is a frustrating language. The oauth gem, for instance, redefines '==(val)' on the AccessToken to 'Base64.encode(self.signature) == Base64.encode(val).'

This stuff feels really dangerous and unnecessary. I spend a lot of time on code reviews pointing out bad features of Ruby (and Rails) that we shouldn't be using because they break application flow and make it significantly harder to reason about the code for the small benefit of decreasing a few lines. But it's certainly fun to talk about :)

3 comments

Ruby is a wonderful (and wonderfully powerful) language. Unfortunately, some of the popular ruby libraries (gems) are... problematic. To put it nicely. The web/rails gems in particular can be nasty minefields.

However, the language itself is great. The trick is to remember the usual advice that just because you can doesn't mean you should. Too many gems add "clever" metaprogramming (such as the def tricks in this article), when it isn't actually making the program simpler.

The beauty of Ruby is that it can act almost like a Lisp when you want powerful metaprogramming features, while also allowing simple shell- or C-style imperative code when that is more appropriate.

(Of course, some people fear that type of freedom... https://vimeo.com/17420638 )

In its place, being able to redefine equality on value types really helps clarify code. Misused, it creates confusion. The problem is that it's easy to think you've got a case where it helps, when actually it's not well-defined. Usually that revolves around there being state you care about which is missed from the equality comparison.

Another trap is redefining #== without also looking at #eql?, which means Hash doesn't behave like you expect. It's just another bit of mental trivia you've got to Just Know...
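
A minimal sketch of that trap, using a hypothetical `Point` class (not taken from any gem). Hash lookup goes through `#hash` and `#eql?`, so overriding `#==` alone leaves it comparing by identity:

```ruby
# Point overrides only ==; Hash lookup ignores it.
class Point
  attr_reader :x, :y

  def initialize(x, y)
    @x = x
    @y = y
  end

  def ==(other)
    other.is_a?(Point) && x == other.x && y == other.y
  end
end

# HashablePoint also supplies eql? and hash, which Hash actually uses.
class HashablePoint < Point
  alias_method :eql?, :==

  def hash
    [x, y].hash
  end
end

plain = { Point.new(1, 2) => "found" }
plain_lookup = plain[Point.new(1, 2)]          # nil, even though the keys are ==

keyed = { HashablePoint.new(1, 2) => "found" }
keyed_lookup = keyed[HashablePoint.new(1, 2)]  # "found"
```

The rule of thumb is that objects which are `eql?` must also return the same `hash`, so the two overrides go together.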

Can you give me an example of where redefining equality makes sense?
Many of the standard library classes are another example, even simpler than Money.

For example BigDecimal. With operator overloading, I can easily compare a BigDecimal object to an integer or float, the same way I can already compare them to each other:

    require "bigdecimal"

    BigDecimal("10.0") == 10.0   # BigDecimal.new was removed in newer Ruby versions
    10 == 10.0
Without operator overloading, this would become needlessly messy (and require explicit handling of nils):

    d = BigDecimal("10.0")
    !d.nil? && d.value_equal_to(10.0)   # value_equal_to is a made-up method name
Ruby has always been about readability of the code, and avoiding unnecessary repetition, and I think operator overloading (when used correctly) is a great example of this.
Anytime you have some value type. Say a `Money` class for instance.

The default `==` from Object compares identity, so unless you define it `Money.new(20, 'USD') != Money.new(20, 'USD')`.
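
To make that concrete, here is a rough sketch with a hypothetical `Money` class (the attribute names are assumptions, not the API of any real money gem). It shows the inherited identity comparison first, then a value-based override:

```ruby
class Money
  attr_reader :amount, :currency

  def initialize(amount, currency)
    @amount = amount
    @currency = currency
  end
end

a = Money.new(20, "USD")
b = Money.new(20, "USD")
identity_result = (a == b)   # false: Object#== only checks identity

# Reopen the class to compare by value instead.
class Money
  def ==(other)
    other.is_a?(Money) && amount == other.amount && currency == other.currency
  end
end

value_result = (a == b)                           # true
different_currency = (a == Money.new(20, "EUR"))  # false
```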

Right, this seems like exactly why we shouldn't be allowed to redefine it. As a reader of your code that redefines ==, I think == means we are talking identity until I find your function that redefines ==. It has made it so I need to understand more things in order to be able to reason about your code. That seems like a negative to me.
But even in core Ruby, == is not always identity; hashes, arrays, ranges, all of them are compared by value and not by identity. The assumption that == is comparing identity is broken, not the code that implements it differently.
For "primitives" yes, for objects no. What would be primitives in Ruby extend Object, because everything in ruby does, but (sigh) they redefine == so they act more like primitives in other languages. At least there is a clear cut rule, but it's pretty much turtles all the way down.
That's because you are among the people who don't like abstractions. I disagree with that opinion, but that's fine, Ruby is just definitely not for you.

Don't try to change or complain about Ruby; you would likely be happier with a language like Go.

> That's because you are among the people who don't like abstractions.

That's not the issue at all. Abstractions are great. Redefining operators is not abstraction, it's practically obfuscation.

If somebody wants object identity rather than semantic equality, they should be using `equal?`. The fact that different types have different equality semantics is just kind of inherent in the idea of a type.
Should Money.new(20, 'USD') == Money.new(2000, 'US Cents')?

Or Money.new(20, 'USD') == Money.new(125, 'CNY') when ExchangeRateManager.getExchangeRate('CNY', 'USD') == 0.16?

My point is that when performing these comparisons, it may be useful to use a more descriptive function like:

boolean currenciesHaveSameWorth(Money m1, Money m2)

And then a reader of the calling code might not have to look into the implementation to understand what the function is doing, whereas you definitely would when using ==, because == now means "whatever it's overridden to mean".

> Should Money.new(20, 'USD') == Money.new(2000, 'US Cents')?

Like someone else pointed out in this thread, designing APIs requires consistency and good taste.

If I had to implement this API, your code above would raise `ArgumentError: unknown currency "US Cents"`.

> Or Money.new(20, 'USD') == Money.new(125, 'CNY') when ExchangeRateManager.getExchangeRate('CNY', 'USD') == 0.16?

Again, if I were designing this API, they wouldn't be equal. Why? For the same reason `1 != "1"`: if you cast them, yes, they are equal, but implicit casting (aka weak typing) is not idiomatic in Ruby. It's possible, but very rare.

> boolean currenciesHaveSameWorth(Money m1, Money m2)

At this point you might as well do `m1 == m2.convert_to(m1.currency)`, because "HaveSameWorth" might mean many different things too.
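
A sketch of what that could look like. Everything here is hypothetical: the `RATES` table, the `Struct`-based `Money`, and `convert_to` are illustrative stand-ins, and the rate just mirrors the 0.16 figure quoted above. Amounts are `Rational` to keep the arithmetic exact:

```ruby
# Hypothetical exchange-rate table keyed by [from, to].
RATES = { %w[CNY USD] => Rational(16, 100) }.freeze

# Struct generates a member-wise ==, so no explicit override is needed.
Money = Struct.new(:amount, :currency) do
  def convert_to(target)
    return self if currency == target

    rate = RATES.fetch([currency, target]) # raises KeyError for unknown pairs
    Money.new(amount * rate, target)
  end
end

usd = Money.new(Rational(20), "USD")
cny = Money.new(Rational(125), "CNY")

strict = (usd == cny)                              # false: currencies differ
converted = (usd == cny.convert_to(usd.currency))  # true: 125 * 0.16 == 20
```

The conversion stays visible at the call site, so `==` never has to consult exchange rates behind the reader's back.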

> At this point you might as well do `m1 == m2.convert_to(m1.currency)`, because "HaveSameWorth" might mean many different things too.

I personally hate that last style because it's obvious that the "HaveSameWorth" relation is intended to be symmetric, and by writing it like m1 == m2.convert(...) you're preferring one side over the other. It looks bad to me :).

Also, in the case of real-life objects it makes sense to spell out what you mean by 'equality' (or 'equivalency'), and leave the default implementation to represent the philosophical concepts of "the same" and "equivalent to".

Without explicit comments/documentation it is hard to imagine a good example. It's a longstanding issue. Even Lisp has 'equals' and 'equals?' which is an abomination. One looks for identity of object; the other for identity of value (if I understand it right). These kinds of things are bug factories.
Kent Pitman has always been a good read on the problems of equality in dynamic languages. The typical Ruby or JS programmer stumbles through the day just "getting by" when it comes to comparing objects.

http://www.nhplace.com/kent/PS/EQUAL.html

I think Python makes the difference clear, with == vs "is".
Ruby provides object_id for the same purpose. Comparing those for equality provides the "is" semantics.
Lisp does not have 'equals' and 'equals?'.

Lisp has 'eq', which is for object identity. Lisp has 'equal' which checks for structural equality.

and 'equals?'
My personal favorite: a safe, constant time equality method for cryptographic stuff.
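
Something along these lines, as a rough sketch only. `SecretToken` and `constant_time_equal?` are made-up names, and for real code a vetted library routine (for example the `secure_compare` helpers in Rack or Active Support) is the safer choice:

```ruby
# Hypothetical wrapper whose == avoids short-circuiting on the first
# mismatched byte, so comparison time does not reveal how much matched.
class SecretToken
  def initialize(value)
    @value = value.to_s
  end

  def ==(other)
    other.is_a?(SecretToken) && constant_time_equal?(@value, other.value)
  end

  attr_reader :value
  protected :value

  private

  def constant_time_equal?(a, b)
    # Note: the early return below still leaks whether the lengths match.
    return false unless a.bytesize == b.bytesize

    diff = 0
    a.bytes.zip(b.bytes) { |x, y| diff |= x ^ y } # accumulate all differences
    diff.zero?
  end
end
```

Callers just write `token == expected`, and the careful comparison is applied without them having to remember to ask for it.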
Any value type.
> The oauth gem, for instance, redefines '==(val)' on the AccessToken to 'Base64.encode(self.signature) == Base64.encode(val).'

Is this just a complaint about the ability to overload operators in general?

It's a complaint about the ability to redefine many things that shouldn't be redefine-able, operators included.
Very right. Operator overloading is a self-indulgent trick. The only one to benefit is the author. Subsequent readers are mostly confused.

Instead of overloading, I've always wanted to define new operators e.g. <DotProduct> used as 'int x = v1 <DotProduct> v2;'

If you're going to do something with operators, at least let me make descriptive ones.

Ehh, there are good and bad uses. Yes, dot product should not be overloaded asterisk operator because elementwise multiplication of vectors is a thing, and also if you overload multiplication for matrices then dot product should be u-transpose-times-v for consistency. But what about vector addition? There's no ambiguity. Try writing any 3d graphics code without overloaded vector addition. It sucks.
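
For illustration, a small sketch with a hypothetical `Vec3` type: addition gets the operator because it has only one sensible meaning, while the dot product stays a named method:

```ruby
Vec3 = Struct.new(:x, :y, :z) do
  # Component-wise addition: unambiguous, so + is a reasonable fit.
  def +(other)
    Vec3.new(x + other.x, y + other.y, z + other.z)
  end

  # Kept as a named method rather than overloading *.
  def dot(other)
    x * other.x + y * other.y + z * other.z
  end
end

a = Vec3.new(1, 2, 3)
b = Vec3.new(4, 5, 6)

sum = a + b          # Vec3 with x=5, y=7, z=9
product = a.dot(b)   # 32
```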

I generally disagree with people who dislike a language feature only because of its abuse potential. Good programmers should not have to suffer for the sake of damage controlling bad programmers.

Edit: I do think it's stupid to make the identity-equals operator overloadable. Identity-equals and value-equals are separate concepts. In C++ this isn't an issue because == is the value-equals operator.

> Edit: I do think it's stupid to make the identity-equals operator overloadable. Identity-equals and value-equals are separate concepts. In C++ this isn't an issue because == is the value-equals operator.

In Ruby, == is value-equals as well (identity-equality being such a rarely needed concept in Ruby that it wouldn't make sense to privilege it with its own operator).

Identity equality in Ruby is provided by:

    a.object_id == b.object_id
For identity equality, there's also `Object#equal?`.

http://ruby-doc.org/core-2.2.3/Object.html#method-i-eql-3F
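
A quick sketch of how these differ on core types (variable names are arbitrary):

```ruby
a = String.new("ruby")
b = String.new("ruby")  # same contents, separate object
c = a                   # same object as a

value_equal    = (a == b)                      # true: same contents
identity_equal = a.equal?(b)                   # false: different objects
same_ids       = (a.object_id == b.object_id)  # false, agrees with equal?
aliased        = a.equal?(c)                   # true: c is the same object

# eql? is stricter than == for numerics: it does not convert types.
loose  = (1 == 1.0)     # true
strict = 1.eql?(1.0)    # false
```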

I disagree, operator overloading often makes the code much more readable. I don't see the problem unless you're trying to write C in Ruby. As long as the overloaded implementation keeps the expected semantics as described in the language documentation, it's fine.
Most problems people have with operator overloading seem to be caused by the same issue that makes them whine about "debugging metaprogramming" (I referenced that in another comment here). Namely, instead of trying to understand the code and the model behind it, they try to bring over their own assumptions about how the code should work.

That 5-component object compared by ==? How does it work? Sit down, read the code and find out. The answer depends on what exactly the object represents and what makes sense in the domain model.

That's exactly the issue. When it's a small program, that works fine. A large one? It's a matter of time and effort. Anything can be looked into, with time. Which is money. And effort over time ~mistakes.

It's too simple to do victim-blaming here. You don't understand my code? Well, just read it all so you know how clever I was.

If you want to write code that can be easily assimilated, which most readers would think they understood from its source and not by reading it at some meta-level, then you have to code with one hand behind your back.

It's not ever 'fine'. It's confusing at best. Imagine a component with 5 attributes. Does '==' match them all? Some of them? Loosely or tightly? I'm afraid just seeing '==' in the code is never going to be informative.

Instead, maybe a method MatchAttr1And2(v1, v2) would certainly tell a subsequent reader a little more about what's going on.

> Imagine a component with 5 attributes. Does '==' match them all? Some of them? Loosely or tightly? I'm afraid just seeing '==' in the code is never going to be informative.

This is, fundamentally, a disagreement about the value of encapsulation. With an opaque, encapsulated type, '==' should mean whatever makes the most sense in the context of that type. For a pointer, "equality" might mean "same memory address", whereas for a vector it might mean "equal components". As a user, "equality" should match an intuitive understanding of what it means for two things of this type to "be the same". It's an art form. Like many things in programming, doing it well requires good taste.

Operator overloading is a powerful technique for preserving encapsulation. It's the polar opposite of:

> Instead, maybe a method MatchAttr1And2(v1, v2) would certainly tell a subsequent reader a little more about what's going on.

This leaks implementation details like a sieve. It's a great recipe for encouraging dependency on a particular implementation detail across module boundaries, and rolling yourself a great heaping ball of mud.

If the code you're dealing with makes you care what the equality operator is doing internally, it's not well enough abstracted.
Sure, for some cases, == is not immediately clear. Don't use it in those cases.
> Operator overloading is a self-indulgent trick. The only one to benefit is the author.

That's a great way to put it.