| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by urbanautomaton 4953 days ago

I see this assertion increasingly frequently, and it puzzles me: that if we're not currently using an object to encapsulate state, we should prefer class methods. What's the justification? Other than four characters saved (".new"), what's the benefit in this approach?

State encapsulation is just one feature of objects. We may not be using it right now, but the only thing you achieve with the above code is removing future flexibility. It costs us nothing to allow for the future possibility of encapsulated state, so why rule it out? Adding complexity when you don't yet need it, fine, I quite understand objecting to that; but here you're actually putting in effort to make a future modification more difficult.

Funnily enough, Code Climate's previous blog entry was on precisely this topic, and is worth a read:

http://blog.codeclimate.com/blog/2012/11/14/why-ruby-class-m...

For what it's worth, I don't see the point in instantiating a whole new spam checker for every piece of content, so I'd probably change the OP's example to read:

    class UserContentSpamChecker

      TRIGGER_KEYWORDS = %w(viagra acne adult loans xrated).to_set

      def is_spam?(content)
        flagged_words(content).present?
      end

      protected

      def flagged_words(content)
        TRIGGER_KEYWORDS & content.split
      end
    end

If I really really wanted access to a default spam checker via a global constant I can always add the following:

    class UserContentSpamChecker
      def self.is_spam?(content)
        new.is_spam?(content)
      end
    end

At least then if my UserContentSpamChecker class ever has to change (perhaps it starts to use an external spam-checking service that's injected through the constructor), then I only need to change code in one place. And other clients that might want to inject a different spam-checking service (or a test double) are perfectly able to.

1 comments

zimbatm 4952 days ago

I think that we agree on the bottom-line: UserContentSpamChecker has no need for a state and is just some kind of namespace. That's the point that I wanted to get trough.

Then we can argue on the best way to handle that namespace. I'm not particularly fond of the class method approach neither but I don't think that your approach is appropriate either. A namespace should be instantiated only once in my opinion, that's why I'm going for the singleton object. Maybe the problem is with ruby and it should provide another mechanism for managing namespaces ?

    UserContentSpamChecker = namespace do
      TRIGGER_KEYWORDS = %w(viagra acne adult loans xrated).to_set

      def is_spam?(content)
        flagged_words(content).present?
      end

      protected

      def flagged_words(content)
        TRIGGER_KEYWORDS & content.split
      end
    end


    class MyOtherClass
      import :spam_checker, UserContentSpamChecker
    
      def foo
        spam_checker.is_spam?(content)
      end
    end

link

vidarh 4952 days ago

Keep in mind that in Ruby, classes and modules are just ordinary objects that happens to be instances of the classes Class and Modules respectively.

So for all practical purposes, if you define a module it is not much different than if you define a class, and then instantiate a single object (it is slightly different in that your object will be an instance of your class rather than of the class Class).

Modules are namespaces for Ruby (and pretty much only differs from classes in that you can't create instances of modules)

What you describe above is done with modules:

    module UserContentSpamChecker
        def is_spam?(content)
            ...
        end
    end

    class MyOther Class
         include UserContentSpamChecker

         def foo
             is_spam?(content)
         end
     end

If you want to be able to alias it, you'd do it with a method:

    module UserContentSpamChecker
        def self.is_spam?(content) # note the "self." to define a method callable on the UserContentSpamChecker object itself (of class Module)
            ...
        end
    end

    class MyOther Class
         def spam_checker; UserContentSpamChecker; end

         def foo
             spam_checker.is_spam?(content)
         end
     end

(or you could do it with a class variable or class instance variable - example class variable:)

     class MyOtherClass
          @@spam_checker = UserContentSpamChecker

          def foo
              @@spam_checker.is_spam?(content)
          end
     end

link

zimbatm 4952 days ago

I find that modules are good to add behaviour to an object like Enumerable but I don't find them practical as a namespace holder.

Including is an all or nothing operation. In your first example #is_spam? is now also a public method of MyOtherClass. The other issue with module includes is that method name collisions are also much harder to debug.

I also like your second example but I think it would be clearer if :spam_checker had a dedicated semantic. Something like:

    class Module
      def import(name, obj); define_method(name) { obj }; protected(name); end
    end

link

urbanautomaton 4952 days ago

> I think that we agree on the bottom-line: UserContentSpamChecker has no need for a state and is just some kind of namespace.

I'm not sure that we do. I agree that the current implementation of `UserContentSpamChecker#is_spam?` doesn't need state, but I don't agree with your conclusion that this distinction should be made obvious to clients, who surely just care that they have a thing that will check for spam, to which they can pass content. They don't care if it's stateful or not, as long as it accepts strings and returns booleans. Why are we trying to tell them that this particular method is stateless?

After all, even that offers a false guarantee. In your implementation, `#is_spam?` is just another method on an object instance - in this case an instance of class Module, referred to by the global constant UserContentSpamChecker. I can happily use instance variables in such a method:

    module Stateless
      extend self
      def no_state_here!
        @thing ||= 0
        @thing += 1
      end
    end
    
    > Stateless.no_state_here!
    => 1
    > Stateless.no_state_here!
    => 2
    > Stateless.no_state_here!
    => 3

Your implementation resists future refactoring, forces clients to create a hard dependency on a global constant, and doesn't make the guarantee you intend it to convey. We agree that the current implementation doesn't need an object instance, but can you explain what is actually better about avoiding one? Ruby is an object-oriented language, after all; objects are its common currency. It's not like we're introducing the Strategy pattern for a four-line method, we're just using Ruby as she is wrote. :-)

p.s. sorry vidarh, I see you've covered some of this already - serves me right for half-composing a reply then wandering off...

link

zimbatm 4952 days ago

It's all about the semantic. As a client I like to know if a method is purely functional or if it's going to have side-effects because I care to know if the operation is going to be re-entrant or not. Also methods with side-effects have performance implications if they use I/O. The semantic should be used to convey that message.

Obviously if my future spam checker is going to be introducing a side effect (eg: Akismet) I am going to refactor my code a no longer user a class method. In any case I don't see how it would "resists future refactoring". Grepping for UserContentSpamChecker is not exactly rocket science.

link

urbanautomaton 4952 days ago

> As a client I like to know if a method is purely functional or if it's going to have side-effects because I care to know if the operation is going to be re-entrant or not.

But your implementation doesn't convey this information. This simply isn't a guarantee you're going to obtain with Ruby; you have to inspect the code. A class method can do pretty much anything it likes; it could rewrite Object#method_missing if it wanted to. It can certainly store state, as I demonstrated. What's worse, it's global state, because every client is accessing a shared instance via a global constant.

As for refactoring, search and replace isn't difficult (although it is error-prone); the important point is that it's introducing unnecessary change. All that's changing is the internal implementation of your spam checker, and yet that change propagates to every single place your spam checker is used, all because of the way you implemented it originally. This doesn't seem like a good trade-off to me, given that we're not even obtaining a functional guarantee in return.

link

zimbatm 4951 days ago

It feels like we're on totally separate universes.

I don't understand why you insist that shooting yourself in the foot is any argument. My goal is to convey a semantic to NOT have to read the code; obviously if we don't follow the same conventions it won't work.

I also don't understand how something being global is bad. A namespace is exactly that: global. It's like you would argue to not associate classes to constants because it makes them global. The issue comes when your class methods have side effects like Time.now or User.find because they make your tests harder to build. But that's not what I'm proposing.

Finally I understand the value proposition that you have regarding refactoring: if you change the implementation and keep the same interface then you don't need to touch the lines where the interface is used. But introducing a side effect rarely comes without parameters and because your lines look like SpamChecker.new(foo).spam? there is no way to introduce the new parameters without either changing the lines in question or using a global. As an example: using Akismet would require an API key. How do you introduce it without touching your interface ?

link

urbanautomaton 4951 days ago

> It feels like we're on totally separate universes.

It does indeed. Never mind, it's happened before and will no doubt happen again. :-)

If all you're proposing is a coding convention, then I don't see why you're surprised/annoyed that the OP doesn't obey it; it's vanishingly rare. You're placing a significant structural constraint on your code that conveys only optional semantics, and ones that will apply to virtually no third-party code. Why not adopt a more rubyish convention and use bangs to denote methods that do have side-effects (c.f. String#gsub vs. String#gsub!)? This way you're not imposing any structural constraints on your code, only a minor naming restriction.

Re: refactoring, constructor params are only one possible change we might make. Our checker might store previously-seen content in an ivar, so that the first time it sees the content "hey sexy, why not call me?" it allows it to pass, but if it sees 1000 such messages in quick succession, it flags them as spam.

However: yes, changes to object instantiation will require changes to client code (although, as I pointed out earlier, it's easy to provide access to a default instance via a class method, providing both convenience and flexibility). In something other than a Rails application this would be mitigated using dependency injection, so that merely being a client of a particular object does not mean you have to know how to instantiate it. Unfortunately as Rails monopolises object construction for its own infernal purposes this isn't really possible, so you end up with unpalatable things like this knocking around all over the place:

    class GuestBookEntry < ActiveRecord::Base
      def spam_checker
        @spam_checker ||= UserContentSpamChecker.new(...)
      end
    end

While not ideal, this at least localises changes to one site in each client class, and one whose concern is directly related to the change you're making.

link