There's another code smell that every programming whiz kid produces at some point: over-engineered.
All code has a cost, primitives have a lower base cost because they are universal in the language and thus every programmer will automatically know how to use them. Before introducing a custom object with yet another API to be learned, you need justification.
> Implement Money class if you need to deal with money, it’s much better than using floats all over the place. Don’t use Hash for configuration objects, use a custom Configuration class instead.
Let's pretend he said integer since floats for money is outright broken. Yes, Money is a clearcut case where having a money-specific API is both intuitive and immediately useful.
Configuration on the other hand is debatable. If you're just doing things a hash does then it's fine. If you find yourself with lots of helper methods or repeating the same transformations over and over then that's when you have a code smell that may warrant lifting configuration to its own level.
But the critical thing is to remember that there's a non-trivial cost to creating domain-specific objects.
Now that is in interesting proposition: Let's pretend he said integer since floats for money is outright broken. On my 32 bit machine, enquire.c tells me that long integers have 32 bits of unsigned value. It also says that doubles have 53 bits of precision. So if you decide to use floats for money, you get bigger numbers than if you use long.
Oh, and might I suggest that you consider doing it in pennies rather than fractional dollars? It will work out better.
Better to understand the underlying territory, I think. That would not be over-engineering.
The justification is simple, to allow for testing concerns the code uses. I see domain specific objects the same way I see domain specific languages - if I cannot grasp it by looking at it then it's broken, meaning I should not have to read documentation but simply understand the domain the object / language represents e.g sinatra get()... etc.
You state that as if it's a universal justification, but the same standard applies. If your code enough is simple enough to use primitives, then you're unit tests are testing the next higher level of abstraction. If the complexity rises to the level of justifying it's own class, then you should probably have unit test coverage on that.
Also, regarding DSLs, there is a tradeoff there that's worth commenting as well. A DSL pays the most dividends when the domain is understood (such as HTTP in the case of Sinatra), but if it's custom business logic than there may not be enough common business understanding for a DSL to be intuitive, and in that case it's just another layer of indirection to follow through as you inspect the code to figure out what it's actually doing. Half-baked DSLs are harmful IMO.
> Let's pretend he said integer since floats for money is outright broken.
This reminds me of a story I read on Reddit that someone told about his dog. The first time his dog saw a horse, the dog was excited, and ran up to sniff the horse through the fence.
It was an electric fence, and the dog touched it with his nose. The dog found this extremely unpleasant.
Now the dog is deathly afraid of horses, and runs and hides whenever he sees a horse.
On some 32-bit machines, a double precision float often exactly represent a larger range of integers than the integer types can, which makes it easier to avoid overflow problems you can get if you work in integers.
Of course, one might argue that the programmer using integers should be aware of this, and code accordingly, or else he shouldn't be writing code dealing with money (or any other thing were failure can have serious consequences).
However, the same applies to floating point. Floating point is not magic. A programmer who understands it can safely use it.
Many programmers seem to get burned early in their careers by using floating point without understanding how it differs from real numbers. Then, like the dog concluding that the horse, rather than the fence, was the problem, these programmers conclude that it is not safe to use floating point instead of properly concluding that they should not use tools they don't understand.
I think his point is that the OP got burned using floats for money. He thought the lesson was to use a custom class for money every time, but the actual lesson could have just been to use integers instead.
If your system is slow, you lose money. If your trading system says you earned $1,523,374.54, but your accounting system says you only earned $1,523,374.26, you don't care much.
I'm genuinely curious here: why would using integers be slower than using floats? I thought floating point operations were always more expensive than handling ints.
This is exactly the opposite of the advice recently given by Rich Hickey in his keynote at RailsConf 2012, where he strongly recommends using simple, transparent data structures without a lot of OO wrapping baggage.
Personally I'm on the fence but when you look at how much boilerplate this Virtus example needs to get rid of the code "smell" you have to wonder if maybe he has a point.
I think it just comes down to differences between the OO approach vs the functional approach.
In OO code I try to create types to represent every concept in my code. It seems overkill at first but I have found in most cases it pays off with reduced duplication. It almost always removes the question of where to put static utilities. Lets say you have a postcode - by creating a Postcode type you can put postcode validation, postcode formatting, etc, directly on that type rather than floating in different parts of your codebase.
Additionally, in a statically typed language it adds type safety - so for example you can't pass a postcode into an address parameter.
Do not hastily judge sticking to primitives as bad. Here's a relevant SICP quote:
"The simple structure and natural applicability of lists are reflected in functions that are amazingly nonidiosyncratic. In Pascal the plethora of declarable data structures induces a specialization within functions that inhibits and penalizes casual cooperation. It is better to have 100 functions operate on one data structure than to have 10 functions operate on 10 data structures."
The potential danger in computationally intensive code is that you have a number of wrappers around the actual data (e.g. in the article AttributeSet wraps a hash table), each requiring a pointer dereference and a vtable lookup.
So, IMO it depends on the language features available and the scenario whether it is a code smell. For instance, Haskell allows you to introduce a new type whose data representation is the same as the original type using newtype [1]. In contrast to a type alias, newtype makes a different type, but it still allows you to define functions on that type in terms of the original type (using the type's constructor). If you want to hide the underlying representation, you simply do not export the constructors from the enclosing module.
tl;dr: in Haskell you do get a better abstraction without the overhead. I am pretty sure many other language can do as well (private inheritance in C++?).
It's an interesting article, but I'm not really sure this is the best approach. In terms of having the 'best' code that is 100% OO then sure, but in terms of actually having usable, understandable, code I'm not sure I agree.
If you return a Hash of attributes everyone knows how a Hash works, but if you return a custom Attributes object people have to learn how it works, there could be bugs, etc.
We needed a set-like data structure with enumerable interface that would also know how to handle attributes from parent classes. That's why a custom object worked better.
oh and since it's an enumerable it's easy to work with it just like with a hash.
I'm also not saying you should never ever use primitive objects because that would be silly :) The trick is to determine when it is better to use a custom class instead of misusing existing primitive ones. Hash is a good example in Ruby, people use it way too often.
I understand about making code more clear, but I'm not sure adding complexity/boilerplate solves that in every case- the even more extreme case would be to do away with the class entirely and just have a hash type property that gets passed around.... his very verbose counter-argument is the other extreme, and while I see his point, I tend to want to start with the property and only move towards the "rich object" when I'm sure that it's the best solution....
I think people miss the fact that I wrote about a code smell, which means a potential problem in your code. I don't say you shouldn't be using primitives. I say that there are cases where you should use rich, custom objects, that will simplify your code and improve design.
absolutely there are those cases, maybe it was just an over simplified example.
Although I also wouldn't say that it simplifies your code... What it does is make the api to that class/object more explicit. At the very least you are keeping the complexity level the same, but you might also be making it more complicated. If you're just talking about the idea of OO encapsulation in general, that's another thing.... Not that I disagree that an explicit api is a good thing.
All code has a cost, primitives have a lower base cost because they are universal in the language and thus every programmer will automatically know how to use them. Before introducing a custom object with yet another API to be learned, you need justification.
> Implement Money class if you need to deal with money, it’s much better than using floats all over the place. Don’t use Hash for configuration objects, use a custom Configuration class instead.
Let's pretend he said integer since floats for money is outright broken. Yes, Money is a clearcut case where having a money-specific API is both intuitive and immediately useful.
Configuration on the other hand is debatable. If you're just doing things a hash does then it's fine. If you find yourself with lots of helper methods or repeating the same transformations over and over then that's when you have a code smell that may warrant lifting configuration to its own level.
But the critical thing is to remember that there's a non-trivial cost to creating domain-specific objects.