Hacker News
by bendbro 2555 days ago
This doesn't make sense to me: "Right. Now consider the area function. Its going to have a switch statement in it, isn’t it?"

Perhaps I am nitpicking, or perhaps I am reading this wrong, but I would not design the square data structure to have a perimeter function. The square data structure should just expose the data that describes a square (length, width). Adding higher abstractions (perimeter, etc.) on top of the data structure only serves to create the trumped-up problem later described in the dialog. The perimeter method should be defined in the Square class, where perhaps a "StraightLinesOnlyPolygonMixin" could define the perimeter method.

In general, I cannot see why a Data Structure would define computational methods. You are tightly coupling logic to the underlying data source, which is wrong when that logic obviously could apply to any underlying data source (I don't care if my square is backed by RDB, S3, a hardcoded instance, etc.). The perimeter method, and probably the Square class, should be the same regardless of what backs the data.

2 comments

You are getting it wrong. Data structures do not own functions; instead, they are passed to functions. So you have shapes, and somewhere else you have a perimeter function with a switch statement that picks the calculation algorithm based on the type of the structure.
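A minimal sketch of that layout, in Python. The names (`Square`, `Circle`, `perimeter`) and the single `side` field are assumptions for illustration, not taken from the article: the shapes carry only data, and a free function switches on the type.

```python
import math
from dataclasses import dataclass


@dataclass
class Square:
    side: float  # plain data, no behavior


@dataclass
class Circle:
    radius: float  # plain data, no behavior


def perimeter(shape):
    # The "switch": choose the algorithm from the runtime type of the data.
    if isinstance(shape, Square):
        return 4 * shape.side
    if isinstance(shape, Circle):
        return 2 * math.pi * shape.radius
    raise TypeError(f"unsupported shape: {type(shape).__name__}")
```

An `area` function written in this style would need the same two branches, which is the point the dialog is making.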
Ah, and what owns these functions?

And more pressingly, why would you ever pass a data structure to a function that had more than one algorithm to compute a result? The data structure (or perhaps some intermediary (an adapter?)) should own the algorithm within a function that computes only on that data structure. This ensures that all methods associated with your data structure are obviously and explicitly associated (in a single file, class, whatever). The alternative, as outlined in the dialog, is to spread a bunch of switches all around your code. Given these two possibilities, why would one choose to place switches in disparate places throughout your code?
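For comparison, a rough sketch of the alternative argued for here, where each type owns its algorithms in one place. The class and method names are hypothetical, and this is only one way to arrange it.

```python
import math
from dataclasses import dataclass


@dataclass
class Square:
    side: float

    # Everything that computes on a Square lives next to its data.
    def perimeter(self):
        return 4 * self.side

    def area(self):
        return self.side ** 2


@dataclass
class Circle:
    radius: float

    # Same for Circle: no type switch is needed anywhere.
    def perimeter(self):
        return 2 * math.pi * self.radius

    def area(self):
        return math.pi * self.radius ** 2


# Callers rely on the shared method names instead of inspecting the type.
shapes = [Square(2.0), Circle(1.0)]
total_area = sum(shape.area() for shape in shapes)
```

Here a new shape is a single new class, but a new operation means touching every existing class.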

"Given these two possibilities, why would one choose to place switches in disparate places throughout your code?"

You are touching on something called the "expression problem". You might be interested in reading some commentary about it. There is an inherent decision to be made any time you have many algorithms each operating on many data types. In most programming models, you have to choose between grouping your code based on your data types (but then any new algorithm needs to be implemented on each data type) or based on your algorithms (but then any new data type needs to be supported by a new case in each algorithm). Neither is the "right answer". You fundamentally have a two-dimensional system here, and you need to decide which axis you are going to prioritise in your design.
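One way to picture the two axes is to write the grid out explicitly. This is only an illustrative sketch: `GRID`, `missing_cells`, and the extra `Rectangle` type are made-up names, and real code would rarely be organised as a literal table.

```python
import math
from dataclasses import dataclass


@dataclass
class Square:
    side: float


@dataclass
class Circle:
    radius: float


@dataclass
class Rectangle:
    width: float
    height: float


# One row per algorithm, one column per data type.
GRID = {
    "perimeter": {
        Square: lambda s: 4 * s.side,
        Circle: lambda c: 2 * math.pi * c.radius,
    },
    "area": {
        Square: lambda s: s.side ** 2,
        Circle: lambda c: math.pi * c.radius ** 2,
    },
}


def missing_cells(grid, types):
    """Return (operation, type name) pairs that have no implementation yet."""
    return [
        (op, t.__name__)
        for op, impls in grid.items()
        for t in types
        if t not in impls
    ]


# Introducing Rectangle leaves a hole in every existing row.
print(missing_cells(GRID, [Square, Circle, Rectangle]))
# prints [('perimeter', 'Rectangle'), ('area', 'Rectangle')]
```

Grouping by algorithm keeps each row together, so a new type means editing every row. Grouping by type keeps each column together, so a new algorithm means editing every column. The number of cells to fill is the same either way.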

Thanks. I will look into this. I would like to see examples that obviously require algorithm-grouping or type-grouping, as in my experience, algorithm-grouping has always led to headaches.

I think I have encountered this issue in the past but was turned off by the lack of formality in the discussion. I wish there were more academic, concrete discussion of these issues, because I feel that what I am doing now (informal discussion) likely has many holes.

Did you ever write a program in C? In procedural languages, functions are either global or, sometimes, encapsulated in namespaces or modules. Data structures come from that world and do not encapsulate any behavior. In object-oriented languages like Java, data structures simply do not exist and are usually emulated via anemic models.
Ah, thanks, that makes perfect sense.
Nitpicking too: perimeter != area.
The full quote: "Right. Now consider the area function. Its going to have a switch statement in it, isn’t it? Um. Sure, for the two different cases. One for Square and the other for Circle. And the perimeter function will need a similar switch statement"

I should have stuck with the same metaphor, but regardless of whether it is perimeter or area, the output is a function of the description.