A Case for Properties: Programming Language Design (blog.willbanders.dev)
39 points by WillBAnders 2047 days ago
10 comments

> More modern languages, like C# and Kotlin, use properties for this instead which can do that automatically.

It's not about being "modern" but about differing opinions on language design. The subject of adding properties to Java has come up numerous times over the past twenty years, and even though we're adding quite a few new features -- most recently, algebraic data types -- properties are not added because we just don't want them. Their "bang for the buck" is low, their potential for abuse is high, and adding them to an established language is even less appealing, as it creates a clear old/new code divide. Most of all, it's unclear whether the style of programming that relies on mutating individual fields is something that should be encouraged in the first place.

Another controversial feature that C# and Kotlin have but we don't want to add to Java is extension methods. It's perfectly reasonable for different languages to reach different conclusions regarding certain features, with some biased more toward happily adding features while others are more conservative, and it's perfectly reasonable for different programmers to have different preferences. The important thing is that features are carefully considered by language designers before being added, and that the language's "feel", philosophy and context is taken into account.

> Their "bang for the buck" is low, their potential for abuse is high, and adding them to an established language is even less appealing, as it creates a clear old/new code divide.

I'm not a Java programmer, but I agree wholeheartedly WRT adding a feature like this to an existing language with wide use. There's a world of difference between adding a feature that is obviously different from what came before and looks it, and adding one that changes functionality but looks identical to prior usage. If people learn a language and during that it's made clear that assignments can have arbitrary side effects, they'll expect that. If they learned the language when that wasn't possible, introducing it later means a lot of programmers will have ways of thinking about how control flow works, and about what happens when you make assignments, that are subtly (or greatly) wrong, but only sometimes.

As a programmer, what that can feel like is that the language is actively sabotaging your understanding of it. It's not fun.

I like your way of politely explaining „thanks, but no, thanks“ and totally agree with it. Properties are usually useful in cases where a small immutable record would also be fine. With a general-purpose property interface on a complex business model, it is too easy to make mistakes, even for an experienced engineer. Validation or computation in a constructor or business method would be more natural and flexible.
I rather like extension methods. I was not aware there was any controversy around them.

Edit

This wikipedia article suggests Java has extension methods, or maybe something functionally equivalent (I'm not a Java programmer): https://en.m.wikipedia.org/wiki/Extension_method

> I was not aware there was any controversy around them.

Not controversy so much as that they're not close enough to being universally liked.

> This wikipedia article suggests Java has extension methods, or maybe something functionally equivalent

That uses a compiler plugin that changes the language quite considerably.

Good article. A nice tour of properties and their pros and cons.

Personally I don't like them, for the reason labelled in the article as Problem 2: Arbitrary Computation. Computation should look like computation. I like it to be perfectly clear which statements can throw, and which cannot. I prefer List.getSize() over List.size precisely because the implementation should be permitted to perform computation, and it looks wrong when the code implies otherwise. I like to know that after executing Universe myUniverse; myUniverse.answer = 42; the value of myUniverse.answer really is 42. (Edit: ignoring the slim possibility of overflowing the integer type of myUniverse.answer.)
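To make the concern concrete, here is a minimal Python sketch (Python being one of the languages with properties). The `Universe` class, its clamping rule, and the limit of 10 are all made up for illustration; nothing here comes from the article.

```python
class Universe:
    """Hypothetical class: `answer` looks like a field but runs code."""

    def __init__(self):
        self._answer = 0

    @property
    def answer(self):
        return self._answer

    @answer.setter
    def answer(self, value):
        # Arbitrary computation hidden behind plain assignment syntax:
        # the setter can throw, and it can store something else entirely.
        if value < 0:
            raise ValueError("answer must be non-negative")
        self._answer = min(value, 10)


my_universe = Universe()
my_universe.answer = 42      # reads like a plain store
print(my_universe.answer)    # the stored value was clamped to 10, not 42
```

Nothing at the assignment site hints that the value read back can differ from the value written, or that the statement can raise.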

Properties make it harder to reason about these things. Similar madness is possible in languages that allow overloading the assignment operator, such as C++. Come to think of it, you could probably 'fake' properties in C++ this way.

I accept that properties can make for slightly tidier expressions. I also like the point that properties are a single unit, tidily united in a way that getter/setter pairs aren't. Overall though I value 'honesty', and the ability to reason clearly about code, over these things.

I can't imagine Ada ever supporting properties, for the same reasons.

You bring up a good point that I never addressed the case where setting a property to a value and getting it could return a different value - this is something I overlooked, and I would say absolutely needs to be true. I also didn't do a good job restricting computation, selecting time complexity as a metric instead of literally anything else - there are far better ways of doing that.

There's more to properties than just syntax-sugar as I initially said (such as being a single unit), and I think I've built more of a case for why using property-based getters/setters over fields is better than working with fields directly. If the syntax sugar isn't your cup of tea but you agree with using methods over fields, I think that's the limit of my argument here.

> I also didn't do a good job restricting computation, selecting time complexity as a metric instead of literally anything else - there are far better ways of doing that.

A tricky one: saying something as tidy as "no side effects" clearly doesn't work; anything a setter does has to be a side effect, as our hands are tied regarding return values. I don't think it would be practical to come up with perfectly precise rules for what is appropriate, only guidelines. Even "never throw an exception" seems too strong, as you might want to do range checks. "Never block" would complicate logging.

> If the syntax sugar isn't your cup of tea but you agree with using methods over fields

Yes, I think that's where I'm going with this. The advantages of avoiding public fields seem clear enough, we just disagree about the merits of properties.

Doesn't Ada have a sort of arbitrary computation on assignment? I think I remember that there are rules when predicates on types are checked. (Depending on the Assertion_Policy of course) I'm not sure how that works on assignment though.
Good point, Ada's runtime range checks mean that an assignment can raise an exception. [0] Part of what SPARK Ada does is to prove the absence of such constraint errors. [1]

I don't think Ada supports arbitrary user-specified constraints, but I'm no expert.

[0] https://gcc.gnu.org/onlinedocs/gcc-10.2.0/gnat_ugn/Run-Time-...

[1] https://docs.adacore.com/spark2014-docs/html/ug/en/usage_sce...

You are right that SPARK tries to prove the absence of these errors. In that case you don't need these checks and can turn them off. If you don't, they are still run.

Ada has Dynamic_Predicate(s) which "can be any Boolean expression". [0] So, I'm guessing, you can write a pointless Predicate which does not terminate in some cases or takes unreasonably long.

[0] http://www.ada-auth.org/standards/12rat/html/Rat12-2-5.html

Thanks, I wasn't aware of that feature, I think you're right.

> In the case of Dynamic_Predicate, the expression can be any Boolean expression.

Ada functions are permitted to have side-effects (although this is disallowed in SPARK), so I imagine it would be possible to make a real mess if you wanted to.

> Computation should look like computation.

If you follow standard naming practices in C#, properties are distinct from fields, which means they do "look like" computation once you're used to it.

Add null-conditional operators and null-coalescing ones, and you can get code that is overall much more readable and easy to reason about.

If you follow the member design guidelines, you will not have any public instance fields anyway: https://docs.microsoft.com/en-us/dotnet/standard/design-guid...
Good point. A language could even mandate such a style, making the ambiguity issue go away entirely.
Yes, and the parentheses of methods obviously work as a compiler-mandated "style" that make it obvious that you're running a method. I know Visual Studio warns you about properties with the wrong casing, and I'm pretty sure you can make the compiler throw errors for those as well, so you could enforce it on a project level at least, even if the language itself doesn't.

But I agree with you on the Kotlin examples in the article. Even though it's been a while since I last did Java, my gut reaction is that it looks wrong, because it looks like I'm directly accessing a field.

So you need a little bit more than just the feature itself, you need something to distinguish them from fields when you're looking at code, or you're gonna run into surprises. Whether that is casing or syntax highlighting or naming conventions, I don't care very much, as long as there's something.

Of course, lack of discipline can screw it up, but you could name the getter for x "setY()", and make it delete the entire filesystem when you call it. You can't guard against stupidity like that...

> the parentheses of methods obviously work as a compiler-mandated "style" that make it obvious that you're running a method

The ambiguity is between properties and fields, not between properties and methods. Looking at the style guide, I don't think there's any difference between how public fields and public properties are styled in .Net. [0]

> Whether that is casing or syntax highlighting or naming conventions, I don't care very much, as long as there's something.

Ideally it shouldn't rely on fancy tooling. I'd rather it be clear in the plain text of the source.

> you could name the getter for x "setY()", and make it delete the entire filesystem when you call it. You can't guard against stupidity like that

It's true, we don't yet have a language that prevents poor choice of identifiers.

[0] https://docs.microsoft.com/en-us/dotnet/standard/design-guid...

C# uses these heavily, and most of my professional career has been using C#. I understand the convenience they offer but I do not think they are worth the complexity they add to the language. There is a whole zoo of syntax around them, as the author states arbitrary computation can sneak up on you when you don't expect it, and ultimately all you are doing is creating some shorthand for functions.

If you like, make some shorthand for creating a getter/setter pair that link with some field, that would be fine. But keep them functions, so that it is clear that they are functions to consumers.

I think there is a simple way around the complexity: Make the field "foo.bar" indistinguishable from the outside from the getter / setter pair bar() / set_bar(). This has the disadvantage of losing the ability to distinguish method calls and references by the presence of parens, but I think overall it should pay off.

I have written up some more details here: https://stefan-haustein.com/simplifying-properties

This is also known as the uniform access principle: https://en.wikipedia.org/wiki/Uniform_access_principle

I believe Ruby and Smalltalk more or less work that way.
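A rough sketch of the idea in Python, which supports a form of uniform access through properties. `Foo`, `FooV2`, and the cents-based storage are hypothetical names chosen for the example, not anything from the linked write-up.

```python
class Foo:
    """Version 1: `bar` is a plain attribute."""

    def __init__(self, bar):
        self.bar = bar


class FooV2:
    """Version 2: `bar` is computed, but callers still write `foo.bar`."""

    def __init__(self, bar):
        self._bar_cents = round(bar * 100)

    @property
    def bar(self):
        return self._bar_cents / 100

    @bar.setter
    def bar(self, value):
        self._bar_cents = round(value * 100)


def describe(foo):
    # Client code is identical for both versions.
    return f"bar={foo.bar}"


print(describe(Foo(1.5)))
print(describe(FooV2(1.5)))
```

From the outside, stored and computed `bar` are indistinguishable, which is both the appeal and the objection raised elsewhere in this thread.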

Thanks a lot for the pointer! I was initially distracted by listing C#, JS and Python as examples, but the article clearly states:

"Many programming languages do not strictly support the UAP but do support forms of it. Properties, which are provided in a number of programming languages, address the problem Meyer was addressing with his UAP in a different way. Instead of providing a single uniform notation, properties provide a way to invoke a method of an object while using the same notation as is used for attribute access. The separate method invocation syntax is still available."

Eiffel was one of the first languages to introduce it that way.
> The largest benefit of properties is replacing field use, which has a massive amount of drawbacks, with methods instead.

Could someone please flesh this out for me? I've always found getters and setters to be promoted by people following standard practice, without actually thinking about the underlying principles.

I'm with you. I also enforced getters-and-setters back in the day, for no good reason.

If you're worried about long diffs because you changed many calls from '.x' to '.getX()', good. If it's so coupled that it's called in a thousand places, and you just changed functionality, it's on you to understand those thousand places. If it doesn't affect those places - which you don't know unless you look - then don't make the change.

As for the RgbColor example in the article, not using setters would be better.

    RgbColor red = new RgbColor(1,0,0);
    red.set(.., .., ..);
    // Is it still 'red'?
With the RgbColor example, a language that has mutability permissions (like Rhovas) resolves this easily. If you don't have mutable access to the object, setters can't be used.
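For comparison, a sketch of the "no setters" version in Python. Python has no mutability permissions, so this is only an approximation using a frozen dataclass and says nothing about how Rhovas actually works; the field names are assumed.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class RgbColor:
    r: float
    g: float
    b: float


red = RgbColor(1, 0, 0)

try:
    red.g = 1                # rejected: instances are frozen
except FrozenInstanceError:
    pass

yellow = replace(red, g=1)   # a new value; `red` is still red
```

With no way to mutate the instance, the question "is it still 'red'?" cannot come up.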
It's entirely about backwards compatibility and commit cleanliness.

Say you have an int field and you want to change it to an int array field. To support existing code, you add a method that returns the first element of the array (let's assume it's guaranteed non-empty for simplicity).

If you used a field, you have to update all code referencing that field to call the accessor method instead. If your class was part of a public API, it's a breaking change.

Even if it was just internal code, your next commit will contain a zillion changed files where you replaced foo.number with foo.getNumber(). You'll leave a trace in each of those files' histories for a change that really didn't concern them at all. Also code reviewers will have to check that you didn't miss a usage somewhere (hopefully your code would just fail to compile in that case though).

If you had used a property or a getter around the original field, none of that happens. You add a new .Numbers property and replace the old .Number implementation with one that gets or sets the first element of the array.
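The refactoring described above could look roughly like this in Python. The class name `Foo` and the lower-case attribute names are assumptions; the list is taken to be non-empty, as stated earlier.

```python
class Foo:
    def __init__(self, number):
        # Storage changed from a single int to a list (assumed non-empty).
        self.numbers = [number]

    @property
    def number(self):
        # Old API: reads the first element.
        return self.numbers[0]

    @number.setter
    def number(self, value):
        # Old API: writes the first element.
        self.numbers[0] = value


foo = Foo(7)
foo.number = 8           # existing call sites keep working unchanged
foo.numbers.append(9)    # new code can use the list
print(foo.number, foo.numbers)
```

No caller of `foo.number` needs to be touched, which is the commit-cleanliness argument in a nutshell.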

> Say you have an int field and you want to change it to an int array field.

Will you ever do that in real life? Seriously, why would someone change something from an int to an array...

I see all these Java programmers say that you should always use getters/setters and not expose public fields in classes for this reason, yet I have yet to see an example where this would have happened.

Also the argument that you will need to touch more files in a refactoring. Sure. Not doing so would mean adding technical debt, a method that exists for the sole purpose of some code that you were too lazy to update. Do that and your software will become spaghetti code easily.

I typically see Java code that is unreadable because of all these getters, to the point that a simple class representing a user is 200 lines long, full of getters/setters that only return or set fields of the class without doing anything else.

It is also a matter of performance (no wonder Java code is that slow), since you are calling a function for each property access (and a function call is an expensive operation, it involves jumps, you have to access the vtable for the object, set up the stack frame) where you could simply access a field in an object (a simple addressing with an offset for the CPU). Sure, a call is not that expensive. But imagine doing that in a loop and it will be a performance problem. It will also make the CPU's optimization work more difficult, since you are jumping around.

> Will you ever do that in real life? Seriously, why would someone change something from an int to an array...

This happens quite often, when a one-to-one relationship is replaced with a one-to-many one. In this specific example it could be an integer identifier of an object. If you need a real life example, think of an item in an online shop, which had a category assigned and you now want to assign multiple categories.

On a side note, nothing I said justifies the use of properties. Refactoring is heavily automated today and users of an API undergoing such a dramatic change would have to review all uses of this value anyway. Having a broken build would guarantee that this happens.

> Is also a matter of performance (no wonder Java code is that slow)

Surely these would be trivially inlined?

All public non-final methods in Java are virtual, so it is not _that_ trivial. You would have to check all subclasses that might get passed as an argument and check that again whenever new classes are loaded into the JVM.
JVMs are quite smart about this sort of thing, and a nonfinal method can still be inlined, even though the JVM has to safely undo that optimization if a class is loaded. I'm not sure where I learned this originally, but Aleksey Shipilev has a post on method dispatch that should explain it. https://shipilev.net/blog/2015/black-magic-method-dispatch/
It's exactly what the JVM does, it's called Class Hierarchy Analysis (CHA). The VM checks if a method is actually overridden or not.

When a new class is loaded, the analysis checks which compiled code is no longer valid and marks it for deoptimization.

Direct field access sets the internal state of an object from outside of its implementation. I would say the key underlying principle here is message passing - the object chooses how it responds to API requests it receives - and fields don't allow this to happen.

Fields prevent data validation, read-only fields, and polymorphic behavior. A particular edge case is that you can shadow a field in a subclass, and it can change the field which is accessed in other untouched code.
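A small Python sketch of two of these points, validation and read-only access. `Account` and its rules are invented for the example.

```python
class Account:
    def __init__(self, owner, balance):
        self._owner = owner
        self.balance = balance      # goes through the setter below

    @property
    def owner(self):
        # No setter is defined, so `owner` is read-only from outside.
        return self._owner

    @property
    def balance(self):
        return self._balance

    @balance.setter
    def balance(self, value):
        # The object decides how to respond to the request.
        if value < 0:
            raise ValueError("balance cannot be negative")
        self._balance = value
```

Assigning to `owner` raises `AttributeError`, and assigning a negative `balance` raises `ValueError`; a plain public field could enforce neither.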

Getters/setters fix all of these limitations and others without considering API compatibility. IMO, the above are more important because they relate to the correctness of the code.

Getters and setters for private fields provide additional flexibility not afforded by the fields themselves. Examples from the article include different visibility for getters and setters, validation, and the opportunity to execute additional code.
I would also like to know. I think "massive amount of drawbacks" is actually "one not-very-important-in-practice drawback". Which is just that when changing the behaviour of a class, you have to change its API.
I'm too scared to mention not using getters/setters in interviews.
Strongest case I can make for properties is that a property chain is both readable and writeable, so it makes two way UI binding simple and intuitive:

   <input src="Contact.Email" ...>
This is solved by resolution mechanism in template parsers, for which you don’t need properties.
The template parsers are typically one way and not native to a language.

A language that supports properties can make this trivial for both the inward and outward binding.
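As a sketch of what "both directions" means, here is a tiny path resolver in Python. `Contact`, `Form`, `read_path`, and `write_path` are hypothetical; real binding frameworks do much more than this.

```python
from functools import reduce


class Contact:
    def __init__(self, email):
        self.email = email


class Form:
    def __init__(self, contact):
        self.contact = contact


def read_path(root, path):
    # Outward binding: walk "a.b.c" and return the final value.
    return reduce(getattr, path.split("."), root)


def write_path(root, path, value):
    # Inward binding: walk to the parent, then assign the last segment.
    *parents, last = path.split(".")
    setattr(reduce(getattr, parents, root), last, value)


form = Form(Contact("old@example.com"))
print(read_path(form, "contact.email"))
write_path(form, "contact.email", "new@example.com")
print(form.contact.email)
```

The same dotted path serves for reading and writing, and it keeps working if `email` is later turned into a property.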

Otherwise you end up with janky stuff like JavaBeans or whatever.

Can you elaborate more on this? Are you familiar with implementation of engines like Thymeleaf or Jackson, for example?

JavaBeans are not the only possible way to go with providing values for templates and definitely one of the worst.

I've implemented two way binding in templates, in a proprietary templating/UI language on the JVM. Thymeleaf, iirc, is one way binding. Don't know anything about Jackson, so I can't really say.

When properties are implemented in the language proper then wiring everything together is much cleaner: find references, etc, all "just work". If not, there is a distinction between the template language and the host language that has to be addressed with tooling, or (usually) ignored.

One thing that's a bit unclear to me about properties, as described in this article, is how they should behave with respect to serialization.

For instance, in C#, both System.Text.Json and Newtonsoft.Json will get and serialize all properties, regardless of whether they are virtual "projections" (like the "red" property in the article's RgbColor example) or "plain" properties. Similarly, all properties are deserialized and set when doing the reverse.

This has the consequence that you may have redundant information in your serialization format. That's not the end of the world, and in fact there are some use-cases for this. The bigger consequence is that you can have redundant and conflicting information in the serialized data you read in.

For instance, adapting the author's example to C#:

https://try.dot.net/?fromGist=1bb01bd2b78a3212c52b40ff7955c0...

You get a different deserialization result depending on the order of the properties in the JSON, because the last property wins and the properties are mutually inconsistent with respect to the underlying data model.
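The effect can be reproduced with a toy model in Python. This is not how System.Text.Json or Newtonsoft.Json are implemented; `naive_deserialize` and the `hex` projection are assumptions made purely to show the order dependence.

```python
import json


class RgbColor:
    def __init__(self):
        self.r = self.g = self.b = 0

    @property
    def hex(self):
        # A "projection" with no storage of its own.
        return f"{self.r:02x}{self.g:02x}{self.b:02x}"

    @hex.setter
    def hex(self, value):
        self.r, self.g, self.b = (int(value[i:i + 2], 16) for i in (0, 2, 4))


def naive_deserialize(text):
    # Sets every key in document order, fields and projections alike.
    color = RgbColor()
    for key, value in json.loads(text).items():
        setattr(color, key, value)
    return color


a = naive_deserialize('{"r": 255, "hex": "0000ff"}')
b = naive_deserialize('{"hex": "0000ff", "r": 255}')
print((a.r, a.g, a.b), (b.r, b.g, b.b))
```

The two inputs carry the same keys and values, yet `a` ends up as (0, 0, 255) and `b` as (255, 0, 255), because the last write wins.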

Is the conclusion from this that objects which you intend to serialize should avoid using this "properties" feature, or else be extremely deliberate about what properties to include in the serialization format?

This is an interesting question, I was not aware of this behavior in C#. It looks like the solution is to use annotations to specify a property as being ignored.

The intended behavior is to serialize all fields of an object (some properties have backing fields, others don't). You wouldn't serialize properties themselves for the same reason a getter/setter on its own wouldn't be serialized - only the data stored.

Put simply, I would say C# has the wrong default behavior here.

I don't know about JSON but the XML serialisation code checks for a DefaultValue attribute and if the value is the default it ignores it on serialisation. You can also ignore properties if they don't need serialisation.

Your example looks sloppy but it can get messy as the project evolves. I had some code where I mistakenly removed some XmlAttribute attributes in one release and it was painful to resolve in the next release.

Having said that I will still use it again, just carefully.

In order to produce code with some agility, it is best to keep data private, adhere to SOLID and have serious thoughts about what complexities you need to introduce when.

I don't see any actual need for properties, when the same effects apart from syntactic sugar can be had using methods, but one really shouldn't.

The idea being not to optimize writing code once, but greater and easier ability to read and change code later.

This puts me in a camp where I don't think it's a great idea to always tie process objects to serialization formats and APIs either. Certainly, layers and MVC would require a serious explanation of why to enforce these structures in code, other than theoretical handwaving, i.e. what is the value now? Hexagonal and clean architecture? Later, maybe.

Of course, put in front of C# or anything enterprisey, there's no choice but to engage in these things.

I used to feel this way, not for things like "validation on assignment" which is a terrible idea for so many reasons, but because it allows you to build things like dependency graphs and change watchers at runtime (think mobx, or Adobe Flex's bindables system)... But these days, I think we'd be much better suited to moving most of this work to build-time tooling.
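For readers unfamiliar with the pattern, a bare-bones runtime change watcher built on a property might look like this in Python. It is only a sketch of the general idea, not how mobx or Flex bindables are implemented; `Observable` and `subscribe` are made-up names.

```python
class Observable:
    def __init__(self, value):
        self._value = value
        self._listeners = []

    def subscribe(self, callback):
        self._listeners.append(callback)

    @property
    def value(self):
        return self._value

    @value.setter
    def value(self, new):
        old, self._value = self._value, new
        if old != new:
            # Notify watchers only on an actual change.
            for callback in self._listeners:
                callback(old, new)


changes = []
count = Observable(0)
count.subscribe(lambda old, new: changes.append((old, new)))
count.value = 1
count.value = 1    # unchanged, so no notification
count.value = 2
print(changes)
```

The hook lives in the setter, so ordinary assignments drive the watchers; a build-time tool would have to generate the equivalent plumbing instead.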
Soo, Smalltalk messages are back? :)
Yes! I wanted to address this idea in the article, but it didn't make the final cut as I didn't think it added much to the points being made. Moving away from direct field access and only using functions is definitely in line with the Alan Kay OOP model.
Properties in modern languages?

They made their debut in Eiffel and Delphi, so 90's post modern.