by nostrademons 3777 days ago
I thought that was one of the most fascinating parts - Rust's borrow-checker enforces the Law of Demeter and Principle of Least Privilege as a side-effect.

Code that takes a full structure when it only needs to operate on a part of the structure is badly designed. It's not conveying the full information about the data that it actually needs, which means that unexpected dependencies can crop up, implicit in the body of the function, as the code is modified later on. This is behind a lot of long-term maintenance messes; I remember a few multi-year projects at Google to break up "data whales" where a single class had become a dumping ground for all the information needed within a request.

Thing is, we all do it, because taking a reference to a general object and then pulling out the specific parts you need means that you don't have to change the function signature if the specific parts you need change. This saves a lot of work when you're iterating quickly and discovering new requirements. You're trading ease of modification now for difficulty of comprehension later, which is usually the economically wise choice for you but means that the people who come after you will have a mess to untangle.

This makes me think that Rust will be a very poor language for exploratory programming, but a very good one for programming-in-the-large, where you're building a massive system for requirements that are largely known.

2 comments

> I thought that was one of the most fascinating parts - Rust's borrow-checker enforces the Law of Demeter and Principle of Least Privilege as a side-effect.

I really don't agree with this comment (I program in Rust a lot). Borrowck is a godsend in many ways, but this is a weakness (that can be improved!). It prevents things like `self.mutate_my_foo(self.access_my_bar())`. There are workarounds for the problems this presents, but they shouldn't have to be.

It is _great_ that borrowck helps you control aliasing and mutation of state. It is _frustrating_ that borrowck can't distinguish a borrow of `self.foo` from a borrow of `self`.
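To make the `self.foo` versus `self` distinction concrete, here is a minimal sketch. `Widget`, `Counter`, and every method name are hypothetical, invented for illustration. Borrowing the fields directly lets the compiler see that they are disjoint; going through a `&mut self` method hides that behind the signature.

```rust
// Hypothetical types for illustration; none of these names come from the thread.
struct Counter {
    total: u32,
}

struct Widget {
    foo: Counter,
    bar: Vec<u32>,
}

impl Widget {
    // The signature borrows all of `self` mutably, although only `self.foo` is touched.
    fn foo_mut(&mut self) -> &mut Counter {
        &mut self.foo
    }

    fn sum_direct(&mut self) -> u32 {
        // Field-level borrows: `self.foo` (mutable) and `self.bar` (shared) are
        // visibly disjoint, so both can be live at once.
        let foo = &mut self.foo;
        for x in &self.bar {
            foo.total += *x;
        }
        foo.total
    }

    fn sum_via_method(&mut self) -> u32 {
        // Writing `let foo = self.foo_mut();` and then iterating `&self.bar`
        // is rejected: the whole of `self` stays mutably borrowed while `foo` is live.
        // One workaround is to copy the data out before taking the borrow.
        let bar = self.bar.clone();
        let foo = self.foo_mut();
        for x in bar {
            foo.total += x;
        }
        foo.total
    }
}
```

The clone in `sum_via_method` is exactly the kind of workaround that shouldn't be necessary.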

"Law of Demeter" is enforced through privacy - as the `self` in the example above shows, these are all happening within the private scope of the 'object.'

The "Law of Demeter" means something very specific in OO programming, often expressed by the rules:

  1. You can play with yourself
  2. You can play with your own toys (but you can't take them apart)
  3. You can play with the toys that were given to you.
  4. And you can play with toys you've made yourself.
http://c2.com/cgi/wiki/LawOfDemeter?LawOfDemeter

Put simply, it means that you shouldn't attempt to destructure or inspect the arguments that were passed to you. If you're passed a point and need to access point.x and point.y, then you're a method on the wrong class; you should be a method on Point instead. If you're passed a file but only need to access file.path, your parameter type is wrong: you should take a filepath instead and let your caller destructure for you. If you need to access foo.bar and foo.baz but foo has 20 data members, you should collect bar and baz into their own sub-structure and pass that in directly, or better yet, make your function a method on the sub-structure. If you need to self.mutate_my_foo(self.access_my_bar()), you should call self.foo.mutate(self.access_my_bar()). And so on - the point is for each function to have the minimal knowledge necessary to complete its task, and any decisions unrelated to that task should be propagated up to higher levels of the program.
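A rough sketch of the file.path and point cases in Rust. `SourceFile`, `Point`, and the function names are made up for this example, not taken from any real codebase.

```rust
use std::path::{Path, PathBuf};

// Hypothetical struct; the unused field is the point of the example.
#[allow(dead_code)]
struct SourceFile {
    path: PathBuf,
    contents: String,
}

// Wide signature: asks for the whole file but only ever reads `path`.
fn is_rust_source_wide(file: &SourceFile) -> bool {
    file.path.extension().and_then(|ext| ext.to_str()) == Some("rs")
}

// Narrow signature: the caller destructures and hands over just the path.
fn is_rust_source(path: &Path) -> bool {
    path.extension().and_then(|ext| ext.to_str()) == Some("rs")
}

// Hypothetical point type: code that needs both `x` and `y` lives on the type itself.
struct Point {
    x: f64,
    y: f64,
}

impl Point {
    fn norm(&self) -> f64 {
        (self.x * self.x + self.y * self.y).sqrt()
    }
}
```

The narrow version states its real dependency in the signature, at the cost of touching every caller whenever that dependency changes.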

I won't deny that this is frustrating, and I thought I acknowledged that in the original comment. The Law of Demeter has been very controversial in OO circles, because it's so restrictive that pretty much nobody can actually adhere to it without creating so much work for themselves that their project ships late. In forcing your code to always use the minimal set of data necessary, you force yourself to change the code (including many potentially highly-used APIs) every time you add or remove a data dependency, which is usually impractical. The whole category of dependency injection frameworks was invented to automate much of this plumbing work.

But I find it fascinating that Rust's borrow-checker has basically forced it down on one side of the tradeoff. It has a bunch of implications for what Rust is good at and what Rust is not good at.

I understand the Law of Demeter (and like many OO design patterns, I think it misses the mark). You don't know Rust well enough. `self.foo.mutate(self.access_my_bar())` is a borrowck error in the same way as `self.mutate_foo(self.access_my_bar())`. This is also only true in the event that one of the calls involves mutation. Really, this has nothing to do with the Law of Demeter: borrowck will reject things that do not violate the Law of Demeter, and will allow things that do.

The issue here has to do with 1. borrowck's inability to infer the delimitation of mutable access to members of product types vs to the entire type, 2. borrowck's limited understanding of the order of evaluation.

I love Rust and want to write in Rust all the time, but you are overhyping the benefits of borrowck.

EDIT: A reason the law of Demeter is not great, in my opinion, is exactly a strength of borrowck - the issue isn't how much state a scope can read, but how much state a scope can write.

>Code that takes a full structure when it only needs to operate on a part of the structure is badly designed.

No, it is not; this is done all the time with methods and it improves encapsulation - you may not want your clients to be able to decompose your data structures. Do you really mark every member of your data structures as pub??

Sorry but this is a poor ad-hoc defense of an actual annoyance in the borrow-checker.

>Code that takes a full structure when it only needs to operate on a part of the structure is badly designed.

I think this was a sensible statement, especially in context; I strongly agree that "this is behind a lot of long-term maintenance messes". And lost performance.

For encapsulation in Rust, traits are used to abstract and separate concerns, but they don't force you to bundle your data into large structures.
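As a small illustration of that point (all names here are hypothetical): a trait can describe the one capability a function needs, and any small struct can implement it without being folded into a larger one.

```rust
// Hypothetical trait: the only thing the caller is allowed to know.
trait HasDeadline {
    fn deadline_ms(&self) -> u64;
}

// A small, focused struct; it does not have to live inside a big request object.
struct RequestTiming {
    start_ms: u64,
    budget_ms: u64,
}

impl HasDeadline for RequestTiming {
    fn deadline_ms(&self) -> u64 {
        self.start_ms + self.budget_ms
    }
}

// Depends on the trait, not on any concrete layout.
fn is_expired(item: &impl HasDeadline, now_ms: u64) -> bool {
    now_ms > item.deadline_ms()
}
```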

And encapsulation isn't an end in itself. Privacy has its uses (maintaining invariants, minimizing the exposed surface area of a library, etc.) but I find often in OO codebases that encapsulation creates its own problems. There is no substitute for careful data-oriented design; no amount of `private` will prevent your teammates from working around or ripping apart your carefully shrink-wrapped objects.

There is certainly some awkwardness in the borrow checker, but also great value.

If you're trying to achieve proper encapsulation, you just have a module that implements some sort of functionality, and shouldn't need to borrow anything from it. The real question is why you're pulling data instead of pushing messages.

This is incorrect. "Sending a message" involves borrowing the data so that the method can run. `foo.bar()` borrows `foo`.

Sorry, I wasn't clear. You are, of course, correct: you're only ever in a position to call a method if you hold a reference to the struct you're calling on.

My meaning was that you should favour a usage pattern that looks like you either move/copy things into the called method, or lend a reference to something you own (which is, presumably, not going to be held on to for very long), and then you're either given ownership of whatever return value you get, or get a reference whose lifetime depends on the arguments you passed in (but not the object itself). All of this ends up being quite clean, and you don't end up tying yourself into a borrowing knot.

You do end up in a weird place when your methods return references to fields of the owning object. When that happens, you're restricted in what you can do with the owning object until the reference goes out of scope. Rust mutexes are implemented precisely like that, which highlights what sort of behaviour you're getting from this usage pattern.
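A minimal sketch of that restriction, using a hypothetical `Inventory` type alongside the standard library's `Mutex`. While the returned reference (or the lock guard) is alive, the owner is tied up.

```rust
use std::sync::Mutex;

// Hypothetical type for illustration.
struct Inventory {
    items: Vec<String>,
}

impl Inventory {
    // Returns a reference into `self`; `self` stays borrowed as long as the result lives.
    fn first(&self) -> Option<&String> {
        self.items.first()
    }

    fn add(&mut self, item: String) {
        self.items.push(item);
    }
}

fn demo_reference(inv: &mut Inventory) -> usize {
    let len = {
        let first = inv.first();
        // Calling `inv.add(...)` at this point and then using `first` afterwards
        // would be rejected: `inv` is still borrowed through `first`.
        first.map_or(0, |s| s.len())
    };
    // The reference is gone, so mutation is allowed again.
    inv.add("spare".to_string());
    len
}

fn demo_mutex(shared: &Mutex<Vec<u32>>) -> u32 {
    // `lock()` hands back a guard that borrows the mutex; the data is only
    // reachable through the guard, and the lock is held until the guard is dropped.
    let mut guard = shared.lock().unwrap();
    guard.push(4);
    let total: u32 = guard.iter().sum();
    total
}
```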

The former provides better encapsulation and more closely resembles the message-passing approach to OOP, whereas the latter pattern is not only not very ergonomic, it's quite indicative of poor encapsulation (because you're, by necessity, asking for internal state).

Here is the issue: `self.foo.bar(self.baz())` is an error if `foo.bar()` mutates foo, even if `baz()` doesn't touch `foo` and even if `baz()` doesn't return a reference. This is because borrowck doesn't properly understand that baz will be evaluated before bar, and can't distinguish which elements of a struct are accessed by that struct's methods. Both of these are problems that can be solved, and neither of them is actually promoting good practice in my opinion.

All it does is force you to use unnecessary temporaries, like `let baz = self.baz(); self.foo.bar(baz)`
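Spelled out as a compilable sketch (the `Service` and `Log` types are hypothetical; only `foo`, `bar`, and `baz` mirror the names used above). Later compilers relaxed this check for many cases, so the one-line form may well be accepted today; the version with the temporary is written to work either way.

```rust
// Hypothetical types for illustration.
struct Log {
    entries: Vec<String>,
}

impl Log {
    // Mutates `self`, like `foo.bar()` in the discussion.
    fn bar(&mut self, entry: String) {
        self.entries.push(entry);
    }
}

struct Service {
    foo: Log,
    name: String,
}

impl Service {
    // Reads `self` but returns an owned value, not a reference.
    fn baz(&self) -> String {
        format!("hello from {}", self.name)
    }

    fn record(&mut self) {
        // The one-line form under discussion:
        //     self.foo.bar(self.baz());
        // The workaround: finish the shared borrow first, then start the mutable one.
        let baz = self.baz();
        self.foo.bar(baz);
    }
}
```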

Yeah, this is basically a borrowck "bug" which will probably be fixed post-MIR.

Note that there are cases where such code is invalid even with the temporary, and they can be related to Demeter. Ish. Also to API contracts; the guarantee should be embedded in the signature (so changing the internals shouldn't cause its usage to stop compiling), which is unwieldy to do.
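One such case, sketched with hypothetical `Registry` and `Log` types: when the accessor returns a reference, the temporary keeps all of `self` borrowed, because the signature `&self -> &str` does not say which field the result points into.

```rust
// Hypothetical types for illustration.
struct Log {
    lines: Vec<String>,
}

impl Log {
    fn record(&mut self, line: &str) {
        self.lines.push(line.to_string());
    }
}

struct Registry {
    log: Log,
    default_name: String,
}

impl Registry {
    // The signature ties the result to all of `self`, even though the body
    // only reads `default_name`.
    fn default_name(&self) -> &str {
        &self.default_name
    }

    fn log_default(&mut self) {
        // Rejected even with a temporary, since `name` keeps `self` borrowed
        // while `self.log` is borrowed mutably:
        //     let name = self.default_name();
        //     self.log.record(name);
        // Accepted: borrowing the field directly shows the two borrows are disjoint.
        let name = &self.default_name;
        self.log.record(name);
    }

    fn log_default_cloned(&mut self) {
        // Also accepted: copy the data out so no borrow of `self` survives.
        let name = self.default_name().to_string();
        self.log.record(&name);
    }
}
```

The accessor's body could be changed to read a different field without changing its signature, which is why the compiler has to assume the whole of `self` is borrowed.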