by colejohnson66 1032 days ago
I really love C#, but one thing I wish it had (that Rust does) is move semantics. In C#, if someone passes your function (such as a constructor) an array, you have no guarantee the caller won’t modify it underneath you. In Rust terms, you would have a mutable reference, but the caller also does. Sometimes this is desired, and would be usable in Rust with a Cell, but other times it’s not. This can lead to defensive copying of arrays by the callee.

If I could annotate a parameter with some kind of “move” keyword that would prevent the caller from using it again, that would be great.

“Frozen collections” and ImmutableArray<T> can solve this issue, but the latter is essentially just a defensive copy of the array wrapped in a special type. I'm not holding my breath that a “move” keyword would ever be implemented; analyzers will probably be the best we get.
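For what it's worth, the closest thing to a "move" in the existing library is probably `ImmutableArray<T>.Builder.MoveToImmutable()`, which hands the builder's internal array to the resulting `ImmutableArray<T>` without copying. A minimal sketch, assuming a hypothetical `DataTable` type that stores its values (the `DataTable` and `MoveDemo` names are made up for illustration):

```csharp
using System;
using System.Collections.Immutable;

// Hypothetical callee: accepting ImmutableArray<int> means nobody can mutate what it stores.
public sealed class DataTable
{
    private readonly ImmutableArray<int> _values;

    public DataTable(ImmutableArray<int> values) => _values = values;

    public int this[int index] => _values[index];
    public int Length => _values.Length;
}

// Hypothetical caller.
public static class MoveDemo
{
    public static DataTable Build(out int builderCountAfterMove)
    {
        // MoveToImmutable requires Count == Capacity, so size the builder exactly.
        var builder = ImmutableArray.CreateBuilder<int>(3);
        builder.Add(1);
        builder.Add(2);
        builder.Add(3);

        // Transfers the internal array instead of copying it;
        // the builder is left empty afterwards.
        ImmutableArray<int> moved = builder.MoveToImmutable();
        builderCountAfterMove = builder.Count;

        return new DataTable(moved);
    }
}
```

This only helps when the caller builds the data through the builder in the first place; an existing `T[]` still has to be copied (or wrapped unsafely) to become an `ImmutableArray<T>`.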

3 comments

I was going to say that Microsoft appears to have chosen to address this at the library level with brilliantly optimized frozen and immutable variants of the collections, but you beat me to it.

It probably would have been easier and cleaner to implement some form of `restrict` and `const` semantics (or maybe move, like you mentioned) than to work as hard as they did to come up with a still sub-optimal solution, but the performance they’ve managed to eke out of the frozen and immutable variants is to be commended.

Oh, for sure. Microsoft's dotnet team deserves massive props for the effort they've put into the runtime. As for why there's no `restrict`, `const`, or `move` semantics, it seems the runtime team is extremely averse to modifying the CIL's capabilities, as it would be a breaking change; they'd rather have the library team add classes/attributes that the runtime recognizes and special-cases.

Wouldn't this just be ImmutableArray or ReadOnlyCollection?

    > “Frozen collections” and ImmutableArray<T> can solve this issue, but the latter is essentially just a defensive copy of the array, but in a special type
OK. Or just declare your parameter as IEnumerable<>? This has the effect of restricting the operations on the incoming collection in the same way.

Obviously, if this is a processing function that iterates over the array and forgets about it, IEnumerable<T> or IList<T> (IReadOnlyList<T> to communicate intent) would be the better option.

My thoughts involved constructors. In those, sure, I could take IEnumerable<T> or whatever else there is, but if I want to store an array (or list) as a field in my class, I'd have to make a copy with ToArray() (and friends). Being able to "move" an array from the caller into the constructor (callee) would be nice.

ReadOnlyCollection<T> isn't actually read only, but just a wrapper around an array/list. IReadOnlyList<T> also isn't read only, but just restricts me from editing it. I can't edit the parameters, but the caller possibly could, and that's my issue.
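To make that concrete, here is a small sketch showing that both `ReadOnlyCollection<T>` and `IReadOnlyList<T>` are only views: whoever still holds the underlying list or array can change what the view reports. The `ReadOnlyViewDemo` class and its method names are hypothetical, just for illustration.

```csharp
using System.Collections.Generic;
using System.Collections.ObjectModel;

public static class ReadOnlyViewDemo
{
    // ReadOnlyCollection<T> wraps the list it is given; it does not copy it.
    public static int WrapperSeesMutation()
    {
        var backing = new List<int> { 1, 2, 3 };
        var view = new ReadOnlyCollection<int>(backing);

        backing[0] = 42;   // the caller still holds the mutable list

        return view[0];    // the "read-only" view now reports 42
    }

    // An array converts implicitly to IReadOnlyList<T>; that restricts the
    // holder of the interface reference, not the holder of the array.
    public static int InterfaceSeesMutation()
    {
        int[] array = { 1, 2, 3 };
        IReadOnlyList<int> view = array;

        array[1] = 99;     // the caller mutates through the original reference

        return view[1];    // 99
    }
}
```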

    > I can't edit the parameters, but the caller possibly could, and that's my issue.
This seems more like a concurrency issue, then. If that's the case, it seems like the right answer is some synchronization primitive or perhaps using a `Channel<T>` to serialize the flow.

Sometimes it's concurrency, but most of the time it's that I want to know I own the array. I want to know that `_array[5]` will be the same, no matter when I access it (excluding intentional modifications by callee). For example, if I take `T[]` as a parameter to my constructor, I might think `_array[5]` will be the same, but nothing stops the caller from doing `Array.Sort(theParameter)` sometime later.

    // in one function of the caller's type
    _data = new int[] { 1, 2, 3, 4, 5, 6, 7 }; // could come from some data collection engine
    _dataTable = new(_data);
 
    // then, sometime later, in another function of the caller's type
    Array.Clear(_data); // reset for next iteration or something
    // data table now has an array of zeros if it didn't copy the data

This is just an example, and the problem is not limited to arrays; mutable classes can be mutated by the caller after the callee is given them. Basically, the issue is multiple mutable references. The only solution to the above problem is a defensive copy by the "data table" constructor. I would like the ability to say, "the caller cannot modify this reference anymore".
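For reference, the defensive copy looks roughly like this. It's a minimal sketch: `CopyingDataTable` and `DefensiveCopyDemo` are hypothetical names standing in for the "data table" type and its caller from the snippet above.

```csharp
using System;

// Hypothetical "data table" that protects itself by copying its input.
public sealed class CopyingDataTable
{
    private readonly int[] _array;

    public CopyingDataTable(int[] data)
    {
        if (data is null) throw new ArgumentNullException(nameof(data));

        // Clone() makes a shallow copy; for int that is a full copy.
        _array = (int[])data.Clone();
    }

    public int this[int index] => _array[index];
}

// Hypothetical caller that reuses its buffer after handing it over.
public static class DefensiveCopyDemo
{
    public static int Run()
    {
        var data = new int[] { 1, 2, 3, 4, 5, 6, 7 };
        var table = new CopyingDataTable(data);

        // Caller resets its buffer for the next iteration.
        Array.Clear(data, 0, data.Length);

        return table[5]; // still 6, because the table owns its own copy
    }
}
```

The cost is an allocation and a copy on every construction, even when the caller never touches the array again, which is exactly what a "move" would avoid. For arrays of mutable reference types, the shallow copy also doesn't protect the elements themselves.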

This is something Rust gets right, IMO. If I want the callee and the caller to share mutable ownership, RefCell<T> or Mutex<T> can be used, and the usage of such a type makes that clear. If I want thread safety, I can wrap the mutex in an Arc<T>. And, if I don't want shared ownership, a plain T can be used, and ownership passes into the callee. The issue is: in C#, without defensive measures, all objects are just T* with no ownership or thread-safety.

> OK. Or just declare your parameter as IEnumerable<>? This has the effect of restricting the operations on the incoming collection in the same way.

It does not work that way, because it is an input parameter: the caller can still pass any mutable type (including an array) that implements the essentially read-only IEnumerable interface you require.

Now if it was about a return value, it would be a different story.

edit:

Still, what you suggest has its merits in the other direction, as it gives the caller a guarantee that the passed-in collection will not be modified inside. I had to untangle broken code (without unit tests when I got it) that was called something like `CalculateCost(SomegraphData input)`, and while its name (and xmldoc) did not suggest it, it made subtle modifications to the data it got handed... I was very upset about that legacy codebase I had just inherited when I found that...

I too really love C#. That said, C#'s biggest weaknesses IMO are all around a lack of proper const-like markup. Given that TypeScript (another language by the same designer) exhibits similar weaknesses, I've always interpreted it as an intentional weakness introduced by the language designer.

Working in a fully const-correct codebase in C++ is a joy. Working in a partially const-correct codebase in C++ is a nightmare because everything has to work around the semantic expectations of everything else.

Anders Hejlsberg, and those who took over the C# language design after he moved on to TypeScript, were certainly aware of how to make the language fully const-able. The fact that they didn't leads me to believe they viewed it as a trade-off between how much benefit you get from using it and how much cost you take on from having that markup need to be everywhere in your entire language ecosystem. It's not enough for one application architect to say they want to take it on. Every engineer working on every library and every potentially reused function would have to take it on, because constness done right either tendrils everywhere or is a lie, and there are big costs to extending those constipation tendrils, just as there are big costs to not having them.