by incoming1211 398 days ago
Is there a reason these sorts of improvements cannot be contributed back into .NET itself?
6 comments

ZLinq relies on its own enumerable type called ValueEnumerable, which is a struct. While it would probably work when using this as a drop-in replacement and re-compiling, things will be more complicated in larger applications. There might be some code that depends on the exact signature of the Linq methods. This might not even be detectable in cases involving reflection and could break stuff silently.

Adding another enumerable type would be a very large change that could effectively double the API surface of the entire ecosystem. This could take some time. Some places still don't even support Span<T>. Also, there were some design decisions related to Linq where the number of overloads was a consideration.

Adding this API to .NET could probably be done with that extension method that converts to ValueEnumerable. But without support for that enumerable, this would pretty much be a walled garden where you have to convert back and forth between different enumerable types. Not that great if you ask me, but possible I guess.

I can easily imagine the kind of person that goes out and builds something like this would have little patience with the bureaucracy of getting it integrated into .NET.
I'd say it's less about bureaucracy and more about what the .NET team has to consider when they make sweeping changes.

Backwards compatibility, security, edge cases, downstream effects on other libraries that are reliant on LINQ, etc.

One guy with an optional library can break things. If the .NET team breaks things in LINQ, it's going to be a bad, bad time for a lot of people.

I think Evan You's approach with Vue is really interesting. Effectively, they have set up a build pipeline that includes testing major downstream projects as well for compatibility. This means that when the Vue team builds something like "Vapor Mode" for 3.6, they've already run it against a large body of community projects to check for breaking changes and edge cases. You can see some of the work they do in this video: https://www.youtube.com/watch?v=zvjOT7NHl4Q

I think this approach predates Vue.

I know of two examples:

1. Fedora, in collaboration with the GCC maintainers, keeps GCC on the bleeding edge so it can be used to compile the whole Fedora corpus. This validates the compiler against a set of packages which are known to work with the previous GCC

2. I think the Rust team also builds all crates on crates.io when working on `rustc`. It seems they created a tool to achieve that: https://github.com/rust-lang/crater

I would assume the .NET guys have something similar already but maybe there’s not enough open code to do that

Rust also has the advantage of having no ABI. Binary interface is a whole lot more difficult to maintain than code interface.

C# has multiple technologies built to deal with ABI (though it probably all goes unused these days with folder-based deployments, you really need the GAC for it to work).

IIRC Perl tested new releases by running all the unit tests in the CPAN library, waaaaay back when.
They still do and investigate each failure. If the end result is that the library is “wrong”, tickets and patches get sent to the library maintainers.
You have to add an extra function call at the start of the Linq method chain in order to make it zero-allocation. So I don't think it would break backwards compatibility. But adding it does create an additional maintenance burden.
From some experience, the MS guys are actually really eager to get more outside help and many will help guide you through the process if you have something to offer.

Every release has a fairly decent amount of fixes and additions from outside contributors, and while I can see a lot of to/fro on the PRs to get them through, it's probably not quite as bad as you'd expect.

From looking at the blog post I suspect the explosion of generic instances could be a serious problem for code size and startup time, but that's probably solvable somehow. The performance certainly seems impressive.

The way LINQ currently works by default makes aggressive use of interfaces like IEnumerable to hide the actual types being iterated over. This has performance consequences (which is part of why ZLinq can beat it) but it has advantages - for example, the same implementation of Where<T>(seq) can be used for various T's instead of having to JIT or AOT-compile a unique body for every distinct class you iterate over.

From looking at ZLinq it seems like it would potentially have an explosion of unique generic struct types as your queries get more complex, since for it to work you potentially end up with types vaguely resembling Query3<Query2<Query1<T>>>. But it might not actually be that bad in practice.
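
To make the nesting concrete, here is a rough sketch of how struct-based operators stack up. Every name in it (IValueEnumerator, ArraySource, WhereOp, SelectOp, NestingDemo) is made up for illustration and is not ZLinq's actual API; it only shows why each extra operator wraps the previous type in another generic struct.

```csharp
using System;

// Hypothetical minimal enumerator contract; not ZLinq's real interface.
public interface IValueEnumerator<T>
{
    bool TryGetNext(out T value);
}

// Source operator: walks an array without allocating an enumerator object.
public struct ArraySource<T> : IValueEnumerator<T>
{
    private readonly T[] _items;
    private int _index;

    public ArraySource(T[] items) { _items = items; _index = 0; }

    public bool TryGetNext(out T value)
    {
        if (_index < _items.Length) { value = _items[_index++]; return true; }
        value = default!;
        return false;
    }
}

// Filter operator: the upstream type is a generic parameter and is stored
// by value, so there is no boxing and no IEnumerable<T> indirection.
public struct WhereOp<TSource, T> : IValueEnumerator<T>
    where TSource : struct, IValueEnumerator<T>
{
    private TSource _source;
    private readonly Func<T, bool> _predicate;

    public WhereOp(TSource source, Func<T, bool> predicate)
    {
        _source = source;
        _predicate = predicate;
    }

    public bool TryGetNext(out T value)
    {
        while (_source.TryGetNext(out value))
        {
            if (_predicate(value)) return true;
        }
        return false;
    }
}

// Projection operator: wraps whatever upstream type it is given.
public struct SelectOp<TSource, TIn, TOut> : IValueEnumerator<TOut>
    where TSource : struct, IValueEnumerator<TIn>
{
    private TSource _source;
    private readonly Func<TIn, TOut> _selector;

    public SelectOp(TSource source, Func<TIn, TOut> selector)
    {
        _source = source;
        _selector = selector;
    }

    public bool TryGetNext(out TOut value)
    {
        if (_source.TryGetNext(out TIn item)) { value = _selector(item); return true; }
        value = default!;
        return false;
    }
}

public static class NestingDemo
{
    public static int SumEvensTimesTen()
    {
        var source = new ArraySource<int>(new[] { 1, 2, 3, 4 });
        var evens = new WhereOp<ArraySource<int>, int>(source, x => x % 2 == 0);
        // The static type grows with every operator added to the chain:
        // SelectOp<WhereOp<ArraySource<int>, int>, int, int>
        var scaled = new SelectOp<WhereOp<ArraySource<int>, int>, int, int>(evens, x => x * 10);

        int total = 0;
        while (scaled.TryGetNext(out int v)) total += v;
        return total;
    }
}
```

Each distinct chain is its own closed generic struct type, which is where the code size and startup concern comes from.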

Using reference types is more idiomatic in C#. To some degree they are less bug prone as well (they can be passed around without issue). Most of the core library uses them instead of starting with value types and boxing.

The Task library has successfully added ValueTask but it took some doing. LINQ on the other hand can be replaced with unrolled loops or libraries more easily so the pressure just hasn't been there.

I could see something happening in the future but it would take a lot of work.

To your point, ValueTask is less safe than Task. For example, it's important not to await it more than once.
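
A small sketch of the usual workaround, assuming you really do need the result more than once: convert the ValueTask to a Task a single time with AsTask() and await that instead. ComputeAsync and ValueTaskDemo are hypothetical names used only for the example.

```csharp
using System.Threading.Tasks;

public static class ValueTaskDemo
{
    // Hypothetical operation that returns a ValueTask<int>.
    public static async ValueTask<int> ComputeAsync()
    {
        await Task.Yield();
        return 42;
    }

    public static async Task<int> AwaitTwiceSafelyAsync()
    {
        // Consume the ValueTask exactly once by converting it to a Task.
        // A Task, unlike a ValueTask, may be awaited any number of times.
        Task<int> task = ComputeAsync().AsTask();

        int first = await task;
        int second = await task;
        return first + second;
    }
}
```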
There are some minor breaking changes like the order of iteration is not always the same as the official Linq implementation, or Sum might give different values due to checked vs unchecked summing. Probably not an issue for most people, but a subtle breaking change nevertheless.
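
To illustrate the Sum point: System.Linq's Sum over int values uses checked arithmetic and throws OverflowException on overflow. The UncheckedSum below is a hand-written stand-in for what an unchecked implementation would do; it is an assumption for illustration, not ZLinq's code.

```csharp
using System;
using System.Linq;

public static class SumDemo
{
    // Returns true if the built-in LINQ Sum throws on overflow for this input.
    public static bool LinqSumThrows(int[] values)
    {
        try
        {
            _ = values.Sum();
            return false;
        }
        catch (OverflowException)
        {
            return true;
        }
    }

    // Hypothetical unchecked variant: overflow silently wraps around.
    public static int UncheckedSum(int[] values)
    {
        int total = 0;
        foreach (int v in values)
        {
            unchecked { total += v; }
        }
        return total;
    }
}
```

With the input { int.MaxValue, 1 } the first method reports an exception, while the second wraps around to int.MinValue.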
I don't see why not: https://github.com/dotnet/runtime/pulls

There's an official process for API change requests: https://github.com/dotnet/runtime/blob/main/docs/project/api...