by Kipters 1119 days ago
Yeah, that work in C# required a lot of other things to minimize allocations.

What I was thinking of is something similar to how they implemented things like IndexOf [0], which is a pure C# implementation that the JIT translates into C++-equivalent code. The advantage of doing things this way is that when ARM adds a 256-bit-wide SIMD extension, they will only need to support it as a Vector256 implementation to get that code working with no other changes.
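
To give a rough idea of the pattern, here is a minimal sketch of a vectorized byte search written purely in C#. It is not the runtime's actual IndexOf; the class and method names (VectorSearch, IndexOfByte) are made up for illustration, and it uses the width-agnostic System.Numerics.Vector<T> rather than the fixed-width Vector128/Vector256 types, on the assumption that it shows the same "the JIT picks the instructions" idea in less code.

```csharp
using System;
using System.Numerics;

static class VectorSearch
{
    // Hypothetical helper for illustration only, not the dotnet/runtime code.
    public static int IndexOfByte(ReadOnlySpan<byte> data, byte value)
    {
        int i = 0;

        // Vector<byte>.Count is however many bytes the JIT's chosen SIMD width holds.
        if (Vector.IsHardwareAccelerated && data.Length >= Vector<byte>.Count)
        {
            var target = new Vector<byte>(value);           // value broadcast to every lane
            int lastBlock = data.Length - Vector<byte>.Count;

            for (; i <= lastBlock; i += Vector<byte>.Count)
            {
                var block = new Vector<byte>(data.Slice(i)); // load one SIMD-width block
                if (Vector.EqualsAny(block, target))
                    break; // a match is in this block; the scalar loop below locates it
            }
        }

        // Scalar loop: pinpoints the match inside the flagged block, or scans the tail.
        for (; i < data.Length; i++)
        {
            if (data[i] == value)
                return i;
        }

        return -1;
    }
}
```

The source never mentions a specific instruction set, so the same code runs on whatever SIMD width the JIT supports on the machine, and falls back to the plain loop when there is no hardware acceleration.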

[0]: https://github.com/dotnet/runtime/blob/2a1b52a1b691c42a7f407...