by Kipters
1119 days ago
Yeah, that work in C# required a lot of other things to minimize allocations. What I was thinking of is something similar to how they implemented things like IndexOf [0], which is a pure C# implementation that the JIT compiles into code equivalent to what you would get from C++. The advantage of doing this kind of thing this way is that when ARM adds a 256-bit-wide SIMD extension, they will only need to support it in the Vector256 implementation for that code to work with no other changes.

[0]: https://github.com/dotnet/runtime/blob/2a1b52a1b691c42a7f407...
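
To make the idea concrete, here is a rough sketch of the pattern, not the actual runtime code behind [0]. It assumes .NET 7 or later, where the cross-platform `Vector128` helpers are available; the `VectorSearch` class and `IndexOfByte` method are hypothetical names used only for illustration.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static class VectorSearch
{
    // Hypothetical helper: returns the index of the first occurrence of value, or -1.
    public static int IndexOfByte(ReadOnlySpan<byte> span, byte value)
    {
        int i = 0;

        // The JIT picks the hardware instructions (SSE2, AdvSimd, ...) behind Vector128.
        if (Vector128.IsHardwareAccelerated && span.Length >= Vector128<byte>.Count)
        {
            ref byte start = ref MemoryMarshal.GetReference(span);
            Vector128<byte> target = Vector128.Create(value); // value broadcast to every lane
            int lastBlock = span.Length - Vector128<byte>.Count;

            for (; i <= lastBlock; i += Vector128<byte>.Count)
            {
                // Compare 16 bytes at once; matching lanes become 0xFF.
                Vector128<byte> block = Vector128.LoadUnsafe(ref start, (nuint)i);
                uint mask = Vector128.Equals(block, target).ExtractMostSignificantBits();
                if (mask != 0)
                {
                    // The lowest set bit corresponds to the first matching lane.
                    return i + BitOperations.TrailingZeroCount(mask);
                }
            }
        }

        // Scalar loop for the tail, or for everything when there is no SIMD support.
        for (; i < span.Length; i++)
        {
            if (span[i] == value)
            {
                return i;
            }
        }

        return -1;
    }
}
```

Nothing in that method names a specific instruction set, which is the point: a wider path would be the same loop written against `Vector256<byte>`, and which hardware backs it is the runtime's concern rather than the caller's.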