by janwas
525 days ago
Thanks for sharing :) Any thoughts on what kind of things you are looking for and didn't find? I cannot recall anyone saying this kind of thing is a bottleneck for them.
We don't use std::ranges, but searching for a negative value can look like:
https://gcc.godbolt.org/z/8bbb16Eea It looks like smaller codegen than EVE's https://godbolt.org/z/fEn9r175v?
Can you write the second one too? With two ranges? That's where I believe the variadics will be.
FYI: The codegen is smaller because the loop is not unrolled. That's 2x slower in my measurements. Also, I don't see any aligning of memory accesses; that would give you another third of improvement when the data is in L1. You really should fix that.
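For anyone following the unrolling point, here is a minimal scalar sketch of the idea, not code from either library. `find_first_negative` is a hypothetical name, and the 4x factor is an arbitrary choice for illustration. The main loop checks four elements with a single branch and only looks for the exact position once a block is known to contain a hit. A real SIMD version would do the same with vector loads and would also handle alignment, which this sketch does not.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper for illustration only (not part of Highway or EVE).
// Returns the index of the first negative element, or n if there is none.
std::size_t find_first_negative(const std::int32_t* data, std::size_t n) {
  std::size_t i = 0;

  // Unrolled main loop: four elements per iteration, one branch per block.
  for (; i + 4 <= n; i += 4) {
    // The OR of four values has its sign bit set iff at least one of them
    // is negative, so a single comparison covers the whole block.
    const std::int32_t any =
        data[i] | data[i + 1] | data[i + 2] | data[i + 3];
    if (any < 0) {
      // A hit is somewhere in this block: locate the exact position.
      for (std::size_t j = i; j < i + 4; ++j) {
        if (data[j] < 0) return j;
      }
    }
  }

  // Remainder loop for the last n % 4 elements.
  for (; i < n; ++i) {
    if (data[i] < 0) return i;
  }
  return n;
}
```

Whether unrolling pays off depends on the target and on where the data lives, so it is worth measuring rather than assuming.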