by msichert
44 days ago
Vector instructions usually don't have overflow flags, so a compiler can't easily vectorize loops containing overflow checks. However, detecting overflows in integer operations requires only a bit of bitwise arithmetic. In my experiments, this led to an overhead of only 7% for vectorized additions with overflow checks: https://cedardb.com/blog/vectorized_overflows/
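To make the "bit of bitwise arithmetic" concrete, here is a minimal sketch of a branch-free checked addition over arrays. It is not necessarily what the linked post does; the function name `add_checked` and the choice to accumulate a single overflow flag for the whole array are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper: adds a[i] + b[i] into out[i] and reports whether
// any element overflowed. There is no branch inside the loop, so the
// body is plain add/xor/and/or, which maps onto ordinary vector ops.
bool add_checked(const int32_t *a, const int32_t *b, int32_t *out, size_t n) {
    uint32_t overflow = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t ua = (uint32_t)a[i];
        uint32_t ub = (uint32_t)b[i];
        uint32_t sum = ua + ub;  // unsigned addition wraps, which is well-defined
        // Signed overflow happened iff both operands have the same sign
        // and the sum's sign differs: the top bit of this expression is set.
        overflow |= (ua ^ sum) & (ub ^ sum);
        // Converting back is implementation-defined before C23, but yields
        // the two's complement value on mainstream compilers.
        out[i] = (int32_t)sum;
    }
    return (overflow >> 31) != 0;
}
```

The check is deferred to the end of the loop, so the caller learns that some element overflowed but not which one; a second, slower pass could locate it if needed.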
|
The real killer isn't the data operations, though; it's when the overflow checks interfere with converting the loop logic or data addressing to vectorizable form. Indexing with a 32-bit signed int vs. a 32-bit unsigned int on a 64-bit platform in C is a classic case: signed overflow is undefined behavior, so the compiler may assume a signed index never wraps, whereas with unsigned it cannot assume that addressing offsets don't wrap, which can then prevent coalescing data accesses into vector loads and stores.
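A small sketch of the indexing case, with hypothetical function names. Both loops compute the same result whenever `start + i` stays in range. Whether a particular compiler actually vectorizes one and not the other depends on its version and flags, so treat this as an illustration of the semantics rather than a benchmark.

```c
#include <stddef.h>
#include <stdint.h>

// Unsigned 32-bit index: start + i is defined to wrap modulo 2^32, so the
// compiler must allow for src[start + i] jumping back to a low address
// partway through the loop. The accesses are not provably contiguous.
void copy_unsigned(float *restrict dst, const float *restrict src,
                   uint32_t start, uint32_t n) {
    for (uint32_t i = 0; i < n; i++)
        dst[i] = src[start + i];
}

// Signed 32-bit index: overflow of start + i is undefined behavior, so the
// compiler may assume it never happens, widen the index to 64 bits, and
// treat src[start + i] as one contiguous run.
void copy_signed(float *restrict dst, const float *restrict src,
                 int32_t start, int32_t n) {
    for (int32_t i = 0; i < n; i++)
        dst[i] = src[start + i];
}
```

Using `size_t` (or another pointer-width type) for the index sidesteps the question on 64-bit targets, since a 64-bit offset cannot wrap within any realistic array.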