by ori_b
5002 days ago
1) Loop versioning. You can check at run time whether the pointers are aligned, and then decide whether to take the fast path or the slow path. (Aliasing would be a better example: because pointers don't carry the size of the region they point to, we can't check whether two pointers alias. restrict solves this problem with programmer intervention.) I believe that GCC's vectorization code already does this sort of versioning. Example:
  testl $0xf,%ptr
  jz aligned_path // in reality, the fast path would be inlined here for cache locality reasons.
  jnz unaligned_slow_path
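For what it's worth, here is roughly what that versioning looks like in C. This is only a sketch: the function name scale_versioned, the 16-byte alignment, and the return value saying which path ran are all made up for illustration, and both loop bodies are written identically because the point is the branch, not the vectorized code a compiler would put on the fast path.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical example: scale n floats in place by k.
 * Returns 1 if the aligned path ran, 0 if the fallback ran. */
int scale_versioned(float *p, size_t n, float k)
{
    /* Same test as "testl $0xf,%ptr": are the low 4 address bits zero? */
    if (((uintptr_t)p & 0xf) == 0) {
        /* Fast version: a compiler may use aligned 16-byte loads/stores here. */
        for (size_t i = 0; i < n; i++)
            p[i] *= k;
        return 1;
    }

    /* Slow version: no alignment assumption is made. */
    for (size_t i = 0; i < n; i++)
        p[i] *= k;
    return 0;
}
```

Calling it on a 16-byte aligned buffer takes the first loop; calling it on the same buffer offset by one float takes the second.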
2) If your reads are naturally aligned, you will never be able to read a word that starts on one page and ends on another, so working in the largest naturally aligned chunks you can is valid. This is a non-problem. (In fact, glibc takes advantage of this in its assembly implementations of various SSE string functions.)
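The arithmetic behind point 2 can be checked with a few lines of C. This is a sketch under assumptions I'm adding: 4096-byte pages, a chunk size that is a power of two no larger than a page, and the hypothetical helper names crosses_page and count_aligned_crossings. It is not how glibc does it, just the page math.

```c
#include <stdint.h>

#define PAGE_BYTES 4096u /* assumed page size, for illustration only */

/* Nonzero if a read of `size` bytes starting at `addr` touches two pages. */
int crosses_page(uintptr_t addr, uintptr_t size)
{
    return (addr / PAGE_BYTES) != ((addr + size - 1) / PAGE_BYTES);
}

/* Count the naturally aligned `size`-byte reads within the first two
 * pages that straddle a page boundary. */
unsigned count_aligned_crossings(uintptr_t size)
{
    unsigned count = 0;
    for (uintptr_t addr = 0; addr < 2 * PAGE_BYTES; addr += size)
        count += (unsigned)crosses_page(addr, size);
    return count;
}
```

With a 16-byte chunk, every aligned read ends on or before the last byte of its page, while an unaligned read starting at offset 4090 spills into the next page.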