by gpderetta
1072 days ago
If they are not in the critical path, it doesn't matter. There are no instruction cache issues, as the loop is tiny. Because the loop is tiny it will also fit in the u-op cache (or even in the loop cache), so decoding is not an issue either. The only problem is a potential lack of vectorization, but a good vector ISA can in principle handle the bounds checking with masked reads and writes. In that case the check is no longer a predictable branch and might end up in the critical path, although that is not necessarily a big cost, or even a measurable one.
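To make the masked idea concrete, here is a rough sketch in portable C. It is an emulation only: `LANES`, `copy_checked_scalar`, `copy_checked_masked` and `masked_copy` are hypothetical names made up for illustration, and the inner per-lane loop stands in for what a single masked load/store pair would do on an ISA with mask or predicate registers. The first function keeps the bounds check as a branch; the second folds it into a per-chunk mask, so the check becomes a data dependency instead of a control dependency.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 8 /* hypothetical vector width, in 32-bit elements */

/* Scalar version: one bounds check per element, as a highly predictable branch. */
size_t copy_checked_scalar(int32_t *dst, size_t dst_len,
                           const int32_t *src, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (i >= dst_len) /* the bounds check */
            break;
        dst[i] = src[i];
    }
    return i; /* number of elements copied */
}

/* Stand-in for a masked load + masked store: only lanes whose mask bit
 * is set are read or written. Real hardware would do this in one
 * instruction pair; this loop merely emulates the semantics. */
static void masked_copy(int32_t *dst, const int32_t *src, uint8_t mask)
{
    for (size_t lane = 0; lane < LANES; lane++) {
        if (mask & (1u << lane))
            dst[lane] = src[lane];
    }
}

/* Masked version: the bounds check (and the loop tail) become mask bits. */
size_t copy_checked_masked(int32_t *dst, size_t dst_len,
                           const int32_t *src, size_t n)
{
    size_t limit = n < dst_len ? n : dst_len;
    for (size_t base = 0; base < limit; base += LANES) {
        uint8_t mask = 0;
        for (size_t lane = 0; lane < LANES; lane++) {
            /* bit is set only for lanes that are in bounds */
            mask |= (uint8_t)((base + lane < limit) << lane);
        }
        masked_copy(dst + base, src + base, mask);
    }
    return limit; /* number of elements copied */
}
```

Both functions copy the same elements; the difference is only in how the check is expressed. Whether the masked form ends up cheaper or dearer than the branch depends on the target and the surrounding code, so it would need measuring.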