by jcranmer
1386 days ago
See https://www.agner.org/optimize/instruction_tables.pdf and search for MXCSR (the LDMXCSR and STMXCSR instructions). Keep in mind that twiddling these flags requires saving the MXCSR register to memory, OR'ing or AND'ing bits in memory, and then loading that memory back into MXCSR. Saving and loading MXCSR each require a stall, because floating-point operations both read and write that register. So you need, at minimum, 4 L1 cache hits and 2 partial pipeline flushes to twiddle an MXCSR bit. (As far as I'm aware, modern microarchitectures generally don't register-rename the floating-point status register.)

Note that you wouldn't necessarily need to do a read-modify-write -- in most cases it'd suffice to just save the old value and then reset the whole MXCSR for the scope requiring special treatment.
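
To make the two approaches concrete, here is a rough sketch in C using the `_mm_getcsr` / `_mm_setcsr` intrinsics from `<xmmintrin.h>`, which compile to STMXCSR / LDMXCSR. The helper names (`mxcsr_set_bits`, `mxcsr_swap`, `scaled_with_ftz`) are made up for illustration, and 0x1F80 is assumed as the baseline because it is the architectural power-on default for MXCSR (all exceptions masked, round to nearest). It assumes an x86 target where float math is done in SSE.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_FLUSH_ZERO_ON */

/* Read-modify-write: store MXCSR, OR in the requested bits, load it back.
   Returns the previous value so the caller can restore it later. */
static unsigned int mxcsr_set_bits(unsigned int bits) {
    unsigned int old = _mm_getcsr();   /* STMXCSR */
    _mm_setcsr(old | bits);            /* LDMXCSR */
    return old;
}

/* Save-and-replace: store the old MXCSR, then load a complete new value
   without modifying individual bits of the old one. */
static unsigned int mxcsr_swap(unsigned int new_value) {
    unsigned int old = _mm_getcsr();   /* STMXCSR */
    _mm_setcsr(new_value);             /* LDMXCSR */
    return old;
}

/* Hypothetical scoped use: multiply with flush-to-zero enabled, then put
   the caller's MXCSR back. The volatile accesses are there to discourage
   the compiler from folding the multiply or moving it out of the scope. */
static float scaled_with_ftz(float x, float scale) {
    unsigned int saved = mxcsr_swap(0x1F80u | _MM_FLUSH_ZERO_ON);
    volatile float vx = x, vs = scale;
    volatile float result = vx * vs;   /* a subnormal result is flushed to 0 */
    _mm_setcsr(saved);                 /* restore the caller's settings */
    return result;
}
```

Either way you still pay for one STMXCSR and one LDMXCSR on entry, plus another LDMXCSR on exit; the save-and-replace form only drops the OR/AND step in between.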