by berkut
4773 days ago
> BTW, have you heard that the next Haswell processors can get only 5% performance improvement?

Bollocks. Using the new AVX2 gather instructions, I've seen close to a 40% increase over Ivy Bridge (IB) on some floating-point code I've hand-written with intrinsics. Existing C++ code is around 13-16% faster thanks to better cache bandwidth and a huge L4 cache. Turn on FMA (fused multiply–add) optimisation and that goes to ~20% faster.
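For anyone wondering what FMA actually means in code, here's a minimal sketch (not the code I benchmarked). The function name `dot_fma` is made up for illustration. It uses the standard `std::fma` from `<cmath>`, which computes `x * y + acc` with a single rounding. Whether that turns into a hardware FMA instruction depends on the compiler and target flags (for example `-mfma` or `-march=haswell` on GCC/Clang, as far as I know); without hardware support it may fall back to a slower software routine.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: dot product where each step is one fused multiply-add,
// acc = x[i] * y[i] + acc, rounded once per step instead of twice.
double dot_fma(const std::vector<double>& x, const std::vector<double>& y) {
    double acc = 0.0;
    // Use the shorter length so mismatched inputs never read out of bounds.
    const std::size_t n = x.size() < y.size() ? x.size() : y.size();
    for (std::size_t i = 0; i < n; ++i) {
        acc = std::fma(x[i], y[i], acc);
    }
    return acc;
}
```

Calling `dot_fma({1.0, 2.0, 3.0}, {4.0, 5.0, 6.0})` walks through 4, 14, 32. In practice the compiler can also contract plain `a * b + c` into FMA by itself when the flags allow it, so you don't necessarily have to write `std::fma` by hand.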