by jnordwick
594 days ago
I thought the branch target predictor on x64 was global, not local, and it has to kick in before decode, so even direct branches can be mispredicted. Branch prediction has two parts: the conditional predictor and the target predictor. The conditional predictor is actually per 64-byte instruction block, so if you have a few branches consecutively they share branch predictor entries and can step on each other. The target predictor uses a global history and needs to happen very early to keep the front end fed.

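To make the "step on each other" part concrete, here is a toy model in Python. It is only a sketch and not how any real CPU works: the 2-bit saturating counters, the weakly-not-taken starting state, the addresses `0x1000` and `0x1008`, and the `simulate` helper are all assumptions picked for illustration. Real predictors add history hashing and tagging that this leaves out. It runs two branches that sit in the same 64-byte block, one always taken and one never taken, and compares a table indexed per block with one indexed per branch address.

```python
def simulate(trace, index_fn):
    """Count mispredictions for a trace of (address, taken) pairs.

    Each table entry is a 2-bit saturating counter (0..3) that
    predicts "taken" when its value is 2 or 3. index_fn decides
    which entry a branch address maps to.
    """
    counters = {}
    mispredicts = 0
    for addr, taken in trace:
        key = index_fn(addr)
        state = counters.get(key, 1)  # assumed start: weakly not-taken
        predicted_taken = state >= 2
        if predicted_taken != taken:
            mispredicts += 1
        # Saturating update toward the actual outcome.
        counters[key] = min(state + 1, 3) if taken else max(state - 1, 0)
    return mispredicts


# Two hypothetical branches 8 bytes apart, inside one 64-byte block:
# the first is always taken, the second is never taken.
trace = [(0x1000, True), (0x1008, False)] * 1000

per_block = simulate(trace, lambda a: a >> 6)  # one entry per 64-byte block
per_branch = simulate(trace, lambda a: a)      # one entry per branch

print("shared per-block entry:", per_block, "mispredicts")
print("private per-branch entry:", per_branch, "mispredicts")
```

In this toy the shared entry gets dragged back and forth by the two branches, so the per-block table mispredicts far more often than the per-branch one on the same trace.
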
With all of the interactions between a million caches, predictors, instruction-level parallelism, different CPUs, different code, etc., it feels impossible to reason about.