by repstosb
115 days ago
The problem is that C++ stores the vtable pointer inside the object, and the objects over which you're iterating often weren't allocated contiguously. Even when they are, if each object contains lots of other data, the vtable pointers won't necessarily be close to each other. So invoking virtual functions inside a loop can mean a lot of cache misses, and since the value you're fetching determines a branch target, it's often hard to find other useful work to do during the memory delay cycles. However, in a language where you can store a relatively tight array of object IDs (or even use tag bits in the `this` pointer), you get a much higher cache hit rate on the indexes into your equally tight dispatch table, which will also have a high hit rate. It's a fair amount of extra work, but in a hot loop it's sometimes worth it.

"You can often solve correctness problems (tricky corner cases) by adding an extra layer of indirection. You can solve any performance problem by removing a layer of indirection."
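To make the "tight array of IDs plus a tight dispatch table" idea concrete, here is a minimal C++ sketch. Every name in it (`Kind`, `Shapes`, `kAreaTable`, `total_area`) is made up for illustration, and it assumes each object's payload fits in a single `double`. It only shows the shape of the technique; whether it actually beats virtual calls depends on your data and hardware, so measure before committing to it.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical object kinds. The tag doubles as an index into the dispatch table.
enum class Kind : std::uint8_t { Square = 0, Triangle = 1, Count };

constexpr std::size_t kKindCount = static_cast<std::size_t>(Kind::Count);

using AreaFn = double (*)(double);

inline double square_area(double side) { return side * side; }
inline double triangle_area(double leg) { return 0.5 * leg * leg; }  // right isosceles triangle

// One small table shared by every object, instead of one vtable pointer per object.
const std::array<AreaFn, kKindCount> kAreaTable = {{square_area, triangle_area}};

// Tags and payloads live in parallel, contiguous arrays (one byte per tag),
// so walking them touches far fewer cache lines than chasing scattered objects.
struct Shapes {
    std::vector<Kind> tags;
    std::vector<double> sizes;
};

double total_area(const Shapes& s) {
    assert(s.tags.size() == s.sizes.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < s.tags.size(); ++i) {
        const std::size_t k = static_cast<std::size_t>(s.tags[i]);
        assert(k < kAreaTable.size());     // guard against a corrupt tag
        sum += kAreaTable[k](s.sizes[i]);  // indirect call through the shared table
    }
    return sum;
}
```

The call is still indirect, so this doesn't remove the branch itself. What changes is where the index comes from: a densely packed byte array rather than a pointer buried in each scattered object.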