| "If it hits, it might yield big performance boost in real world apps. If it misses, it's just a little wasted electricity." Caches have limited size. If the guess misses, the speculatively loaded line also evicts something else from the cache. If the evicted line is what is actually needed, that costs performance. "and are pretty close to the original cache line's (virtual) address" Why does it have to be 'pretty close'? "Now if these values happen to be sane virtual addresses in the current process" That sanity check would involve visiting the page tables, so it would require at least two indirections (http://lwn.net/Articles/253361/). If a cache line is 16 bytes, there are at least 4 positions where a 32-bit pointer could be present. So at least four times two, i.e. eight, memory lookups would be needed. I think all of them would go through the same cache, but even assuming that the CPU has ways of signalling that it should not recurse, I do not think it is practical to do what you describe (disclaimer: I am not an expert on CPU design). What is possible is to guess at where data is to be found. That allows CPUs to read and speculatively execute instructions from the physical memory that they think backs the virtual address of the program counter (PC) while they, in parallel, do the lookup to verify that. See http://dl.acm.org/citation.cfm?id=2000101&dl=ACM&col.... I do not know whether this has made it into actual CPUs, though. |
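
To make the two cost arguments above concrete, here is a rough Python sketch. It is a toy model under my own assumptions, not a description of any real CPU: the cache is fully associative with LRU replacement, every wrong guess loads a line that is never used again, and the names `validation_lookups` and `count_demand_misses` are made up for illustration. The default parameters (16-byte line, 32-bit pointers, two page-table levels) are the numbers from the comment.

```python
from collections import OrderedDict


def validation_lookups(line_bytes=16, pointer_bytes=4, page_table_levels=2):
    """Lookups needed to sanity-check every aligned pointer slot in one line."""
    slots = line_bytes // pointer_bytes      # 4 aligned 32-bit slots in 16 bytes
    return slots * page_table_levels         # each check walks the page tables


def count_demand_misses(trace, capacity, useless_prefetch=False):
    """Replay a trace on a toy fully associative LRU cache; count demand misses."""
    cache = OrderedDict()                    # keys are line addresses, oldest first

    def touch(addr):
        hit = addr in cache
        cache[addr] = None
        cache.move_to_end(addr)              # mark as most recently used
        if len(cache) > capacity:
            cache.popitem(last=False)        # evict the least recently used line
        return hit

    misses = 0
    for i, addr in enumerate(trace):
        if useless_prefetch:
            touch(10_000 + i)                # wrong guess: a line never used again
        if not touch(addr):
            misses += 1
    return misses


trace = [0, 1, 2, 3] * 3                     # working set that exactly fits
baseline = count_demand_misses(trace, capacity=4)
polluted = count_demand_misses(trace, capacity=4, useless_prefetch=True)
print(validation_lookups(), baseline, polluted)
```

In this toy setup the run with wrong guesses takes more demand misses than the baseline, because each useless line pushes out one that the loop was about to reuse. Real caches are set associative and real prefetchers are throttled, so the size of the effect here says little about actual hardware; the point is only that a wrong guess is not free.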