by tptacek
5416 days ago
The parallels to high-end network security are again interesting here; for instance, on the more interesting network processors, memory accesses were asynchronous, and instead of a "cache" you had a manually addressed "scratch" memory with latency comparable to registers. ... because "general purpose cache" wasn't helpful, but "exploitable locality" definitely was.
|
In later GPUs they (mainly due to programmer laziness :-) ) switched to a more CPU-like, automatically managed general-purpose cache. In most cases, being able to quickly write code that is reasonably fast mattered more to developers than being able to squeeze out every cycle (though I believe it's still possible to disable the cache and manage it manually...)
The problem with an automatic cache is the same as with all "intelligent" CPU features: the CPU tries to predict what the program is doing, so the programmer has to predict what the CPU will predict in order to optimize for that CPU. In the next generation, the CPU vendor will try to optimize again for the programs currently around. In the end, optimizing becomes much more complex.
There's certainly an advantage to keeping the logic "dumb" and simply having the software manage everything.
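To make "the software manages everything" a bit more concrete, here is a minimal sketch in plain C. It is only an illustration, not code for any particular GPU or network processor: `TILE`, `stencil_direct`, and `stencil_scratch` are made-up names, and the `scratch` array is an ordinary stack buffer standing in for an on-chip scratch memory (roughly the role `__shared__` memory plays in CUDA). The point is the shape of the code: the program stages a tile plus its neighbors explicitly, then computes only from the staged copy, so the locality is stated by the programmer rather than guessed by the hardware.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TILE 64  /* hypothetical scratch capacity, in elements */

/* Reference version: out[i] = in[i-1] + in[i] + in[i+1], with
 * out-of-range neighbors treated as 0. Every read goes to "main memory". */
void stencil_direct(const int *in, int *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        int left  = (i > 0)     ? in[i - 1] : 0;
        int right = (i + 1 < n) ? in[i + 1] : 0;
        out[i] = left + in[i] + right;
    }
}

/* Software-managed version: same result, but each tile is copied into a
 * small scratch buffer first and the arithmetic reads only from scratch. */
void stencil_scratch(const int *in, int *out, size_t n) {
    assert(in != NULL && out != NULL);
    int scratch[TILE + 2];  /* one tile plus one halo element per side */

    for (size_t base = 0; base < n; base += TILE) {
        size_t len = (n - base < TILE) ? n - base : TILE;

        /* Stage: explicit copy of the tile and its two neighbors. */
        scratch[0] = (base > 0) ? in[base - 1] : 0;
        memcpy(&scratch[1], &in[base], len * sizeof(int));
        scratch[len + 1] = (base + len < n) ? in[base + len] : 0;

        /* Compute: scratch[i + 1] holds in[base + i]. */
        for (size_t i = 0; i < len; i++) {
            out[base + i] = scratch[i] + scratch[i + 1] + scratch[i + 2];
        }
    }
}
```

On a normal CPU the two functions will perform about the same, since the hardware cache already does the staging behind your back. The sketch only shows what the programmer has to write when that job is handed to software, and why the result does not depend on what a particular cache generation decides to predict.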