
> It's worth mentioning that they're writing for DSP systems.

Yes.

> More mainstream CPUs have a larger instruction cache (and smarter eviction policies), but the size of the hot code paths are (I imagine; I'm not terribly familiar with DSP) likely not much bigger, so that the problems they mention need solving less frequently for typical applications.

I'd have assumed the exact opposite. I assume that the size of the "hot enough" spots grows at least somewhat with the size of the binary (though probably much more slowly than linearly). DSP binaries are smaller, so I'd assume that their hot spots are smaller too, even for "equivalent" programs.

But DSP programs are different. Their data is more "regular" and their instruction sets are specialized for their typical uses, so I'd expect better instruction cache behavior from DSP programs.

Fun fact: access times and feature sizes are related, so if you maximize clock rates, the maximum size of a one-cycle cache is basically independent of the access time. I suspect that almost all implementations can afford such a cache.

I suspect that the same rule applies to L2 caches, but it's not clear that every implementation can afford the area. However, I suspect that implementations that have an L3 have maximized their L2.

DSP implementations are more likely to have traded cache size for dollars, so I expect that their caches are somewhat less effective on the same programs.

These two factors (more regular, cache-friendly programs versus smaller caches) have opposing effects.
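For what it's worth, here's a toy sketch of why hot-spot size and cache size pull against each other. Everything in it is an assumption for illustration, not a model of any real DSP or CPU: a direct-mapped instruction cache, 16-byte lines, 4-byte instructions, and a "hot spot" that is just a straight-line loop body fetched over and over. The names `hit_rate` and `loop_trace` are made up.

```python
def hit_rate(trace, num_lines, line_size=16):
    """Hit rate of a direct-mapped cache over a trace of byte addresses."""
    tags = [None] * num_lines          # one stored block number per cache line
    hits = 0
    for addr in trace:
        block = addr // line_size      # which memory block this fetch touches
        index = block % num_lines      # the only line that block may occupy
        if tags[index] == block:
            hits += 1
        else:
            tags[index] = block        # miss: evict whatever was there
    return hits / len(trace)


def loop_trace(hot_bytes, iterations, insn_size=4):
    """Fetch addresses for a straight-line loop body executed repeatedly."""
    body = range(0, hot_bytes, insn_size)
    return [addr for _ in range(iterations) for addr in body]


if __name__ == "__main__":
    # Assumed sizes: hot loops of 512 B and 2 KiB, caches of 1 KiB and 2 KiB.
    for hot_bytes in (512, 2048):
        for num_lines in (64, 128):
            rate = hit_rate(loop_trace(hot_bytes, 100), num_lines)
            print(f"hot={hot_bytes}B cache={num_lines * 16}B hit_rate={rate:.4f}")
```

In this sketch a loop that fits in the cache only pays for its first pass, while a loop twice the size of the cache misses on every line it touches, because each line gets evicted before the loop comes back around. A smaller hot spot and a smaller cache can therefore cancel out, which is the "opposing effects" point. Real caches are usually set-associative and real code branches, so treat the numbers as qualitative.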