Hacker News
by lqdc13 3615 days ago
Why not just use a Xeon 2697-v2 for the same price as the Phi?

It has 12 cores, so performance with all cores busy would be about the same as this one. But on non-parallelized code it would be ~5x faster.


Memory bandwidth is important too. The Knights Landing processors have 16GB of on-chip memory, so the cores get significantly higher bandwidth than you'd get with DDR4; for some algorithms, the additional memory bandwidth has more impact on runtime than raw compute performance does.
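To make the bandwidth point concrete, here is a rough roofline-style sketch. The function name and every figure in it (peak FLOP rate, both bandwidths, the kernel's bytes per element) are illustrative assumptions, not measurements of any real part; the point is only that a streaming kernel's estimated runtime follows memory bandwidth, not peak compute.

```python
def estimated_runtime_s(flops, bytes_moved, peak_flops_per_s, bandwidth_bytes_per_s):
    """Roofline-style lower bound: the slower of compute time and memory-traffic time."""
    compute_time = flops / peak_flops_per_s
    memory_time = bytes_moved / bandwidth_bytes_per_s
    return max(compute_time, memory_time)


# Placeholder hardware figures, assumed for illustration only.
PEAK_FLOPS = 3.0e12   # FLOP/s
DDR_BW = 90e9         # bytes/s, stand-in for off-package DDR4
NEAR_BW = 400e9       # bytes/s, stand-in for the in-package memory

# A streaming kernel such as a[i] = b[i] + s * c[i] on doubles:
# 2 FLOPs and 24 bytes of traffic per element.
n = 10**9
flops = 2 * n
bytes_moved = 24 * n

t_ddr = estimated_runtime_s(flops, bytes_moved, PEAK_FLOPS, DDR_BW)
t_near = estimated_runtime_s(flops, bytes_moved, PEAK_FLOPS, NEAR_BW)

print(f"estimated speedup from the faster memory: {t_ddr / t_near:.2f}x")
```

With these assumed numbers the kernel is memory-bound in both cases, so the estimated speedup is just the ratio of the two bandwidths. Extra compute throughput would not change it.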
The optional 16GB L3 is on separate chips, but it's colocated inside the same chip package. MCMs (multi-chip modules) of this kind have been used in the semiconductor industry since the 70s. Recent examples include the AMD Xenos in the Xbox 360, the Wii U CPU, and IBM POWER chips.
Nope. First, it's not L3 cache, and second, comparing 3D-stacked in-package memory (MCDRAM, HBM, HBM2) with your examples is misleading. https://en.wikipedia.org/wiki/High_Bandwidth_Memory
You can configure the near memory to be used as cache or directly addressed memory as desired. Users of existing codes will configure it as cache.
Direct addressing is the preferred configuration. Only if your existing code's working set does not fit in MCDRAM does the cache configuration make sense.

It might sound pedantic on my part, but 'it can act as a cache' is very different in practice from 'it is a cache'.
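The rule of thumb from the configuration comments above can be sketched as a tiny helper. The function name and the "flat"/"cache" return labels are hypothetical, chosen for this illustration; the 16GB capacity is the figure quoted in the thread, and real tuning would also depend on access patterns.

```python
MCDRAM_BYTES = 16 * 2**30  # 16GB near-memory capacity, as quoted in the thread


def suggest_memory_mode(working_set_bytes, mcdram_bytes=MCDRAM_BYTES):
    """Hypothetical helper: prefer direct addressing when the working set fits."""
    if working_set_bytes <= mcdram_bytes:
        return "flat"   # directly addressed near memory
    return "cache"      # near memory acts as a cache in front of DDR


print(suggest_memory_mode(8 * 2**30))
print(suggest_memory_mode(64 * 2**30))
```

This only encodes the "does it fit?" test from the comment; it says nothing about codes that could be restructured to keep just their hot data in near memory.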