

by jauntywundrkind 917 days ago

I don't disagree anywhere but I don't think any of these statements actually condemn Xeon Phi outright. It didn't work at the time, and doing it with so little software support to tile out workloads well was a big & possibly bad gambit, but I'm not sure we can condemn the architecture. There seem to be so few folks who made good attempts, succeeded or failed, & wrote about it.

I tend to think there was tons of untapped potential still on the table. And that a failure to tap that potential isn't purely Intel's fault. The story we are commenting on is about the rest of the industry trying to figure out enduring joint strategies, and much of this is chipmaker provided, but it is also informed and helped by plenty of consumers also pouring energy in to figure out what's working and not, trying to push the bounds.

Agreed that anyone going in thinking Xeon Phi would be viable for running a boring everyday x86 workload was going to be sad. To me the promise seemed clear that existing toolchains & code would work, but it was always clear to me there were a bunch of little punycores & massive SIMD units and that doing anything not SIMD intensive wasn't going to go well at all. But what's the current trend? Intel and AMD are both actively building not punycores but smaller cores, with Sierra Forest and Bergamo. E-cores are the grown up Atom we saw here.

Yes the GPGPU folks were winning. They had a huge head start, were the default option. And Intel was having trouble delivering nodes. So yes, Xeon Phi was getting trounced for real reasons. But they weren't architectural issues! It just means the Xeon Phi premise was becoming increasingly handicapped.

As I said I broadly agree everywhere. Your core point about giving the market more of what it already does is well taken, is a river of wisdom we see again and again. But I do think conservative thinking, iterating along, is dangerous thinking that obstructs us from seeing real value & possibility before us. Maybe Intel could have made a better ML chip than the GPGPU market has gotten for years, had things gone differently; I think the industry could perhaps have been glad they had veered onto a new course, but the barriers to that happening & the slowdown in Intel delivery & the difficulty bootstrapping new software were all horrible encumbrances which were rightly more than was worth bearing together.


vkazanov 917 days ago

I don't think anybody seriously considered Phis for generic compute or something.

Most experimenters saw it as a way to have something GPU-like in terms of raw power but with none of the limitations characteristic of SIMT. Like, slightly different code paths for threads doing number crunching or something.
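To make the "slightly different code paths" point concrete, here is a toy cost model in Python. It is only a sketch, not a GPU simulator: the function names, the 32-lane group size, and the cycle counts are all made up for illustration. The assumption is that a lock-step SIMT group runs each taken branch in turn with the other lanes masked off, while independent cores each run only their own path.

```python
# Toy cost model (hypothetical numbers), not a measurement of any real hardware.

def simt_cycles(takes_branch_a, cost_a, cost_b):
    """Cycles for one lock-step group: every taken path is executed serially."""
    cycles = 0
    if any(takes_branch_a):
        cycles += cost_a  # lanes on path B are masked off and idle
    if not all(takes_branch_a):
        cycles += cost_b  # lanes on path A are masked off and idle
    return cycles


def independent_core_cycles(takes_branch_a, cost_a, cost_b):
    """Cycles when each thread has its own core: bounded by the slowest thread."""
    return max(cost_a if a else cost_b for a in takes_branch_a)


uniform = [True] * 32                     # every lane takes path A
mixed = [i % 2 == 0 for i in range(32)]   # lanes alternate between A and B

print(simt_cycles(uniform, 100, 80))              # 100
print(simt_cycles(mixed, 100, 80))                # 180
print(independent_core_cycles(mixed, 100, 80))    # 100
```

Under these assumptions the divergent group pays for both paths (180 cycles) while independent cores pay only for the longer one (100), which is roughly the gap people hoped Phi's many small cores would close.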

But it turns out that it's easier to force everything into a matrix. Or a very big matrix. Or a very-very-very big matrix.

And then see what sticks.


janwas 916 days ago

Why are we not also talking about memory bandwidth? Personal opinion: this is the key. The latest Phi had about 100 GB/s in 2017. The contemporary Nvidia GTX 1080: 320 GB/s.

When CPUs actually come with bandwidth and a decent vector unit, such as the A64FX, lo and behold, they lead the Top500 supercomputer list, also beating out GPUs of the day.

Why have we not been getting bandwidth in CPUs? Is it because SPECint benchmarks do not use much? Or because there is too much branch-heavy code, so we think hundreds of cores are helpful?

Existing machines are ridiculously imbalanced, hundreds of times more compute vs bandwidth than the 1:1 still seen in the 90s. Hence matmul as a way of using/wasting the extra compute.
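A rough way to put numbers on that imbalance, and on why matmul soaks up the spare compute, is to compare FLOPs per byte. The Python sketch below is only an illustration: the 10 TFLOP/s and 100 GB/s machine is hypothetical, and the matmul figure assumes the best case where each matrix crosses the memory bus exactly once.

```python
# Back-of-the-envelope sketch; the machine figures below are hypothetical.

def machine_balance(peak_flops, bandwidth_bytes_per_s):
    """FLOPs the machine can execute per byte it can move from memory."""
    return peak_flops / bandwidth_bytes_per_s


def matmul_intensity(n, bytes_per_element=4):
    """FLOPs per byte for an n x n matmul, assuming A and B are read once and C written once."""
    flops = 2 * n**3                          # n^3 multiply-adds
    traffic = 3 * n * n * bytes_per_element   # A, B and C each cross the bus once
    return flops / traffic


def axpy_intensity(bytes_per_element=4):
    """FLOPs per byte for y = a*x + y: 2 FLOPs per element, read x and y, write y."""
    return 2 / (3 * bytes_per_element)


balance = machine_balance(10e12, 100e9)   # hypothetical 10 TFLOP/s, 100 GB/s
print(balance)                  # 100.0 FLOPs per byte
print(matmul_intensity(1024))   # about 170.7 FLOPs per byte
print(axpy_intensity())         # about 0.17 FLOPs per byte
```

On such a machine a streaming kernel like axpy can use well under 1% of peak compute, while a large matmul has enough reuse per byte to keep the arithmetic units busy.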

The AMD MI300A looks like a very interesting development: >5 TB/s shared by 24 cores plus GPUs.
