Doesn't AVX-512 crush the clock? And yet they still quote specs using the non-AVX-512 clock speeds for floating point somehow. I've always felt there was some game-playing here.
That's something I addressed, perhaps not clearly enough.
Having 2x 512-bit units crushes the clocks. That is a function of vector width (and of having multiple units), not of the instruction set itself. Half-rate AVX-512 would in all probability run at the same speeds as AVX2, and allows significantly more flexibility than "mere" 256-bit AVX2.
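To put rough numbers on the width-vs-clock tradeoff, here is a back-of-the-envelope sketch. It assumes peak throughput is simply units x lanes x 2 (counting an FMA as two flops) x clock, and the function names and the 3.0 GHz figure are made up for illustration, not measured or vendor-specified values.

```python
def peak_gflops(units, width_bits, ghz, elem_bits=64):
    """Theoretical peak GFLOP/s per core, counting one FMA as 2 flops."""
    lanes = width_bits // elem_bits  # e.g. 512-bit vector = 8 doubles
    return units * lanes * 2 * ghz


def breakeven_ghz(base_units, base_width, base_ghz, units, width):
    """Clock at which the (units, width) config matches the baseline's peak."""
    per_ghz = peak_gflops(units, width, 1.0)
    return peak_gflops(base_units, base_width, base_ghz) / per_ghz


# Hypothetical clocks, chosen only to show the arithmetic.
dual_256 = peak_gflops(2, 256, 3.0)   # AVX2-style: two 256-bit units
single_512 = peak_gflops(1, 512, 3.0) # "half-rate" AVX-512: one 512-bit unit
dual_512_breakeven = breakeven_ghz(2, 256, 3.0, 2, 512)

print(dual_256, single_512, dual_512_breakeven)
```

Under this simple model, a single 512-bit unit has the same peak per clock as two 256-bit units, while two 512-bit units only come out ahead as long as the clock stays above half of the AVX2 clock. Real workloads are rarely pure vector FMA, so treat this as an upper bound on the benefit, not a prediction.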
That's perhaps something that is not clear from Linus - whether he is railing against the instruction set, or the implementation in Skylake-SP, or both (I think it's both). I would probably agree that Skylake-SP takes it too far with dual AVX-512 units, but the instruction set itself ("stop inventing magic instructions"), yeah he's way off base.
I am really hoping that one of these Zen generations, AMD includes half-rate AVX-512 like they had half-rate AVX2 in Zen1/Zen+. That really seems like the best of both worlds, you get the flexibility of AVX-512 but without the die area or the power issues. If not, well, Rocket Lake is coming end of this year/early next year and that should bring AVX-512 to consumer platform desktop (as it's in Tiger Lake and Rocket Lake will be a Tiger Lake backport) and will probably have only a single AVX-512 unit since it's mobile-derived.
Do you / Intel GUARANTEE no frequency scaling to handle 512?
Intel quotes these specs on throughput and seems to do it using full frequency for FP. But if you have 512 in your mix (i.e., 1-2% AVX-512), do you really get that speed? Cloudflare talks about frequency dropping to 1.4 GHz.