The bigger point is that there was no reason to kill X399 support at all. It's the same physical socket, the socket is capable of supporting much more than Threadripper did with it (Epyc uses the same socket for 8 memory channels), and the power consumption has not increased significantly compared to TR 2000 series.
There was no reason to kill TR4. It could have been a "legacy" board with PCIe 3.0 support, like X470 is for the desktop socket.
AMD just killed TR4 because they wanted everyone to buy new boards. The classic Intel move.
(Meanwhile, Intel put a new generation of chips on X299, alongside compatible X299X refresh boards that increase lane count. Intel doing it right for once, AMD doing it wrong for once.)
Which is kinda unnecessary: no single GPU on the market can saturate PCIe 3.0, and workloads that need sustained transfers between multiple M.2 SSDs fast enough to saturate PCIe 4.0 are very rare. 100Gbps+ LAN is probably the only practical use, and only for a handful of pro users.
Actually, it's pretty easy to get bandwidth-bottlenecked in GPU compute.
I know video games don't really get bandwidth-bottlenecked, but all you gotta do is perform a "Scan" or "Reduce" on the GPU and bam, you're PCIe-bottlenecked. (I recommend NVIDIA CUB or AMD rocPRIM for these kinds of operations.)
I pushed 1GB of data to a device-side reduce the other day (just playing with rocPRIM), and it took ~100ms to hipMemcpy the 1GB of data to the GPU, but only ~5ms to actually execute the reduce. That's a PCIe bottleneck for sure. (Numbers from memory... I don't remember them exactly, but that was roughly the magnitude we're talking about.) That was over PCIe 3.0 x16, which seems to only push about 10GB/s one-way in practice. (About 15.75GB/s in theory, but practice is always lower than the spec.)
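To make the arithmetic concrete, here's a rough back-of-envelope sketch in Python. It only models the figures quoted above (1GB payload, ~10GB/s observed, ~5ms reduce); the function and variable names are made up for illustration, and it is not a benchmark of any real hardware.

```python
# Back-of-envelope model of the numbers above (all figures approximate).
GB = 1e9  # decimal gigabyte, in bytes


def transfer_ms(num_bytes, bandwidth_gb_per_s):
    """Time in ms to move num_bytes over a link with the given one-way bandwidth."""
    return num_bytes / (bandwidth_gb_per_s * GB) * 1e3


payload = 1 * GB   # the 1GB buffer pushed to the GPU
kernel_ms = 5.0    # rough reduce time quoted above (assumed, from memory)

copy_practical_ms = transfer_ms(payload, 10.0)     # ~10GB/s seen in practice
copy_theoretical_ms = transfer_ms(payload, 15.75)  # PCIe 3.0 x16 theoretical peak

# Fraction of the total wall time spent just moving data over the bus.
copy_share = copy_practical_ms / (copy_practical_ms + kernel_ms)

print(f"copy at 10 GB/s:    {copy_practical_ms:.1f} ms")
print(f"copy at 15.75 GB/s: {copy_theoretical_ms:.1f} ms")
print(f"share of time spent on the copy: {copy_share:.0%}")
```

Even at the theoretical peak the copy would take over 60ms, so the transfer still dwarfs the 5ms of actual compute.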
Yeah, I know CPU / GPU have like 10us of latency, but you can easily write a "server" kind of CPU-master / GPU-slave scheduling algorithm to send these jobs down to the GPU. So you can write software to ignore the latency problem in many cases.
Software can't solve the bandwidth problem however. You gotta just buy a bigger pipe.
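A toy model of why that is, as a sketch under simple assumptions: latency is a fixed cost per submission that can be shared by batching jobs, while every byte still has to cross the bus. The function name, the 10us latency, and the 10GB/s figure are illustrative assumptions, and overlapping copies with compute is ignored.

```python
def time_per_job_us(job_bytes, jobs_per_batch,
                    latency_us=10.0, bandwidth_gb_per_s=10.0):
    """Average time per job when jobs_per_batch jobs share one CPU->GPU submission."""
    bytes_per_us = bandwidth_gb_per_s * 1e9 / 1e6  # link speed in bytes per microsecond
    transfer_us = job_bytes / bytes_per_us         # paid in full by every job
    amortized_latency_us = latency_us / jobs_per_batch  # shrinks as the batch grows
    return amortized_latency_us + transfer_us


job = 1_000_000  # 1MB of input per job

for batch in (1, 10, 100):
    print(batch, round(time_per_job_us(job, batch), 2))
```

Batching pushes the latency term toward zero, but the per-job time never drops below the transfer term. Only a faster link (a larger `bandwidth_gb_per_s`) moves that floor.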