by dragontamer 3187 days ago
AMD already does "Zero Copy transfers" (the on-chip cache!!) with its "Fusion" APUs (ex: A10-7850K) for CPU <<---->> GPU.

I'm not really seeing loads of people taking advantage of the feature, however. The platform is cheap and the technology is available, but it's just way too weird an architecture to become mainstream.

There are numerous benefits: the CPU can create a linked list or graph, and the memory will still be valid on the GPU. CPU / GPU atomics are unified, and GPUs can even call CPU functions under AMD's HSA platform.

* https://images.anandtech.com/doci/7677/20%20-%20HSA%20Use%20...

* http://developer.amd.com/wordpress/media/2012/10/hsa10.pdf

* https://www.anandtech.com/show/7677/amd-kaveri-review-a8-760...
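
To make the linked-list point above concrete, here is a toy Python model of why pointer-based structures only survive when both processors share one address space. This is not HSA or OpenCL code: memory is modeled as a plain dict, and the addresses and helper names (`build_list`, `walk`, `copy_to_device`) are made up for illustration.

```python
# Toy model: memory is a dict mapping address -> (value, next_address).
# All addresses and helper names are hypothetical, for illustration only.

NODE_SIZE = 16  # assumed node stride in the pretend address space


def build_list(memory, base, values):
    """Build a linked list in `memory` starting at `base`; return the head address."""
    addr = base
    for i, value in enumerate(values):
        is_last = i == len(values) - 1
        memory[addr] = (value, None if is_last else addr + NODE_SIZE)
        addr += NODE_SIZE
    return base if values else None


def walk(memory, head):
    """Follow next-pointers from `head`, collecting the values."""
    out = []
    addr = head
    while addr is not None:
        value, addr = memory[addr]
        out.append(value)
    return out


def copy_to_device(host, host_base, device_base):
    """Model a discrete-GPU copy: nodes land at new addresses, but the
    stored next-pointers are copied unchanged (nothing patches them)."""
    return {a - host_base + device_base: node for a, node in host.items()}


host = {}
head = build_list(host, 0x1000, [1, 2, 3])

# Shared address space: the "GPU" walks the very same memory.
shared_result = walk(host, head)

# Separate address space: the copied nodes still hold host addresses.
device = copy_to_device(host, 0x1000, 0x9000)
try:
    walk(device, 0x9000)
    stale_pointer_hit = False
except KeyError:
    stale_pointer_hit = True
```

In the shared case the walk just works. In the copied case the first node's `next` still points at a host address that does not exist on the device side, so every pointer would have to be rewritten (or the structure flattened) before the transfer.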

---------

I think Intel had a similar technology on their "Crystalwell" chips, which basically added an L4 cache that provided a high-bandwidth link between the CPU and GPU (although not quite as flexible).

No, it's not an FPGA, but OpenCL / GPGPU compute seems to be a bit more mainstream than FPGA compute at the moment. I haven't seen too much excitement in general for this feature, however.

1 comments

> I'm not really seeing loads of people taking advantage of the feature, however. The platform is cheap and the technology is available, but it's just way too weird an architecture to become mainstream.

AMD sells a consumer product. For most consumers even a smartphone offers enough CPU and GPU performance. The content producers who care about performance usually buy the best CPU and GPU. HSA isn't available on AMD's Ryzen or Threadripper processors.

Intel is trying to sell to datacenters where performance or energy efficiency is a major selling point.

> The content producers who care about performance usually buy the best CPU and GPU. HSA isn't available on AMD's Ryzen or Threadripper processors.

Raven Ridge will be based on Zen CPU cores and Vega GPU cores. But naturally, Raven Ridge will be slower than Threadripper because the GPU will take up some space (that otherwise would have been additional CPU cores).

The rumored specs of the Raven Ridge APU are 4 CPU cores and 11 GPU Compute Units. In contrast, Threadripper has around 16 CPU cores and Vega 64 has 64 GPU Compute Units, connected over a PCIe x16 link.

So basically, it's the price you pay for sticking so many things onto a single package. There are thermal limits, as well as manufacturing limits (i.e. practical yield sizes), on how large these chips can be.

If you want the best of both worlds, like an EPYC CPU with Vega 64 or a high-end NVIDIA Pascal / Volta chip, you'll need to buy a dedicated GPU and a dedicated CPU. True, a hybrid chip like Raven Ridge (or any of the AMD HSA stuff) has benefits with regard to communication, but the penalty to CPU speed and/or GPU speed seems to be huge.

-----------

I personally expect that if any "mainstream" FPGA solutions come out, they'll be connected over PCIe and not merged into the CPU. There seem to be just too many heat and manufacturing issues to make a merged product worthwhile compared to standard PCIe x16, which is quite fast.
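
For a rough sense of "quite fast", here is a back-of-the-envelope sketch in Python. It assumes PCIe 3.0 (8 GT/s per lane with 128b/130b encoding), which is my assumption since no generation is named above, and it gives a theoretical peak only: real transfers lose more to packet overhead and latency.

```python
# Assumption: PCIe 3.0 (8 GT/s per lane, 128b/130b line encoding).
# The comment above does not name a PCIe generation.
TRANSFERS_PER_S = 8e9   # per lane
ENCODING = 128 / 130    # payload bits per transmitted bit
LANES = 16


def pcie_bytes_per_second(transfers_per_s=TRANSFERS_PER_S,
                          encoding=ENCODING, lanes=LANES):
    """Theoretical peak bandwidth in one direction, in bytes per second."""
    return transfers_per_s * encoding * lanes / 8


def transfer_seconds(num_bytes):
    """Lower bound on the time to move num_bytes over the link
    (ignores packet overhead and latency)."""
    return num_bytes / pcie_bytes_per_second()


peak = pcie_bytes_per_second()
print(f"{peak / 1e9:.2f} GB/s peak per direction")
print(f"{transfer_seconds(1 << 30) * 1e3:.1f} ms to move 1 GiB")
```

So bulk transfers are cheap at this rate; the cost that a merged CPU/GPU avoids is more about copying and per-transfer latency than raw bandwidth.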

Alternatively, certain tasks (like cryptography) can be accelerated using dedicated instructions, like Intel's AES-NI instruction set, or fixed-function hardware, like Intel's Quick Sync H.264 encoder. Fully dedicated chip space (like AES-NI) is way faster and more power-efficient than an FPGA, after all.