by ctw 3120 days ago
In my head, CPUs have few cores (1-16), but those cores are very highly optimized and have lots of specialized hardware in them. In comparison, GPUs have many cores (thousands) that are comparatively dumber, but that's fine because we use GPUs when we want to run lots of simple operations in parallel, like image processing, etc.

With that in mind, where does this chip fit into the current space? Is it meant to replace both? It contains two different kinds of cores, which I'm assuming are similar to the complex cores we have in CPUs and the simple cores we have in GPUs. Does this mean that if chips like this are used in the future, we wouldn't have separate processing units?

Also, how does one use an SoC like this? What are its inputs and outputs? How do I access those? Do I need specialized hardware? Can I plug it into my existing desktop? Do they expect a new system to be built around this, or is it a drop-in replacement for a part in an existing system?

4 comments

Once you get above a few cores, the only way to feasibly utilize them is to have data parallelism. And when you need to exploit data parallelism, programming models change and data bandwidth (both in interconnect between CPUs and to/from memory) becomes the limiting factor rather than compute bandwidth.
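
To put rough numbers on the bandwidth point, here is a minimal roofline-style sketch. The chip figures (`PEAK_GFLOPS`, `MEM_BANDWIDTH_GBS`) are made up for illustration and do not describe any real part; the only idea being shown is that attainable throughput is the smaller of the compute limit and bandwidth times flops-per-byte.

```python
def attainable_gflops(peak_gflops, mem_bandwidth_gbs, flops_per_byte):
    """Upper bound on throughput: limited by compute or by memory traffic,
    whichever is lower (a simplified roofline model)."""
    return min(peak_gflops, mem_bandwidth_gbs * flops_per_byte)


# Hypothetical many-core chip; both numbers are assumptions for illustration.
PEAK_GFLOPS = 1000.0        # aggregate compute across all cores
MEM_BANDWIDTH_GBS = 100.0   # bandwidth to/from memory

# Vector add c[i] = a[i] + b[i] on 4-byte floats:
# 1 flop per 12 bytes moved (two loads and one store).
vector_add_intensity = 1 / 12
vector_add_bound = attainable_gflops(
    PEAK_GFLOPS, MEM_BANDWIDTH_GBS, vector_add_intensity
)

# A kernel that reuses data heavily, e.g. 50 flops per byte moved.
dense_bound = attainable_gflops(PEAK_GFLOPS, MEM_BANDWIDTH_GBS, 50.0)

print(f"vector add bound:   {vector_add_bound:.1f} GFLOP/s")
print(f"dense kernel bound: {dense_bound:.1f} GFLOP/s")
```

Under these assumed numbers the vector add is stuck far below peak no matter how many cores you add, which is the sense in which bandwidth rather than compute becomes the limit.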

The only way to fit a large number of cores on a die is to slim down the cores by removing space-hungry stuff (e.g., out-of-order execution). This makes the cores weak for single-threaded stuff, which means the chip is generally firmly in the HPC market and not suitable for personal use. It also pretty much requires developing different programming models, at which point the value-add compared to, say, a GPU seems hard to find. Note that one of the key points of the GPU model is that it oversubscribes the processors with work and swaps threads in and out while they're waiting for memory to complete, something like 16-way SMT.
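
A back-of-the-envelope sketch of why oversubscription helps, assuming a toy model where every thread alternates between a fixed burst of compute and a fixed memory stall. The cycle counts are hypothetical, picked only so the result lands near the 16-way figure; real hardware is far less regular.

```python
import math


def core_utilization(num_threads, compute_cycles, memory_latency):
    """Fraction of cycles the core does useful work when it can switch to
    another ready thread whenever the current one stalls on memory."""
    busy = num_threads * compute_cycles
    period = compute_cycles + memory_latency
    return min(1.0, busy / period)


def threads_to_hide_latency(compute_cycles, memory_latency):
    """Smallest thread count that keeps the core busy in this toy model."""
    return math.ceil((compute_cycles + memory_latency) / compute_cycles)


# Hypothetical numbers, chosen for illustration only.
COMPUTE_CYCLES = 20    # work a thread does between memory accesses
MEMORY_LATENCY = 300   # cycles a thread waits for memory

one_thread = core_utilization(1, COMPUTE_CYCLES, MEMORY_LATENCY)
sixteen_threads = core_utilization(16, COMPUTE_CYCLES, MEMORY_LATENCY)
needed = threads_to_hide_latency(COMPUTE_CYCLES, MEMORY_LATENCY)

print(f"1 thread:   {one_thread:.1%} busy")
print(f"16 threads: {sixteen_threads:.1%} busy")
print(f"threads needed to hide latency: {needed}")
```

With one thread the core mostly sits idle waiting on memory; with enough resident threads there is always one ready to run, so the stall is hidden without any out-of-order machinery.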

I'll note that Intel did build something like this (the Xeon Phi), but they appear to be dropping it.

The GPU has many cores, but they are organised in groups that share the instruction cache and decode logic, so they all execute the same program in lock-step (AMD calls these groups 'wavefronts' and NVIDIA calls them 'warps').

A many-core RISC-V has many independent cores, each with its own cache and decode logic, so they are all able to run different code and stall independently of each other.

There are problems beyond graphics where warps or wavefronts are great matches, and for those problems GPGPUs are very effective. But there are also problems where, even if each core is basically running the same program, cores can diverge and for that you want cores with their own decoders so they can stall or branch independently.
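
Here is a rough sketch of the divergence cost, under a deliberately simple assumption: a lock-step group that disagrees on a branch has to walk both sides one after the other (with inactive lanes masked off), while independent cores each walk only their own side. The function names and cycle costs are hypothetical, and the model ignores the extra die area those per-core decoders cost.

```python
def lockstep_cycles(takes_then, then_cost, else_cost):
    """Cycles for one lock-step group: every path that at least one lane
    takes is executed by the whole group, one path after the other."""
    cycles = 0
    if any(takes_then):
        cycles += then_cost
    if not all(takes_then):
        cycles += else_cost
    return cycles


def independent_cycles(takes_then, then_cost, else_cost):
    """Cycles until all independent cores finish: each core runs only its
    own path, so the slowest single path sets the wall time."""
    return max(then_cost if t else else_cost for t in takes_then)


THEN_COST, ELSE_COST = 100, 80  # made-up cycle counts for the two paths

uniform = [True] * 8            # every lane takes the same branch
divergent = [True, False] * 4   # lanes split evenly across the branch

uniform_lockstep = lockstep_cycles(uniform, THEN_COST, ELSE_COST)
divergent_lockstep = lockstep_cycles(divergent, THEN_COST, ELSE_COST)
divergent_independent = independent_cycles(divergent, THEN_COST, ELSE_COST)

print("uniform, lock-step:     ", uniform_lockstep)
print("divergent, lock-step:   ", divergent_lockstep)
print("divergent, independent: ", divergent_independent)
```

When the lanes agree, lock-step costs nothing extra and you save all that decode hardware; when they split, the group pays for both paths.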

It's all a tradeoff.

> In comparison, GPUs have many cores (thousands) that are comparatively dumber, but that's fine because we use GPUs when we want to run lots of simple operations in parallel, like image processing, etc.

They don't have that many cores. Look at AMD's Vega 64. It's right in the name. It has 64 cores (compute units, in AMD's terms). Each core is very dumb but also very big, with 64 ALUs.

And in the same vein, does this encroach on some use cases for FPGAs?
Surely, even with parallel programming being notoriously hard to get right, it's most probably easier than FPGA programming.