by cwzwarich 3949 days ago
Two factors require any GC support to be behind the CPU's MMU:

1) GC itself operates on virtual addresses.

2) If you want concurrent collection you're probably going to need a read barrier, and that will require some GC / MMU interaction.

The Azul Vega had a lot of interesting features to support GC (and other Java constructs), but the most important by far is the HW read barrier.
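
For anyone unfamiliar with the term, here is a minimal software sketch of what a read barrier does, assuming a simple forwarding-pointer scheme. It is not how Vega implements it (that is done in hardware), and `Obj`, `relocate`, and `read_ref` are made-up names for illustration only.

```python
# Toy model of a read barrier. All names are hypothetical.

class Obj:
    def __init__(self, payload):
        self.payload = payload
        self.forward = None   # set by the collector when the object is moved
        self.fields = {}      # named references to other Obj instances


def relocate(obj):
    """Collector side: copy obj and leave a forwarding pointer in the old copy."""
    new = Obj(obj.payload)
    new.fields = dict(obj.fields)
    obj.forward = new
    return new


def read_ref(holder, name):
    """Mutator side: every reference load goes through this check."""
    ref = holder.fields[name]
    moved = False
    # Follow forwarding pointers until we reach the current copy.
    while ref is not None and ref.forward is not None:
        ref = ref.forward
        moved = True
    if moved:
        holder.fields[name] = ref   # "heal" the slot so the next load is cheap
    return ref


a, b = Obj("a"), Obj("b")
a.fields["next"] = b
b_new = relocate(b)                 # the collector moves b while the mutator runs
current = read_ref(a, "next")       # the mutator still sees the up-to-date copy
```

The check runs on every reference load, which is why doing it in software is costly and why hardware support for it is attractive.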


1. I expect that, if we use a standard OS, a modification might be needed so that the FPGA knows the real address. The FPGA might be given enough info to figure it out via DMA, or there might be something like a custom syscall that gives the FPGA the details.

2. Barriers are one method. They're among the most expensive. An IBM prototype used careful lock management (supported by lock registers) and scheduling to avoid barriers. There are actually quite a few ways in the literature to avoid barriers in concurrency. I figure a team would have to leverage them plus asynchronous I/O from CPU to FPGA for optimal performance. I see it as a series of hand-offs to the FPGA, which acts on each one once it has the necessary information. The CPU assumes a hand-off is complete after a certain time, on seeing a part of memory that says so, or on receiving an interrupt.

That's a rough sketch.
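
To make the hand-off idea a bit more concrete, here is a toy simulation under heavy assumptions: a worker thread stands in for the FPGA, a dict stands in for the shared memory region, and a `threading.Event` stands in for the interrupt. `FakeAccelerator`, `submit`, and the "marking" work are all hypothetical placeholders, not a real device interface.

```python
import queue
import threading


class FakeAccelerator:
    """Stands in for the FPGA: a worker thread drains a queue of hand-offs."""

    def __init__(self):
        self.requests = queue.Queue()
        self.worker = threading.Thread(target=self._run, daemon=True)
        self.worker.start()

    def _run(self):
        while True:
            job = self.requests.get()
            if job is None:                        # shutdown sentinel
                break
            roots, status = job
            status["marked"] = sorted(set(roots))  # placeholder for real GC work
            status["done"] = True                  # the "part of memory saying so"
            status["irq"].set()                    # the "interrupt"

    def submit(self, roots):
        """CPU side: hand off work and return immediately."""
        status = {"done": False, "marked": None, "irq": threading.Event()}
        self.requests.put((roots, status))
        return status

    def shutdown(self):
        self.requests.put(None)
        self.worker.join()


acc = FakeAccelerator()
status = acc.submit([3, 1, 3, 2])
# The CPU would keep running the mutator here instead of blocking.
finished = status["irq"].wait(timeout=5.0)   # or poll status["done"], or use a timer
acc.shutdown()
```

The three completion signals from the comment map to the timeout, the `done` flag, and the event. A real design would also need the address translation from point 1 before the device could touch the heap at all.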

The virtual addressing part might be fine now that GPUs are doing it and hypervisors support programmable passthrough using the x86 IOMMU features (VT-d etc.).

Though I'm convinced custom hardware is doomed here for the usual custom hardware reasons. Maybe GPUs have gotten good enough at pointer chasing to be usable here?

Custom hardware is actually flourishing in datacenters. My preferred architecture is the Cavium Octeon III style: many-core RISC, accelerators for everything, plenty of I/O, and hypervisor support. It's selling like hotcakes. Adapteva's stuff outperforms CPU's & GPU's at performance-per-watt-per-area, with sales to HPC people. Similarly, there are at least a few custom hardware companies in each segment doing something that's hard or not cost-effective with existing hardware or software.

I agree that the risk is high, though, to the point that one shouldn't depend on it. So, I'd advise selling a profitable system w/ services that just happens to use such custom hardware: a high-performance, easy-to-manage, easy-to-integrate platform that's already worth buying and that also has hardware-supported GC and/or memory safety. The sales of the system & licensing of the software subsidize hardware costs, which are structured to be cheap anyway. Start with FPGA's, then S-ASIC's, then advanced S-ASIC's or finally ASIC's. The NRE stays as low as volume can support.

A relevant example of this model (and evidence for my GC idea) is Azul Systems' Vega machines. Those are custom hardware for Java supporting native bytecodes, a bunch of RAM, a pauseless GC, and easy enterprise integration. So, while we're all speculating, they're selling custom hardware w/ pauseless GC's. I'm just trying to work out a different, cheaper design, hopefully one that integrates with Intel/AMD.

http://www.azulsystems.com/products/vega/overview

Note that they support a whole range of hardware, software, and services to diversify income. No one thing should sink them, esp. the hardware doing poorly. That's the model to copy.

> Maybe GPUs have gotten good enough at pointer chasing to be usable here?

They haven't.