| I enjoy comments such as this because I enjoy putting a thought, an idea in my head and rolling it around, seeing where it goes and what problems it has. This reply may be a bit rambling as a result. If the FPGA on the bus is doing concurrent GC, we'd need a way to mark pointers and integers reliably. It means breaking all C code that does pointer manipulation, and breaking custom tagged pointer implementations. Niche languages and applications wouldn't have the buy in, and justifiably so, to set global policy on memory. On a developer machine you could experiment, but on a consumer device you would need to use whatever the OS and the bulk of applications have decided on. Maybe that's not a bad thing though? The OS would need to provide implementations of malloc and free, and other primitives, that are tied to the hardware. But I suppose that's not unreasonable. Except when it comes to virtualization. The FPGA isn't fixed function, but on the timescales that modern VMMs provide slices of CPU time to virtual machines, reprogramming an FPGA is an eternity. So we would gain in potential performance but lose in flexibility, substantially. The biggest problem would be legacy C code. Code that treats "void*" and "long" as interchangeable values. And that includes a lot of hand-written JITs, where coercing values between pointers and word-sized integers is prevalent. Rambling aside on potential upsides and downsides: The world would probably be a better place if pointers and integers were separated and strongly typed down to the hardware level. It would be a pain, but if the null pointer is a billion dollar mistake, placing all bets on the Princeton architecture may yet be a trillion dollar mistake this century. |
FPGA-aware garbage collection in Java (2005) https://buytaert.net/files/fpl05-paper.pdf
(Modified Jikes VM to use FPGA coprocessor for collection. Result was good performance with around 2.32% overhead on memory-intensive benchmarks.)
Fine-Grained Parallel Compacting Garbage Collection through Hardware-Supported Synchronization (2010) http://www.ikr.uni-stuttgart.de/Content/Publications/Archive...
Stall-free, real-time collector for FPGA's (2012) http://researcher.watson.ibm.com/researcher/files/us-bacon/B...
(This is on one chip with tiny resources for GC around 1% and 4-17% overhead.)
Maybe I can get something better for you after some rest. Note that my comment to the other person has more details of where I was going with this.
https://news.ycombinator.com/item?id=10150480