by KMag 2243 days ago
It's also a hindrance to in-order implementations that have a different number of branch delay cycles (e.g. different number of pipeline stages or instructions taking a variable number of cycles) than the original implementation.

Branch delay slots were a somewhat clever solution to reduce the complexity of the original implementation, but they baked implementation details into the ISA and became problematic when the implementation details changed.
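To make the "baked into the ISA" point concrete, here is a minimal sketch in Python of a toy machine. Everything in it is hypothetical: the `run` function, the `add`/`jmp`/`halt` opcodes, and the `delay_slots` parameter are made up for illustration and do not model any real ISA. It only shows that a program scheduled for one delay slot computes something different when the number of delay slots changes.

```python
def run(program, delay_slots):
    """Run a toy program and return the accumulator.

    A jump takes effect only after `delay_slots` further instructions
    have executed. Simplification (an assumption of this sketch): a jump
    sitting inside another jump's delay slot is ignored.
    """
    acc, pc = 0, 0
    pending = None  # [slots_left, target] for a jump still in flight
    while True:
        op, arg = program[pc]
        if op == "halt":
            return acc
        if op == "add":
            acc += arg
        next_pc = pc + 1
        if pending is not None:
            # The instruction just executed occupied a delay slot.
            pending[0] -= 1
            if pending[0] == 0:
                next_pc = pending[1]
                pending = None
        elif op == "jmp":
            if delay_slots == 0:
                next_pc = arg  # no delay slots: jump immediately
            else:
                pending = [delay_slots, arg]
        pc = next_pc


# Program "compiled" for a machine with exactly one delay slot.
PROGRAM = [
    ("jmp", 3),      # 0: jump to instruction 3
    ("add", 1),      # 1: the delay slot the compiler filled on purpose
    ("add", 100),    # 2: meant to be skipped
    ("add", 10),     # 3: jump target
    ("halt", None),  # 4
]

print(run(PROGRAM, 1))  # 11: the intended result
print(run(PROGRAM, 2))  # 111: instruction 2 now runs as a second delay slot
print(run(PROGRAM, 0))  # 10: the filled delay slot is skipped entirely
```

A new implementation whose pipeline naturally has zero or two cycles of branch delay therefore has to emulate the one-slot behaviour to keep old binaries working.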


> they baked implementation details into the ISA and became problematic when the implementation details changed.

Same reason why stuff like VLIW has failed to catch on. These things are so dependent on specific hardware implementation details that one can hardly call them general-purpose ISAs anymore.

Modern GPUs are VLIW machines.
No modern GPUs use VLIW. ATI/AMD switched from VLIW to RISC-SIMD 8-9 years ago, NVIDIA a few years before that. Mobile phone GPUs gave up VLIW for RISC too in the last 5 years or so.
DSPs as well.
And neither can be used as a compilation target for, say, Firefox (or, simpler, nethack), can they?
Analog Devices provides a C/C++ compiler and an RTOS for SHARC, so I wouldn't be surprised if nethack could be compiled for the SHARC VLIW architecture (and its two branch delay slots).
Well, I stand corrected. It looks like there's even an fopen and fprintf in there, so that would make a lot of things possible. I wonder about the performance of branch-heavy, non-vector-math computations on these CPUs.